linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Rami Rosen	e57d5cf2f8	devcg: remove parent_cgroup. In devcgroup_css_alloc(), there is no longer need for parent_cgroup. bd2953ebbb("devcg: propagate local changes down the hierarchy") made the variable parent_cgroup redundant. This patch removes parent_cgroup from devcgroup_css_alloc(). Signed-off-by: Rami Rosen <ramirose@gmail.com> Acked-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2013-04-18 11:34:35 -07:00
Mimi Zohar	df2c2afba4	ima: eliminate passing d_name.name to process_measurement() Passing a pointer to the dentry name, as a parameter to process_measurement(), causes a race condition with rename() and is unnecessary, as the dentry name is already accessible via the file parameter. In the normal case, we use the full pathname as provided by brpm->filename, bprm->interp, or ima_d_path(). Only on ima_d_path() failure, do we fallback to using the d_name.name, which points either to external memory or d_iname. Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2013-04-17 17:20:57 -07:00
Eric Dumazet	ca10b9e9a8	selinux: add a skb_owned_by() hook Commit `90ba9b1986` (tcp: tcp_make_synack() can use alloc_skb()) broke certain SELinux/NetLabel configurations by no longer correctly assigning the sock to the outgoing SYNACK packet. Cost of atomic operations on the LISTEN socket is quite big, and we would like it to happen only if really needed. This patch introduces a new security_ops->skb_owned_by() method, that is a void operation unless selinux is active. Reported-by: Miroslav Vadkerti <mvadkert@redhat.com> Diagnosed-by: Paul Moore <pmoore@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: linux-security-module@vger.kernel.org Acked-by: James Morris <james.l.morris@oracle.com> Tested-by: Paul Moore <pmoore@redhat.com> Acked-by: Paul Moore <pmoore@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-09 13:23:11 -04:00
Tejun Heo	8adf12b0ff	devcg: remove broken_hierarchy tag `bd2953ebbb` ("devcg: propagate local changes down the hierarchy") implemented proper hierarchy support. Remove the broken tag. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Aristeu Rozanski <aris@redhat.com>	2013-04-08 08:31:59 -07:00
Casey Schaufler	958d2c2f4a	Smack: include magic.h in smackfs.c As reported for linux-next: Tree for Apr 2 (smack) Add the required include for smackfs.c Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: James Morris <james.l.morris@oracle.com>	2013-04-03 13:13:51 +11:00
Jeff Layton	094f7b69ea	selinux: make security_sb_clone_mnt_opts return an error on context mismatch I had the following problem reported a while back. If you mount the same filesystem twice using NFSv4 with different contexts, then the second context= option is ignored. For instance: # mount server:/export /mnt/test1 # mount server:/export /mnt/test2 -o context=system_u:object_r:tmp_t:s0 # ls -dZ /mnt/test1 drwxrwxrwt. root root system_u:object_r:nfs_t:s0 /mnt/test1 # ls -dZ /mnt/test2 drwxrwxrwt. root root system_u:object_r:nfs_t:s0 /mnt/test2 When we call into SELinux to set the context of a "cloned" superblock, it will currently just bail out when it notices that we're reusing an existing superblock. Since the existing superblock is already set up and presumably in use, we can't go overwriting its context with the one from the "original" sb. Because of this, the second context= option in this case cannot take effect. This patch fixes this by turning security_sb_clone_mnt_opts into an int return operation. When it finds that the "new" superblock that it has been handed is already set up, it checks to see whether the contexts on the old superblock match it. If it does, then it will just return success, otherwise it'll return -EBUSY and emit a printk to tell the admin why the second mount failed. Note that this patch may cause casualties. The NFSv4 code relies on being able to walk down to an export from the pseudoroot. If you mount filesystems that are nested within one another with different contexts, then this patch will make those mounts fail in new and "exciting" ways. For instance, suppose that /export is a separate filesystem on the server: # mount server:/ /mnt/test1 # mount salusa:/export /mnt/test2 -o context=system_u:object_r:tmp_t:s0 mount.nfs: an incorrect mount option was specified ...with the printk in the ring buffer. Because we might eventually walk down to /mnt/test1/export, the mount is denied due to this patch. The second mount needs the pseudoroot superblock, but that's already present with the wrong context. OTOH, if we mount these in the reverse order, then both mounts work, because the pseudoroot superblock created when mounting /export is discarded once that mount is done. If we then however try to walk into that directory, the automount fails for the similar reasons: # cd /mnt/test1/scratch/ -bash: cd: /mnt/test1/scratch: Device or resource busy The story I've gotten from the SELinux folks that I've talked to is that this is desirable behavior. In SELinux-land, mounting the same data under different contexts is wrong -- there can be only one. Cc: Steve Dickson <steved@redhat.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2013-04-02 11:30:13 +11:00
David S. Miller	a210576cf8	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: net/mac80211/sta_info.c net/wireless/core.h Two minor conflicts in wireless. Overlapping additions of extern declarations in net/wireless/core.h and a bug fix overlapping with the addition of a boolean parameter to __ieee80211_key_free(). Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-01 13:36:50 -04:00
Linus Torvalds	2c3de1c2d7	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull userns fixes from Eric W Biederman: "The bulk of the changes are fixing the worst consequences of the user namespace design oversight in not considering what happens when one namespace starts off as a clone of another namespace, as happens with the mount namespace. The rest of the changes are just plain bug fixes. Many thanks to Andy Lutomirski for pointing out many of these issues." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: userns: Restrict when proc and sysfs can be mounted ipc: Restrict mounting the mqueue filesystem vfs: Carefully propogate mounts across user namespaces vfs: Add a mount flag to lock read only bind mounts userns: Don't allow creation if the user is chrooted yama: Better permission check for ptraceme pid: Handle the exit of a multi-threaded init. scm: Require CAP_SYS_ADMIN over the current pidns to spoof pids.	2013-03-28 13:43:46 -07:00
Hong zhi guo	77954983ad	selinux: replace obsolete NLMSG_* with type safe nlmsg_* Signed-off-by: Hong Zhiguo <honkiko@gmail.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-28 14:25:49 -04:00
Eric W. Biederman	eddc0a3abf	yama: Better permission check for ptraceme Change the permission check for yama_ptrace_ptracee to the standard ptrace permission check, testing if the traceer has CAP_SYS_PTRACE in the tracees user namespace. Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-03-26 13:17:58 -07:00
Aristeu Rozanski	bd2953ebbb	devcg: propagate local changes down the hierarchy This patch makes exception changes to propagate down in hierarchy respecting when possible local exceptions. New exceptions allowing additional access to devices won't be propagated, but it'll be possible to add an exception to access all of part of the newly allowed device(s). New exceptions disallowing access to devices will be propagated down and the local group's exceptions will be revalidated for the new situation. Example: A / \ B group behavior exceptions A allow "b 8:* rwm", "c 116:1 rw" B deny "c 1:3 rwm", "c 116:2 rwm", "b 3:* rwm" If a new exception is added to group A: # echo "c 116:* r" > A/devices.deny it'll propagate down and after revalidating B's local exceptions, the exception "c 116:2 rwm" will be removed. In case parent's exceptions change and local exceptions are not allowed anymore, they'll be deleted. v7: - do not allow behavior change when the cgroup has children - update documentation v6: fixed issues pointed by Serge Hallyn - only copy parent's exceptions while propagating behavior if the local behavior is different - while propagating exceptions, do not clear and copy parent's: it'd be against the premise we don't propagate access to more devices v5: fixed issues pointed by Serge Hallyn - updated documentation - not propagating when an exception is written to devices.allow - when propagating a new behavior, clean the local exceptions list if they're for a different behavior v4: fixed issues pointed by Tejun Heo - separated function to walk the tree and collect valid propagation targets v3: fixed issues pointed by Tejun Heo - update documentation - move css_online/css_offline changes to a new patch - use cgroup_for_each_descendant_pre() instead of own descendant walk - move exception_copy rework to a separared patch - move exception_clean rework to a separated patch v2: fixed issues pointed by Tejun Heo - instead of keeping the local settings that won't apply anymore, remove them Cc: Tejun Heo <tj@kernel.org> Cc: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2013-03-20 07:50:21 -07:00
Aristeu Rozanski	1909554c97	devcg: use css_online and css_offline Allocate resources and change behavior only when online. This is needed in order to determine if a node is suitable for hierarchy propagation or if it's being removed. Locking: Both functions take devcgroup_mutex to make changes to device_cgroup structure. Hierarchy propagation will also take devcgroup_mutex before walking the tree while walking the tree itself is protected by rcu lock. Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Cc: Tejun Heo <tj@kernel.org> Cc: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2013-03-20 07:50:17 -07:00
Aristeu Rozanski	c39a2a3018	devcg: prepare may_access() for hierarchy support Currently may_access() is only able to verify if an exception is valid for the current cgroup, which has the same behavior. With hierarchy, it'll be also used to verify if a cgroup local exception is valid towards its cgroup parent, which might have different behavior. v2: - updated patch description - rebased on top of a new patch to expand the may_access() logic to make it more clear - fixed argument description order in may_access() Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Cc: Tejun Heo <tj@kernel.org> Cc: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2013-03-20 07:50:13 -07:00
Aristeu Rozanski	26898fdff3	devcg: expand may_access() logic In order to make the next patch more clear, expand may_access() logic. v2: may_access() returns bool now Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Cc: Tejun Heo <tj@kernel.org> Cc: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Aristeu Rozanski <aris@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2013-03-20 07:50:09 -07:00
Igor Zhbanov	cdb56b6088	Fix NULL pointer dereference in smack_inode_unlink() and smack_inode_rmdir() This patch fixes kernel Oops because of wrong common_audit_data type in smack_inode_unlink() and smack_inode_rmdir(). When SMACK security module is enabled and SMACK logging is on (/smack/logging is not zero) and you try to delete the file which 1) you cannot delete due to SMACK rules and logging of failures is on or 2) you can delete and logging of success is on, you will see following: Unable to handle kernel NULL pointer dereference at virtual address 000002d7 [<...>] (strlen+0x0/0x28) [<...>] (audit_log_untrustedstring+0x14/0x28) [<...>] (common_lsm_audit+0x108/0x6ac) [<...>] (smack_log+0xc4/0xe4) [<...>] (smk_curacc+0x80/0x10c) [<...>] (smack_inode_unlink+0x74/0x80) [<...>] (security_inode_unlink+0x2c/0x30) [<...>] (vfs_unlink+0x7c/0x100) [<...>] (do_unlinkat+0x144/0x16c) The function smack_inode_unlink() (and smack_inode_rmdir()) need to log two structures of different types. First of all it does: smk_ad_init(&ad, __func__, LSM_AUDIT_DATA_DENTRY); smk_ad_setfield_u_fs_path_dentry(&ad, dentry); This will set common audit data type to LSM_AUDIT_DATA_DENTRY and store dentry for auditing (by function smk_curacc(), which in turn calls dump_common_audit_data(), which is actually uses provided data and logs it). /* * You need write access to the thing you're unlinking / rc = smk_curacc(smk_of_inode(ip), MAY_WRITE, &ad); if (rc == 0) { / * You also need write access to the containing directory */ Then this function wants to log anoter data: smk_ad_setfield_u_fs_path_dentry(&ad, NULL); smk_ad_setfield_u_fs_inode(&ad, dir); The function sets inode field, but don't change common_audit_data type. rc = smk_curacc(smk_of_inode(dir), MAY_WRITE, &ad); } So the dump_common_audit() function incorrectly interprets inode structure as dentry, and Oops will happen. This patch reinitializes common_audit_data structures with correct type. Also I removed unneeded smk_ad_setfield_u_fs_path_dentry(&ad, NULL); initialization, because both dentry and inode pointers are stored in the same union. Signed-off-by: Igor Zhbanov <i.zhbanov@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>	2013-03-19 14:38:17 -07:00
Rafal Krypa	e05b6f982a	Smack: add support for modification of existing rules Rule modifications are enabled via /smack/change-rule. Format is as follows: "Subject Object rwaxt rwaxt" First two strings are subject and object labels up to 255 characters. Third string contains permissions to enable. Fourth string contains permissions to disable. All unmentioned permissions will be left unchanged. If no rule previously existed, it will be created. Targeted for git://git.gitorious.org/smack-next/kernel.git Signed-off-by: Rafal Krypa <r.krypa@samsung.com>	2013-03-19 14:16:42 -07:00
Jarkko Sakkinen	cee7e44334	smack: SMACK_MAGIC to include/uapi/linux/magic.h SMACK_MAGIC moved to a proper place for easy user space access (i.e. libsmack). Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@iki.fi>	2013-03-19 14:16:15 -07:00
Rafal Krypa	a87d79ad7c	Smack: add missing support for transmute bit in smack_str_from_perm() This fixes audit logs for granting or denial of permissions to show information about transmute bit. Targeted for git://git.gitorious.org/smack-next/kernel.git Signed-off-by: Rafal Krypa <r.krypa@samsung.com>	2013-03-19 14:15:41 -07:00
Rafal Krypa	d15d9fad16	Smack: prevent revoke-subject from failing when unseen label is written to it Special file /smack/revoke-subject will silently accept labels that are not present on the subject label list. Nothing has to be done for such labels, as there are no rules for them to revoke. Targeted for git://git.gitorious.org/smack-next/kernel.git Signed-off-by: Rafal Krypa <r.krypa@samsung.com>	2013-03-19 14:15:21 -07:00
Dan Carpenter	4502403dcf	selinux: use GFP_ATOMIC under spin_lock The call tree here is: sk_clone_lock() <- takes bh_lock_sock(newsk); xfrm_sk_clone_policy() __xfrm_sk_clone_policy() clone_policy() <- uses GFP_ATOMIC for allocations security_xfrm_policy_clone() security_ops->xfrm_policy_clone_security() selinux_xfrm_policy_clone() Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: stable@kernel.org Signed-off-by: James Morris <james.l.morris@oracle.com>	2013-03-19 00:33:09 +11:00
Lai Jiangshan	921f3ac4c3	tomoyo: use DEFINE_SRCU() to define tomoyo_ss DEFINE_STATIC_SRCU() defines srcu struct and do init at build time. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: James Morris <james.l.morris@oracle.com>	2013-03-18 23:51:31 +11:00
Mathieu Desnoyers	8aec0f5d41	Fix: compat_rw_copy_check_uvector() misuse in aio, readv, writev, and security keys Looking at mm/process_vm_access.c:process_vm_rw() and comparing it to compat_process_vm_rw() shows that the compatibility code requires an explicit "access_ok()" check before calling compat_rw_copy_check_uvector(). The same difference seems to appear when we compare fs/read_write.c:do_readv_writev() to fs/compat.c:compat_do_readv_writev(). This subtle difference between the compat and non-compat requirements should probably be debated, as it seems to be error-prone. In fact, there are two others sites that use this function in the Linux kernel, and they both seem to get it wrong: Now shifting our attention to fs/aio.c, we see that aio_setup_iocb() also ends up calling compat_rw_copy_check_uvector() through aio_setup_vectored_rw(). Unfortunately, the access_ok() check appears to be missing. Same situation for security/keys/compat.c:compat_keyctl_instantiate_key_iov(). I propose that we add the access_ok() check directly into compat_rw_copy_check_uvector(), so callers don't have to worry about it, and it therefore makes the compat call code similar to its non-compat counterpart. Place the access_ok() check in the same location where copy_from_user() can trigger a -EFAULT error in the non-compat code, so the ABI behaviors are alike on both compat and non-compat. While we are here, fix compat_do_readv_writev() so it checks for compat_rw_copy_check_uvector() negative return values. And also, fix a memory leak in compat_keyctl_instantiate_key_iov() error handling. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-03-12 11:05:45 -07:00
David Howells	0da9dfdd2c	keys: fix race with concurrent install_user_keyrings() This fixes CVE-2013-1792. There is a race in install_user_keyrings() that can cause a NULL pointer dereference when called concurrently for the same user if the uid and uid-session keyrings are not yet created. It might be possible for an unprivileged user to trigger this by calling keyctl() from userspace in parallel immediately after logging in. Assume that we have two threads both executing lookup_user_key(), both looking for KEY_SPEC_USER_SESSION_KEYRING. THREAD A THREAD B =============================== =============================== ==>call install_user_keyrings(); if (!cred->user->session_keyring) ==>call install_user_keyrings() ... user->uid_keyring = uid_keyring; if (user->uid_keyring) return 0; <== key = cred->user->session_keyring [== NULL] user->session_keyring = session_keyring; atomic_inc(&key->usage); [oops] At the point thread A dereferences cred->user->session_keyring, thread B hasn't updated user->session_keyring yet, but thread A assumes it is populated because install_user_keyrings() returned ok. The race window is really small but can be exploited if, for example, thread B is interrupted or preempted after initializing uid_keyring, but before doing setting session_keyring. This couldn't be reproduced on a stock kernel. However, after placing systemtap probe on 'user->session_keyring = session_keyring;' that introduced some delay, the kernel could be crashed reliably. Fix this by checking both pointers before deciding whether to return. Alternatively, the test could be done away with entirely as it is checked inside the mutex - but since the mutex is global, that may not be the best way. Signed-off-by: David Howells <dhowells@redhat.com> Reported-by: Mateusz Guzik <mguzik@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: James Morris <james.l.morris@oracle.com>	2013-03-12 16:44:31 +11:00
Eric W. Biederman	ba0e3427b0	userns: Stop oopsing in key_change_session_keyring Dave Jones <davej@redhat.com> writes: > Just hit this on Linus' current tree. > > [ 89.621770] BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8 > [ 89.623111] IP: [<ffffffff810784b0>] commit_creds+0x250/0x2f0 > [ 89.624062] PGD 122bfd067 PUD 122bfe067 PMD 0 > [ 89.624901] Oops: 0000 [#1] PREEMPT SMP > [ 89.625678] Modules linked in: caif_socket caif netrom bridge hidp 8021q garp stp mrp rose llc2 af_rxrpc phonet af_key binfmt_misc bnep l2tp_ppp can_bcm l2tp_core pppoe pppox can_raw scsi_transport_iscsi ppp_generic slhc nfnetlink can ipt_ULOG ax25 decnet irda nfc rds x25 crc_ccitt appletalk atm ipx p8023 psnap p8022 llc lockd sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables btusb bluetooth snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_pcm vhost_net snd_page_alloc snd_timer tun macvtap usb_debug snd rfkill microcode macvlan edac_core pcspkr serio_raw kvm_amd soundcore kvm r8169 mii > [ 89.637846] CPU 2 > [ 89.638175] Pid: 782, comm: trinity-main Not tainted 3.8.0+ #63 Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H > [ 89.639850] RIP: 0010:[<ffffffff810784b0>] [<ffffffff810784b0>] commit_creds+0x250/0x2f0 > [ 89.641161] RSP: 0018:ffff880115657eb8 EFLAGS: 00010207 > [ 89.641984] RAX: 00000000000003e8 RBX: ffff88012688b000 RCX: 0000000000000000 > [ 89.643069] RDX: 0000000000000000 RSI: ffffffff81c32960 RDI: ffff880105839600 > [ 89.644167] RBP: ffff880115657ed8 R08: 0000000000000000 R09: 0000000000000000 > [ 89.645254] R10: 0000000000000001 R11: 0000000000000246 R12: ffff880105839600 > [ 89.646340] R13: ffff88011beea490 R14: ffff88011beea490 R15: 0000000000000000 > [ 89.647431] FS: 00007f3ac063b740(0000) GS:ffff88012b200000(0000) knlGS:0000000000000000 > [ 89.648660] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > [ 89.649548] CR2: 00000000000000c8 CR3: 0000000122bfc000 CR4: 00000000000007e0 > [ 89.650635] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 89.651723] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > [ 89.652812] Process trinity-main (pid: 782, threadinfo ffff880115656000, task ffff88011beea490) > [ 89.654128] Stack: > [ 89.654433] 0000000000000000 ffff8801058396a0 ffff880105839600 ffff88011beeaa78 > [ 89.655769] ffff880115657ef8 ffffffff812c7d9b ffffffff82079be0 0000000000000000 > [ 89.657073] ffff880115657f28 ffffffff8106c665 0000000000000002 ffff880115657f58 > [ 89.658399] Call Trace: > [ 89.658822] [<ffffffff812c7d9b>] key_change_session_keyring+0xfb/0x140 > [ 89.659845] [<ffffffff8106c665>] task_work_run+0xa5/0xd0 > [ 89.660698] [<ffffffff81002911>] do_notify_resume+0x71/0xb0 > [ 89.661581] [<ffffffff816c9a4a>] int_signal+0x12/0x17 > [ 89.662385] Code: 24 90 00 00 00 48 8b b3 90 00 00 00 49 8b 4c 24 40 48 39 f2 75 08 e9 83 00 00 00 48 89 ca 48 81 fa 60 29 c3 81 0f 84 41 fe ff ff <48> 8b 8a c8 00 00 00 48 39 ce 75 e4 3b 82 d0 00 00 00 0f 84 4b > [ 89.667778] RIP [<ffffffff810784b0>] commit_creds+0x250/0x2f0 > [ 89.668733] RSP <ffff880115657eb8> > [ 89.669301] CR2: 00000000000000c8 > > My fastest trinity induced oops yet! > > > Appears to be.. > > if ((set_ns == subset_ns->parent) && > 850: 48 8b 8a c8 00 00 00 mov 0xc8(%rdx),%rcx > > from the inlined cred_cap_issubset By historical accident we have been reading trying to set new->user_ns from new->user_ns. Which is totally silly as new->user_ns is NULL (as is every other field in new except session_keyring at that point). The intent is clearly to copy all of the fields from old to new so copy old->user_ns into into new->user_ns. Cc: stable@vger.kernel.org Reported-by: Dave Jones <davej@redhat.com> Tested-by: Dave Jones <davej@redhat.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-03-03 19:35:38 -08:00
Linus Torvalds	56a79b7b02	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more VFS bits from Al Viro: "Unfortunately, it looks like xattr series will have to wait until the next cycle ;-/ This pile contains 9p cleanups and fixes (races in v9fs_fid_add() etc), fixup for nommu breakage in shmem.c, several cleanups and a bit more file_inode() work" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: constify path_get/path_put and fs_struct.c stuff fix nommu breakage in shmem.c cache the value of file_inode() in struct file 9p: if v9fs_fid_lookup() gets to asking server, it'd better have hashed dentry 9p: make sure ->lookup() adds fid to the right dentry 9p: untangle ->lookup() a bit 9p: double iput() in ->lookup() if d_materialise_unique() fails 9p: v9fs_fid_add() can't fail now v9fs: get rid of v9fs_dentry 9p: turn fid->dlist into hlist 9p: don't bother with private lock in ->d_fsdata; dentry->d_lock will do just fine more file_inode() open-coded instances selinux: opened file can't have NULL or negative ->f_path.dentry (In the meantime, the hlist traversal macros have changed, so this required a semantic conflict fixup for the newly hlistified fid->dlist)	2013-03-03 13:23:03 -08:00
Sasha Levin	b67bfe0d42	hlist: drop the node parameter from iterators I'm not sure why, but the hlist for each entry iterators were conceived list_for_each_entry(pos, head, member) The hlist ones were greedy and wanted an extra parameter: hlist_for_each_entry(tpos, pos, head, member) Why did they need an extra pos parameter? I'm not quite sure. Not only they don't really need it, it also prevents the iterator from looking exactly like the list iterator, which is unfortunate. Besides the semantic patch, there was some manual work required: - Fix up the actual hlist iterators in linux/list.h - Fix up the declaration of other iterators based on the hlist ones. - A very small amount of places were using the 'node' parameter, this was modified to use 'obj->member' instead. - Coccinelle didn't handle the hlist_for_each_entry_safe iterator properly, so those had to be fixed up manually. The semantic patch which is mostly the work of Peter Senna Tschudin is here: @@ iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host; type T; expression a,c,d,e; identifier b; statement S; @@ -T b; <+... when != b ( hlist_for_each_entry(a, - b, c, d) S \| hlist_for_each_entry_continue(a, - b, c) S \| hlist_for_each_entry_from(a, - b, c) S \| hlist_for_each_entry_rcu(a, - b, c, d) S \| hlist_for_each_entry_rcu_bh(a, - b, c, d) S \| hlist_for_each_entry_continue_rcu_bh(a, - b, c) S \| for_each_busy_worker(a, c, - b, d) S \| ax25_uid_for_each(a, - b, c) S \| ax25_for_each(a, - b, c) S \| inet_bind_bucket_for_each(a, - b, c) S \| sctp_for_each_hentry(a, - b, c) S \| sk_for_each(a, - b, c) S \| sk_for_each_rcu(a, - b, c) S \| sk_for_each_from -(a, b) +(a) S + sk_for_each_from(a) S \| sk_for_each_safe(a, - b, c, d) S \| sk_for_each_bound(a, - b, c) S \| hlist_for_each_entry_safe(a, - b, c, d, e) S \| hlist_for_each_entry_continue_rcu(a, - b, c) S \| nr_neigh_for_each(a, - b, c) S \| nr_neigh_for_each_safe(a, - b, c, d) S \| nr_node_for_each(a, - b, c) S \| nr_node_for_each_safe(a, - b, c, d) S \| - for_each_gfn_sp(a, c, d, b) S + for_each_gfn_sp(a, c, d) S \| - for_each_gfn_indirect_valid_sp(a, c, d, b) S + for_each_gfn_indirect_valid_sp(a, c, d) S \| for_each_host(a, - b, c) S \| for_each_host_safe(a, - b, c, d) S \| for_each_mesh_entry(a, - b, c, d) S ) ...+> [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c] [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c] [akpm@linux-foundation.org: checkpatch fixes] [akpm@linux-foundation.org: fix warnings] [akpm@linux-foudnation.org: redo intrusive kvm changes] Tested-by: Peter Senna Tschudin <peter.senna@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:24 -08:00
Al Viro	45e09bd51b	selinux: opened file can't have NULL or negative ->f_path.dentry Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-02-27 13:22:14 -05:00
Linus Torvalds	d895cb1af1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile (part one) from Al Viro: "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent locking violations, etc. The most visible changes here are death of FS_REVAL_DOT (replaced with "has ->d_weak_revalidate()") and a new helper getting from struct file to inode. Some bits of preparation to xattr method interface changes. Misc patches by various people sent this cycle and ocfs2 fixes from several cycles ago that should've been upstream right then. PS: the next vfs pile will be xattr stuff." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits) saner proc_get_inode() calling conventions proc: avoid extra pde_put() in proc_fill_super() fs: change return values from -EACCES to -EPERM fs/exec.c: make bprm_mm_init() static ocfs2/dlm: use GFP_ATOMIC inside a spin_lock ocfs2: fix possible use-after-free with AIO ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero target: writev() on single-element vector is pointless export kernel_write(), convert open-coded instances fs: encode_fh: return FILEID_INVALID if invalid fid_type kill f_vfsmnt vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op nfsd: handle vfs_getattr errors in acl protocol switch vfs_getattr() to struct path default SET_PERSONALITY() in linux/elf.h ceph: prepopulate inodes only when request is aborted d_hash_and_lookup(): export, switch open-coded instances 9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate() 9p: split dropping the acls from v9fs_set_create_acl() ...	2013-02-26 20:16:07 -08:00
Al Viro	182be68478	kill f_vfsmnt very few users left... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-02-26 02:46:10 -05:00
Mimi Zohar	446d64e3e1	block: fix part_pack_uuid() build error Commit "85865c1 ima: add policy support for file system uuid" introduced a CONFIG_BLOCK dependency. This patch defines a wrapper called blk_part_pack_uuid(), which returns -EINVAL, when CONFIG_BLOCK is not defined. security/integrity/ima/ima_policy.c:538:4: error: implicit declaration of function 'part_pack_uuid' [-Werror=implicit-function-declaration] Changelog v2: - Reference commit number in patch description Changelog v1: - rename ima_part_pack_uuid() to blk_part_pack_uuid() - resolve scripts/checkpatch.pl warnings Changelog v0: - fix UUID scripts/Lindent msgs Reported-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: David Rientjes <rientjes@google.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: James Morris <james.l.morris@oracle.com>	2013-02-26 03:10:52 +11:00
Mimi Zohar	a2c2c3a71c	ima: "remove enforce checking duplication" merge fix Commit "750943a ima: remove enforce checking duplication" combined the 'in IMA policy' and 'enforcing file integrity' checks. For the non-file, kernel module verification, a specific check for 'enforcing file integrity' was not added. This patch adds the check. Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2013-02-26 02:46:38 +11:00
Al Viro	496ad9aa8e	new helper: file_inode(file) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-02-22 23:31:31 -05:00
Jerry Snitselaar	53eb8c82d5	device_cgroup: don't grab mutex in rcu callback Commit `103a197c0c` ("security/device_cgroup: lock assert fails in dev_exception_clean()") grabs devcgroup_mutex to fix assert failure, but a mutex can't be grabbed in rcu callback. Since there shouldn't be any other references when css_free is called, mutex isn't needed for list cleanup in devcgroup_css_free(). Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Aristeu Rozanski <aris@redhat.com> Cc: James Morris <james.l.morris@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-21 17:22:15 -08:00
Linus Torvalds	33673dcb37	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "This is basically a maintenance update for the TPM driver and EVM/IMA" Fix up conflicts in lib/digsig.c and security/integrity/ima/ima_main.c * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (45 commits) tpm/ibmvtpm: build only when IBM pseries is configured ima: digital signature verification using asymmetric keys ima: rename hash calculation functions ima: use new crypto_shash API instead of old crypto_hash ima: add policy support for file system uuid evm: add file system uuid to EVM hmac tpm_tis: check pnp_acpi_device return code char/tpm/tpm_i2c_stm_st33: drop temporary variable for return value char/tpm/tpm_i2c_stm_st33: remove dead assignment in tpm_st33_i2c_probe char/tpm/tpm_i2c_stm_st33: Remove __devexit attribute char/tpm/tpm_i2c_stm_st33: Don't use memcpy for one byte assignment tpm_i2c_stm_st33: removed unused variables/code TPM: Wait for TPM_ACCESS tpmRegValidSts to go high at startup tpm: Fix cancellation of TPM commands (interrupt mode) tpm: Fix cancellation of TPM commands (polling mode) tpm: Store TPM vendor ID TPM: Work around buggy TPMs that block during continue self test tpm_i2c_stm_st33: fix oops when i2c client is unavailable char/tpm: Use struct dev_pm_ops for power management TPM: STMicroelectronics ST33 I2C BUILD STUFF ...	2013-02-21 08:18:12 -08:00
David Howells	fe9453a1dc	KEYS: Revert one application of "Fix unreachable code" patch A patch to fix some unreachable code in search_my_process_keyrings() got applied twice by two different routes upstream as commits `e67eab39be` and `b010520ab3` (both "fix unreachable code"). Unfortunately, the second application removed something it shouldn't have and this wasn't detected by GIT. This is due to the patch not having sufficient lines of context to distinguish the two places of application. The effect of this is relatively minor: inside the kernel, the keyring search routines may search multiple keyrings and then prioritise the errors if no keys or negative keys are found in any of them. With the extra deletion, the presence of a negative key in the thread keyring (causing ENOKEY) is incorrectly overridden by an error searching the process keyring. So revert the second application of the patch. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-21 07:56:25 -08:00
Dmitry Kasatkin	e0751257a6	ima: digital signature verification using asymmetric keys Asymmetric keys were introduced in linux-3.7 to verify the signature on signed kernel modules. The asymmetric keys infrastructure abstracts the signature verification from the crypto details. This patch adds IMA/EVM signature verification using asymmetric keys. Support for additional signature verification methods can now be delegated to the asymmetric key infrastructure. Although the module signature header and the IMA/EVM signature header could use the same format, to minimize the signature length and save space in the extended attribute, this patch defines a new IMA/EVM header format. The main difference is that the key identifier is a sha1[12 - 19] hash of the key modulus and exponent, similar to the current implementation. The only purpose of the key identifier is to identify the corresponding key in the kernel keyring. ima-evm-utils was updated to support the new signature format. While asymmetric signature verification functionality supports many different hash algorithms, the hash used in this patch is calculated during the IMA collection phase, based on the configured algorithm. The default algorithm is sha1, but for backwards compatibility md5 is supported. Due to this current limitation, signatures should be generated using a sha1 hash algorithm. Changes in this patch: - Functionality has been moved to separate source file in order to get rid of in source #ifdefs. - keyid is derived according to the RFC 3280. It does not require to assign IMA/EVM specific "description" when loading X509 certificate. Kernel asymmetric key subsystem automatically generate the description. Also loading a certificate does not require using of ima-evm-utils and can be done using keyctl only. - keyid size is reduced to 32 bits to save xattr space. Key search is done using partial match functionality of asymmetric_key_match(). - Kconfig option title was changed Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2013-02-06 21:22:18 -05:00
Dmitry Kasatkin	50af554466	ima: rename hash calculation functions Rename hash calculation functions to reflect meaning and change argument order in conventional way. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2013-02-06 10:41:13 -05:00
Dmitry Kasatkin	76bb28f612	ima: use new crypto_shash API instead of old crypto_hash Old crypto hash API internally uses shash API. Using shash API directly is more efficient. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2013-02-06 10:41:12 -05:00
Dmitry Kasatkin	85865c1fa1	ima: add policy support for file system uuid The IMA policy permits specifying rules to enable or disable measurement/appraisal/audit based on the file system magic number. If, for example, the policy contains an ext4 measurement rule, the rule is enabled for all ext4 partitions. Sometimes it might be necessary to enable measurement/appraisal/audit only for one partition and disable it for another partition of the same type. With the existing IMA policy syntax, this can not be done. This patch provides support for IMA policy rules to specify the file system by its UUID (eg. fsuuid=397449cd-687d-4145-8698-7fed4a3e0363). For partitions not being appraised, it might be a good idea to mount file systems with the 'noexec' option to prevent executing non-verified binaries. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2013-02-06 10:40:29 -05:00
Dmitry Kasatkin	74de668424	evm: add file system uuid to EVM hmac EVM uses the same key for all file systems to calculate the HMAC, making it possible to paste inodes from one file system on to another one, without EVM being able to detect it. To prevent such an attack, it is necessary to make the EVM HMAC file system specific. This patch uses the file system UUID, a file system unique identifier, to bind the EVM HMAC to the file system. The value inode->i_sb->s_uuid is used for the HMAC hash calculation, instead of using it for deriving the file system specific key. Initializing the key for every inode HMAC calculation is a bit more expensive operation than adding the uuid to the HMAC hash. Changing the HMAC calculation method or adding additional info to the calculation, requires existing EVM labeled file systems to be relabeled. This patch adds a Kconfig HMAC version option for backwards compatability. Changelog v1: - squash "hmac version setting" Changelog v0: - add missing Kconfig depends (Mimi) Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2013-02-06 10:40:28 -05:00
Linus Torvalds	22f8379815	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking updates from David Miller: "Much more accumulated than I would have liked due to an unexpected bout with a nasty flu: 1) AH and ESP input don't set ECN field correctly because the transport head of the SKB isn't set correctly, fix from Li RongQing. 2) If netfilter conntrack zones are disabled, we can return an uninitialized variable instead of the proper error code. Fix from Borislav Petkov. 3) Fix double SKB free in ath9k driver beacon handling, from Felix Feitkau. 4) Remove bogus assumption about netns cleanup ordering in nf_conntrack, from Pablo Neira Ayuso. 5) Remove a bogus BUG_ON in the new TCP fastopen code, from Eric Dumazet. It uses spin_is_locked() in it's test and is therefore unsuitable for UP. 6) Fix SELINUX labelling regressions added by the tuntap multiqueue changes, from Paul Moore. 7) Fix CRC errors with jumbo frame receive in tg3 driver, from Nithin Nayak Sujir. 8) CXGB4 driver sets interrupt coalescing parameters only on first queue, rather than all of them. Fix from Thadeu Lima de Souza Cascardo. 9) Fix regression in the dispatch of read/write registers in dm9601 driver, from Tushar Behera. 10) ipv6_append_data miscalculates header length, from Romain KUNTZ. 11) Fix PMTU handling regressions on ipv4 routes, from Steffen Klassert, Timo Teräs, and Julian Anastasov. 12) In 3c574_cs driver, add necessary parenthesis to "x << y & z" expression. From Nickolai Zeldovich. 13) macvlan_get_size() causes underallocation netlink message space, fix from Eric Dumazet. 14) Avoid division by zero in xfrm_replay_advance_bmp(), from Nickolai Zeldovich. Amusingly the zero check was already there, we were just performing it after the modulus :-) 15) Some more splice bug fixes from Eric Dumazet, which fix things mostly eminating from how we now more aggressively use high-order pages in SKBs. 16) Fix size calculation bug when freeing hash tables in the IPSEC xfrm code, from Michal Kubecek. 17) Fix PMTU event propagation into socket cached routes, from Steffen Klassert. 18) Fix off by one in TX buffer release in netxen driver, from Eric Dumazet. 19) Fix rediculous memory allocation requirements introduced by the tuntap multiqueue changes, from Jason Wang. 20) Remove bogus AMD platform workaround in r8169 driver that causes major problems in normal operation, from Timo Teräs. 21) virtio-net set affinity and select queue don't handle discontiguous cpu numbers properly, fix from Wanlong Gao. 22) Fix a route refcounting issue in loopback driver, from Eric Dumazet. There's a similar fix coming that we might add to the macvlan driver as well. 23) Fix SKB leaks in batman-adv's distributed arp table code, from Matthias Schiffer. 24) r8169 driver gives descriptor ownership back the hardware before we're done reading the VLAN tag out of it, fix from Francois Romieu. 25) Checksums not calculated properly in GRE tunnel driver fix from Pravin B Shelar. 26) Fix SCTP memory leak on namespace exit." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (101 commits) dm9601: support dm9620 variant SCTP: Free the per-net sysctl table on net exit. v2 net: phy: icplus: fix broken INTR pin settings net: phy: icplus: Use the RGMII interface mode to configure clock delays IP_GRE: Fix kernel panic in IP_GRE with GRE csum. sctp: set association state to established in dupcook_a handler ip6mr: limit IPv6 MRT_TABLE identifiers r8169: fix vlan tag read ordering. net: cdc_ncm: use IAD provided by the USB core batman-adv: filter ARP packets with invalid MAC addresses in DAT batman-adv: check for more types of invalid IP addresses in DAT batman-adv: fix skb leak in batadv_dat_snoop_incoming_arp_reply() net: loopback: fix a dst refcounting issue virtio-net: reset virtqueue affinity when doing cpu hotplug virtio-net: split out clean affinity function virtio-net: fix the set affinity bug when CPU IDs are not consecutive can: pch_can: fix invalid error codes can: ti_hecc: fix invalid error codes can: c_can: fix invalid error codes r8169: remove the obsolete and incorrect AMD workaround ...	2013-01-28 11:41:37 -08:00
Mimi Zohar	5a73fcfa88	ima: differentiate appraise status only for hook specific rules Different hooks can require different methods for appraising a file's integrity. As a result, an integrity appraisal status is cached on a per hook basis. Only a hook specific rule, requires the inode to be re-appraised. This patch eliminates unnecessary appraisals. Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>	2013-01-22 16:10:39 -05:00
Mimi Zohar	d79d72e024	ima: per hook cache integrity appraisal status With the new IMA policy 'appraise_type=' option, different hooks can require different methods for appraising a file's integrity. For example, the existing 'ima_appraise_tcb' policy defines a generic rule, requiring all root files to be appraised, without specfying the appraisal method. A more specific rule could require all kernel modules, for example, to be signed. appraise fowner=0 func=MODULE_CHECK appraise_type=imasig appraise fowner=0 As a result, the integrity appraisal results for the same inode, but for different hooks, could differ. This patch caches the integrity appraisal results on a per hook basis. Changelog v2: - Rename ima_cache_status() to ima_set_cache_status() - Rename and move get_appraise_status() to ima_get_cache_status() Changelog v0: - include IMA_APPRAISE/APPRAISED_SUBMASK in IMA_DO/DONE_MASK (Dmitry) - Support independent MODULE_CHECK appraise status. - fixed IMA_XXXX_APPRAISE/APPRAISED flags Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>	2013-01-22 16:10:36 -05:00
Mimi Zohar	f578c08ec9	ima: increase iint flag size In preparation for hook specific appraise status results, increase the iint flags size. Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>	2013-01-22 16:10:34 -05:00
Dmitry Kasatkin	0e5a247cb3	ima: added policy support for 'security.ima' type The 'security.ima' extended attribute may contain either the file data's hash or a digital signature. This patch adds support for requiring a specific extended attribute type. It extends the IMA policy with a new keyword 'appraise_type=imasig'. (Default is hash.) Changelog v2: - Fixed Documentation/ABI/testing/ima_policy option syntax Changelog v1: - Differentiate between 'required' vs. 'actual' extended attribute Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2013-01-22 16:10:31 -05:00
Jerry Snitselaar	103a197c0c	security/device_cgroup: lock assert fails in dev_exception_clean() devcgroup_css_free() calls dev_exception_clean() without the devcgroup_mutex being locked. Shutting down a kvm virt was giving me the following trace: [36280.732764] ------------[ cut here ]------------ [36280.732778] WARNING: at /home/snits/dev/linux/security/device_cgroup.c:172 dev_exception_clean+0xa9/0xc0() [36280.732782] Hardware name: Studio XPS 8100 [36280.732785] Modules linked in: xt_REDIRECT fuse ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_CHECKSUM iptable_mangle bridge stp llc nf_conntrack_ipv4 ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter it87 hwmon_vid xt_state nf_conntrack ip6_tables snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq coretemp snd_seq_device crc32c_intel snd_pcm snd_page_alloc snd_timer snd broadcom tg3 serio_raw i7core_edac edac_core ptp pps_core lpc_ich pcspkr mfd_core soundcore microcode i2c_i801 nfsd auth_rpcgss nfs_acl lockd vhost_net sunrpc tun macvtap macvlan kvm_intel kvm uinput binfmt_misc autofs4 usb_storage firewire_ohci firewire_core crc_itu_t radeon drm_kms_helper ttm [36280.732921] Pid: 933, comm: libvirtd Tainted: G W 3.8.0-rc3-00307-g4c217de #1 [36280.732922] Call Trace: [36280.732927] [<ffffffff81044303>] warn_slowpath_common+0x93/0xc0 [36280.732930] [<ffffffff8104434a>] warn_slowpath_null+0x1a/0x20 [36280.732932] [<ffffffff812deaf9>] dev_exception_clean+0xa9/0xc0 [36280.732934] [<ffffffff812deb2a>] devcgroup_css_free+0x1a/0x30 [36280.732938] [<ffffffff810ccd76>] cgroup_diput+0x76/0x210 [36280.732941] [<ffffffff8119eac0>] d_delete+0x120/0x180 [36280.732943] [<ffffffff81195cff>] vfs_rmdir+0xef/0x130 [36280.732945] [<ffffffff81195e47>] do_rmdir+0x107/0x1c0 [36280.732949] [<ffffffff8132d17e>] ? trace_hardirqs_on_thunk+0x3a/0x3f [36280.732951] [<ffffffff81198646>] sys_rmdir+0x16/0x20 [36280.732954] [<ffffffff8173bd82>] system_call_fastpath+0x16/0x1b [36280.732956] ---[ end trace ca39dced899a7d9f ]--- Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com> Cc: stable@kernel.org Signed-off-by: James Morris <james.l.morris@oracle.com>	2013-01-22 00:27:55 +11:00
Dmitry Kasatkin	a67adb9974	evm: checking if removexattr is not a NULL The following lines of code produce a kernel oops. fd = socket(PF_FILE, SOCK_STREAM\|SOCK_CLOEXEC\|SOCK_NONBLOCK, 0); fchmod(fd, 0666); [ 139.922364] BUG: unable to handle kernel NULL pointer dereference at (null) [ 139.924982] IP: [< (null)>] (null) [ 139.924982] *pde = 00000000 [ 139.924982] Oops: 0000 [#5] SMP [ 139.924982] Modules linked in: fuse dm_crypt dm_mod i2c_piix4 serio_raw evdev binfmt_misc button [ 139.924982] Pid: 3070, comm: acpid Tainted: G D 3.8.0-rc2-kds+ #465 Bochs Bochs [ 139.924982] EIP: 0060:[<00000000>] EFLAGS: 00010246 CPU: 0 [ 139.924982] EIP is at 0x0 [ 139.924982] EAX: cf5ef000 EBX: cf5ef000 ECX: c143d600 EDX: c15225f2 [ 139.924982] ESI: cf4d2a1c EDI: cf4d2a1c EBP: cc02df10 ESP: cc02dee4 [ 139.924982] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 [ 139.924982] CR0: 80050033 CR2: 00000000 CR3: 0c059000 CR4: 000006d0 [ 139.924982] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 [ 139.924982] DR6: ffff0ff0 DR7: 00000400 [ 139.924982] Process acpid (pid: 3070, ti=cc02c000 task=d7705340 task.ti=cc02c000) [ 139.924982] Stack: [ 139.924982] c1203c88 00000000 cc02def4 cf4d2a1c ae21eefa 471b60d5 1083c1ba c26a5940 [ 139.924982] e891fb5e 00000041 00000004 cc02df1c c1203964 00000000 cc02df4c c10e20c3 [ 139.924982] 00000002 00000000 00000000 22222222 c1ff2222 cf5ef000 00000000 d76efb08 [ 139.924982] Call Trace: [ 139.924982] [<c1203c88>] ? evm_update_evmxattr+0x5b/0x62 [ 139.924982] [<c1203964>] evm_inode_post_setattr+0x22/0x26 [ 139.924982] [<c10e20c3>] notify_change+0x25f/0x281 [ 139.924982] [<c10cbf56>] chmod_common+0x59/0x76 [ 139.924982] [<c10e27a1>] ? put_unused_fd+0x33/0x33 [ 139.924982] [<c10cca09>] sys_fchmod+0x39/0x5c [ 139.924982] [<c13f4f30>] syscall_call+0x7/0xb [ 139.924982] Code: Bad EIP value. This happens because sockets do not define the removexattr operation. Before removing the xattr, verify the removexattr function pointer is not NULL. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: James Morris <james.l.morris@oracle.com>	2013-01-22 00:27:50 +11:00
Dmitry Kasatkin	a175b8bb29	ima: forbid write access to files with digital signatures This patch forbids write access to files with digital signatures, as they are considered immutable. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2013-01-16 17:50:05 -05:00
Dmitry Kasatkin	ea1046d4c5	ima: move full pathname resolution to separate function Define a new function ima_d_path(), which returns the full pathname. This function will be used further, for example, by the directory verification code. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2013-01-16 17:50:03 -05:00
Dmitry Kasatkin	ee86633174	integrity: reduce storage size for ima_status and evm_status This patch reduces size of the iint structure by 8 bytes. It saves about 15% of iint cache memory. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2013-01-16 17:50:01 -05:00
Mimi Zohar	16cac49f72	ima: rename FILE_MMAP to MMAP_CHECK Rename FILE_MMAP hook to MMAP_CHECK to be consistent with the other hook names. Signed-off-by: Mimi Zohar <zohar@us.ibm.com>	2013-01-16 17:49:59 -05:00
Dmitry Kasatkin	b51524635b	ima: remove security.ima hexdump Hexdump is not really helping. Audit messages prints error messages. Remove it. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2013-01-16 17:49:57 -05:00
Dmitry Kasatkin	750943a307	ima: remove enforce checking duplication Based on the IMA appraisal policy, files are appraised. For those files appraised, the IMA hooks return the integrity appraisal result, assuming IMA-appraisal is in enforcing mode. This patch combines both of these criteria (in policy and enforcing file integrity), removing the checking duplication. Changelog v1: - Update hook comments Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2013-01-16 17:49:44 -05:00
Dmitry Kasatkin	def3e8b9ee	ima: set appraise status in fix mode only when xattr is fixed When a file system is mounted read-only, setting the xattr value in fix mode fails with an error code -EROFS. The xattr should be fixed after the file system is remounted read-write. This patch verifies that the set xattr succeeds, before setting the appraise status value to INTEGRITY_PASS. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2013-01-16 15:47:07 -05:00
Dmitry Kasatkin	e90805656d	evm: remove unused cleanup functions EVM cannot be built as a kernel module. Remove the unncessary __exit functions. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2013-01-16 15:47:05 -05:00
Mimi Zohar	7163a99384	ima: re-initialize IMA policy LSM info Although the IMA policy does not change, the LSM policy can be reloaded, leaving the IMA LSM based rules referring to the old, stale LSM policy. This patch updates the IMA LSM based rules to reflect the reloaded LSM policy. Reported-by: Sven Vermeulen <sven.vermeulen@siphos.be> tested-by: Sven Vermeulen <sven.vermeulen@siphos.be> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: Eric Paris <eparis@parisplace.org> Cc: Casey Schaufler <casey@schaufler-ca.com>	2013-01-16 15:47:03 -05:00
Paul Moore	5dbbaf2de8	tun: fix LSM/SELinux labeling of tun/tap devices This patch corrects some problems with LSM/SELinux that were introduced with the multiqueue patchset. The problem stems from the fact that the multiqueue work changed the relationship between the tun device and its associated socket; before the socket persisted for the life of the device, however after the multiqueue changes the socket only persisted for the life of the userspace connection (fd open). For non-persistent devices this is not an issue, but for persistent devices this can cause the tun device to lose its SELinux label. We correct this problem by adding an opaque LSM security blob to the tun device struct which allows us to have the LSM security state, e.g. SELinux labeling information, persist for the lifetime of the tun device. In the process we tweak the LSM hooks to work with this new approach to TUN device/socket labeling and introduce a new LSM hook, security_tun_dev_attach_queue(), to approve requests to attach to a TUN queue via TUNSETQUEUE. The SELinux code has been adjusted to match the new LSM hooks, the other LSMs do not make use of the LSM TUN controls. This patch makes use of the recently added "tun_socket:attach_queue" permission to restrict access to the TUNSETQUEUE operation. On older SELinux policies which do not define the "tun_socket:attach_queue" permission the access control decision for TUNSETQUEUE will be handled according to the SELinux policy's unknown permission setting. Signed-off-by: Paul Moore <pmoore@redhat.com> Acked-by: Eric Paris <eparis@parisplace.org> Tested-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-14 18:16:59 -05:00
Paul Moore	6f96c142f7	selinux: add the "attach_queue" permission to the "tun_socket" class Add a new permission to align with the new TUN multiqueue support, "tun_socket:attach_queue". The corresponding SELinux reference policy patch is show below: diff --git a/policy/flask/access_vectors b/policy/flask/access_vectors index 28802c5..a0664a1 100644 --- a/policy/flask/access_vectors +++ b/policy/flask/access_vectors @@ -827,6 +827,9 @@ class kernel_service class tun_socket inherits socket +{ + attach_queue +} class x_pointer inherits x_device Signed-off-by: Paul Moore <pmoore@redhat.com> Acked-by: Eric Paris <eparis@parisplace.org> Tested-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-14 18:16:59 -05:00
Mimi Zohar	a7f2a366f6	ima: fallback to MODULE_SIG_ENFORCE for existing kernel module syscall The new kernel module syscall appraises kernel modules based on policy. If the IMA policy requires kernel module checking, fallback to module signature enforcing for the existing syscall. Without CONFIG_MODULE_SIG_FORCE enabled, the kernel module's integrity is unknown, return -EACCES. Changelog v1: - Fix ima_module_check() return result (Tetsuo Handa) Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Mimi Zohar <zohar@us.ibm.com>	2012-12-24 09:35:48 -05:00
Alan Cox	e67eab39be	keys: fix unreachable code We set ret to NULL then test it. Remove the bogus test Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-20 17:40:21 -08:00
Linus Torvalds	9eb127cc04	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Really fix tuntap SKB use after free bug, from Eric Dumazet. 2) Adjust SKB data pointer to point past the transport header before calling icmpv6_notify() so that the headers are in the state which that function expects. From Duan Jiong. 3) Fix ambiguities in the new tuntap multi-queue APIs. From Jason Wang. 4) mISDN needs to use del_timer_sync(), from Konstantin Khlebnikov. 5) Don't destroy mutex after freeing up device private in mac802154, fix also from Konstantin Khlebnikov. 6) Fix INET request socket leak in TCP and DCCP, from Christoph Paasch. 7) SCTP HMAC kconfig rework, from Neil Horman. 8) Fix SCTP jprobes function signature, otherwise things explode, from Daniel Borkmann. 9) Fix typo in ipv6-offload Makefile variable reference, from Simon Arlott. 10) Don't fail USBNET open just because remote wakeup isn't supported, from Oliver Neukum. 11) be2net driver bug fixes from Sathya Perla. 12) SOLOS PCI ATM driver bug fixes from Nathan Williams and David Woodhouse. 13) Fix MTU changing regression in 8139cp driver, from John Greene. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (45 commits) solos-pci: ensure all TX packets are aligned to 4 bytes solos-pci: add firmware upgrade support for new models solos-pci: remove superfluous debug output solos-pci: add GPIO support for newer versions on Geos board 8139cp: Prevent dev_close/cp_interrupt race on MTU change net: qmi_wwan: add ZTE MF880 drivers/net: Use of_match_ptr() macro in smsc911x.c drivers/net: Use of_match_ptr() macro in smc91x.c ipv6: addrconf.c: remove unnecessary "if" bridge: Correctly encode addresses when dumping mdb entries bridge: Do not unregister all PF_BRIDGE rtnl operations use generic usbnet_manage_power() usbnet: generic manage_power() usbnet: handle PM failure gracefully ksz884x: fix receive polling race condition qlcnic: update driver version qlcnic: fix unused variable warnings net: fec: forbid FEC_PTP on SoCs that do not support be2net: fix wrong frag_idx reported by RX CQ be2net: fix be_close() to ensure all events are ack'ed ...	2012-12-19 20:29:15 -08:00
Linus Torvalds	7a684c452e	Nothing all that exciting; a new module-from-fd syscall for those who want to verify the source of the module (ChromeOS) and/or use standard IMA on it or other security hooks. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJQ0VKlAAoJENkgDmzRrbjxjuEQALVHpD1cSmryOzVwkNn7rVGP PV3KVbUs+qzUCm2c3AafIIlSBm2LOUl+cR3uNC7di8aHarRF3VHkK2OQ4Fx97ECd KKBqAyY3R0q1mAKujb/MWwiK0YgosEDIOzGGn2yQhNFsxKqnMB02P4j82IO7+g+w Cc3XuDyWHoH2I+ySgz0Q8NHAqufD/DMZUKud7jw2Lsv6PuICJ1Oqgl/Gd/muxort 4a5tV3tjhRGywHS/8b2fbDUXkybC5NKK0FN+gyoaROmJ/THeHEQDGXZT9bc2vmVx HvRy/5k8dzQ6LAJ2mLnPvy0pmv0u7NYMvjxTxxUlUkFMkYuVticikQfwSYDbDPt4 mbsLxchpgi8z4x8HltEERffCX5tldo/5hz1uemqhqIsMRIrRFnlHkSIgkGjVHf2u LXQBLT8uTm6C0VyNQPrI/hUZzIax7WtKbPSoK9lmExNbKqloEFh/mVXvfQxei2kp wnUZcnmPIqSvw7b4CWu7HibMYu2VvGBgm3YIfJRi4AQme1mzFYLpZoxF5Pj+Ykbt T//Hb1EsNQTTFCg7MZhnJSAw/EVUvNDUoullORClyqw6+xxjVKqWpPJgYDRfWOlJ Xa+s7DNrL+Oo1WWR8l5ruoQszbR8szIyeyPKKxRUcQj2zsqghoWuzKAx2saSEw3W pNkoJU+dGC7kG/yVAS8N =uoJj -----END PGP SIGNATURE----- Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module update from Rusty Russell: "Nothing all that exciting; a new module-from-fd syscall for those who want to verify the source of the module (ChromeOS) and/or use standard IMA on it or other security hooks." * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: MODSIGN: Fix kbuild output when using default extra_certificates MODSIGN: Avoid using .incbin in C source modules: don't hand 0 to vmalloc. module: Remove a extra null character at the top of module->strtab. ASN.1: Use the ASN1_LONG_TAG and ASN1_INDEFINITE_LENGTH constants ASN.1: Define indefinite length marker constant moduleparam: use __UNIQUE_ID() __UNIQUE_ID() MODSIGN: Add modules_sign make target powerpc: add finit_module syscall. ima: support new kernel module syscall add finit_module syscall to asm-generic ARM: add finit_module syscall to ARM security: introduce kernel_module_from_file hook module: add flags arg to sys_finit_module() module: add syscall to load module from fd	2012-12-19 07:55:08 -08:00
Linus Torvalds	a2faf2fc53	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull (again) user namespace infrastructure changes from Eric Biederman: "Those bugs, those darn embarrasing bugs just want don't want to get fixed. Linus I just updated my mirror of your kernel.org tree and it appears you successfully pulled everything except the last 4 commits that fix those embarrasing bugs. When you get a chance can you please repull my branch" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: userns: Fix typo in description of the limitation of userns_install userns: Add a more complete capability subset test to commit_creds userns: Require CAP_SYS_ADMIN for most uses of setns. Fix cap_capable to only allow owners in the parent user namespace to have caps.	2012-12-18 10:55:28 -08:00
Linus Torvalds	6a2b60b17b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace changes from Eric Biederman: "While small this set of changes is very significant with respect to containers in general and user namespaces in particular. The user space interface is now complete. This set of changes adds support for unprivileged users to create user namespaces and as a user namespace root to create other namespaces. The tyranny of supporting suid root preventing unprivileged users from using cool new kernel features is broken. This set of changes completes the work on setns, adding support for the pid, user, mount namespaces. This set of changes includes a bunch of basic pid namespace cleanups/simplifications. Of particular significance is the rework of the pid namespace cleanup so it no longer requires sending out tendrils into all kinds of unexpected cleanup paths for operation. At least one case of broken error handling is fixed by this cleanup. The files under /proc/<pid>/ns/ have been converted from regular files to magic symlinks which prevents incorrect caching by the VFS, ensuring the files always refer to the namespace the process is currently using and ensuring that the ptrace_mayaccess permission checks are always applied. The files under /proc/<pid>/ns/ have been given stable inode numbers so it is now possible to see if different processes share the same namespaces. Through the David Miller's net tree are changes to relax many of the permission checks in the networking stack to allowing the user namespace root to usefully use the networking stack. Similar changes for the mount namespace and the pid namespace are coming through my tree. Two small changes to add user namespace support were commited here adn in David Miller's -net tree so that I could complete the work on the /proc/<pid>/ns/ files in this tree. Work remains to make it safe to build user namespaces and 9p, afs, ceph, cifs, coda, gfs2, ncpfs, nfs, nfsd, ocfs2, and xfs so the Kconfig guard remains in place preventing that user namespaces from being built when any of those filesystems are enabled. Future design work remains to allow root users outside of the initial user namespace to mount more than just /proc and /sys." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (38 commits) proc: Usable inode numbers for the namespace file descriptors. proc: Fix the namespace inode permission checks. proc: Generalize proc inode allocation userns: Allow unprivilged mounts of proc and sysfs userns: For /proc/self/{uid,gid}_map derive the lower userns from the struct file procfs: Print task uids and gids in the userns that opened the proc file userns: Implement unshare of the user namespace userns: Implent proc namespace operations userns: Kill task_user_ns userns: Make create_new_namespaces take a user_ns parameter userns: Allow unprivileged use of setns. userns: Allow unprivileged users to create new namespaces userns: Allow setting a userns mapping to your current uid. userns: Allow chown and setgid preservation userns: Allow unprivileged users to create user namespaces. userns: Ignore suid and sgid on binaries if the uid or gid can not be mapped userns: fix return value on mntns_install() failure vfs: Allow unprivileged manipulation of the mount namespace. vfs: Only support slave subtrees across different user namespaces vfs: Add a user namespace reference from struct mnt_namespace ...	2012-12-17 15:44:47 -08:00
Linus Torvalds	2a74dbb9a8	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "A quiet cycle for the security subsystem with just a few maintenance updates." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: Smack: create a sysfs mount point for smackfs Smack: use select not depends in Kconfig Yama: remove locking from delete path Yama: add RCU to drop read locking drivers/char/tpm: remove tasklet and cleanup KEYS: Use keyring_alloc() to create special keyrings KEYS: Reduce initial permissions on keys KEYS: Make the session and process keyrings per-thread seccomp: Make syscall skipping and nr changes more consistent key: Fix resource leak keys: Fix unreachable code KEYS: Add payload preparsing opportunity prior to key instantiate or update	2012-12-16 15:40:50 -08:00
Amerigo Wang	9dd9ff9953	bridge: update selinux perm table for RTM_NEWMDB and RTM_DELMDB Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-15 17:14:38 -08:00
Eric W. Biederman	520d9eabce	Fix cap_capable to only allow owners in the parent user namespace to have caps. Andy Lutomirski pointed out that the current behavior of allowing the owner of a user namespace to have all caps when that owner is not in a parent user namespace is wrong. Add a test to ensure the owner of a user namespace is in the parent of the user namespace to fix this bug. Thankfully this bug did not apply to the initial user namespace, keeping the mischief that can be caused by this bug quite small. This is bug was introduced in v3.5 by commit `783291e690` "Simplify the user_namespace by making userns->creator a kuid." But did not matter until the permisions required to create a user namespace were relaxed allowing a user namespace to be created inside of a user namespace. The bug made it possible for the owner of a user namespace to be present in a child user namespace. Since the owner of a user nameapce is granted all capabilities it became possible for users in a grandchild user namespace to have all privilges over their parent user namspace. Reorder the checks in cap_capable. This should make the common case faster and make it clear that nothing magic happens in the initial user namespace. The reordering is safe because cred->user_ns can only be in targ_ns or targ_ns->parent but not both. Add a comment a the top of the loop to make the logic of the code clear. Add a distinct variable ns that changes as we walk up the user namespace hierarchy to make it clear which variable is changing. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2012-12-14 13:50:32 -08:00
Casey Schaufler	e930723741	Smack: create a sysfs mount point for smackfs There are a number of "conventions" for where to put LSM filesystems. Smack adheres to none of them. Create a mount point at /sys/fs/smackfs for mounting smackfs so that Smack can be conventional. Targeted for git://git.gitorious.org/smack-next/kernel.git Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2012-12-14 10:57:23 -08:00
Casey Schaufler	111fe8bd65	Smack: use select not depends in Kconfig The components NETLABEL and SECURITY_NETWORK are required by Smack. Using "depends" in Kconfig hides the Smack option if the user hasn't figured out that they need to be enabled while using make menuconfig. Using select is a better choice. Because select is not recursive depends on NET and SECURITY are added. The reflects similar usage in TOMOYO and AppArmor. Targeted for git://git.gitorious.org/smack-next/kernel.git Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2012-12-14 10:57:10 -08:00
Mimi Zohar	fdf90729e5	ima: support new kernel module syscall With the addition of the new kernel module syscall, which defines two arguments - a file descriptor to the kernel module and a pointer to a NULL terminated string of module arguments - it is now possible to measure and appraise kernel modules like any other file on the file system. This patch adds support to measure and appraise kernel modules in an extensible and consistent manner. To support filesystems without extended attribute support, additional patches could pass the signature as the first parameter. Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-12-14 13:05:26 +10:30
Kees Cook	2e72d51b4a	security: introduce kernel_module_from_file hook Now that kernel module origins can be reasoned about, provide a hook to the LSMs to make policy decisions about the module file. This will let Chrome OS enforce that loadable kernel modules can only come from its read-only hash-verified root filesystem. Other LSMs can, for example, read extended attributes for signatures, etc. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com> Acked-by: Eric Paris <eparis@redhat.com> Acked-by: Mimi Zohar <zohar@us.ibm.com> Acked-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-12-14 13:05:24 +10:30
Linus Torvalds	a2013a13e6	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial branch from Jiri Kosina: "Usual stuff -- comment/printk typo fixes, documentation updates, dead code elimination." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) HOWTO: fix double words typo x86 mtrr: fix comment typo in mtrr_bp_init propagate name change to comments in kernel source doc: Update the name of profiling based on sysfs treewide: Fix typos in various drivers treewide: Fix typos in various Kconfig wireless: mwifiex: Fix typo in wireless/mwifiex driver messages: i2o: Fix typo in messages/i2o scripts/kernel-doc: check that non-void fcts describe their return value Kernel-doc: Convention: Use a "Return" section to describe return values radeon: Fix typo and copy/paste error in comments doc: Remove unnecessary declarations from Documentation/accounting/getdelays.c various: Fix spelling of "asynchronous" in comments. Fix misspellings of "whether" in comments. eisa: Fix spelling of "asynchronous". various: Fix spelling of "registered" in comments. doc: fix quite a few typos within Documentation target: iscsi: fix comment typos in target/iscsi drivers treewide: fix typo of "suport" in various comments and Kconfig treewide: fix typo of "suppport" in various comments ...	2012-12-13 12:00:02 -08:00
Linus Torvalds	6be35c700f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking changes from David Miller: 1) Allow to dump, monitor, and change the bridge multicast database using netlink. From Cong Wang. 2) RFC 5961 TCP blind data injection attack mitigation, from Eric Dumazet. 3) Networking user namespace support from Eric W. Biederman. 4) tuntap/virtio-net multiqueue support by Jason Wang. 5) Support for checksum offload of encapsulated packets (basically, tunneled traffic can still be checksummed by HW). From Joseph Gasparakis. 6) Allow BPF filter access to VLAN tags, from Eric Dumazet and Daniel Borkmann. 7) Bridge port parameters over netlink and BPDU blocking support from Stephen Hemminger. 8) Improve data access patterns during inet socket demux by rearranging socket layout, from Eric Dumazet. 9) TIPC protocol updates and cleanups from Ying Xue, Paul Gortmaker, and Jon Maloy. 10) Update TCP socket hash sizing to be more in line with current day realities. The existing heurstics were choosen a decade ago. From Eric Dumazet. 11) Fix races, queue bloat, and excessive wakeups in ATM and associated drivers, from Krzysztof Mazur and David Woodhouse. 12) Support DOVE (Distributed Overlay Virtual Ethernet) extensions in VXLAN driver, from David Stevens. 13) Add "oops_only" mode to netconsole, from Amerigo Wang. 14) Support set and query of VEB/VEPA bridge mode via PF_BRIDGE, also allow DCB netlink to work on namespaces other than the initial namespace. From John Fastabend. 15) Support PTP in the Tigon3 driver, from Matt Carlson. 16) tun/vhost zero copy fixes and improvements, plus turn it on by default, from Michael S. Tsirkin. 17) Support per-association statistics in SCTP, from Michele Baldessari. And many, many, driver updates, cleanups, and improvements. Too numerous to mention individually. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1722 commits) net/mlx4_en: Add support for destination MAC in steering rules net/mlx4_en: Use generic etherdevice.h functions. net: ethtool: Add destination MAC address to flow steering API bridge: add support of adding and deleting mdb entries bridge: notify mdb changes via netlink ndisc: Unexport ndisc_{build,send}_skb(). uapi: add missing netconf.h to export list pkt_sched: avoid requeues if possible solos-pci: fix double-free of TX skb in DMA mode bnx2: Fix accidental reversions. bna: Driver Version Updated to 3.1.2.1 bna: Firmware update bna: Add RX State bna: Rx Page Based Allocation bna: TX Intr Coalescing Fix bna: Tx and Rx Optimizations bna: Code Cleanup and Enhancements ath9k: check pdata variable before dereferencing it ath5k: RX timestamp is reported at end of frame ath9k_htc: RX timestamp is reported at end of frame ...	2012-12-12 18:07:07 -08:00
Linus Torvalds	d206e09036	Merge branch 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup changes from Tejun Heo: "A lot of activities on cgroup side. The big changes are focused on making cgroup hierarchy handling saner. - cgroup_rmdir() had peculiar semantics - it allowed cgroup destruction to be vetoed by individual controllers and tried to drain refcnt synchronously. The vetoing never worked properly and caused good deal of contortions in cgroup. memcg was the last reamining user. Michal Hocko removed the usage and cgroup_rmdir() path has been simplified significantly. This was done in a separate branch so that the memcg people can base further memcg changes on top. - The above allowed cleaning up cgroup lifecycle management and implementation of generic cgroup iterators which are used to improve hierarchy support. - cgroup_freezer updated to allow migration in and out of a frozen cgroup and handle hierarchy. If a cgroup is frozen, all descendant cgroups are frozen. - netcls_cgroup and netprio_cgroup updated to handle hierarchy properly. - Various fixes and cleanups. - Two merge commits. One to pull in memcg and rmdir cleanups (needed to build iterators). The other pulled in cgroup/for-3.7-fixes for device_cgroup fixes so that further device_cgroup patches can be stacked on top." Fixed up a trivial conflict in mm/memcontrol.c as per Tejun (due to commit `bea8c150a7` ("memcg: fix hotplugged memory zone oops") in master touching code close to commit `2ef37d3fe4` ("memcg: Simplify mem_cgroup_force_empty_list error handling") in for-3.8) * 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (65 commits) cgroup: update Documentation/cgroups/00-INDEX cgroup_rm_file: don't delete the uncreated files cgroup: remove subsystem files when remounting cgroup cgroup: use cgroup_addrm_files() in cgroup_clear_directory() cgroup: warn about broken hierarchies only after css_online cgroup: list_del_init() on removed events cgroup: fix lockdep warning for event_control cgroup: move list add after list head initilization netprio_cgroup: allow nesting and inherit config on cgroup creation netprio_cgroup: implement netprio[_set]_prio() helpers netprio_cgroup: use cgroup->id instead of cgroup_netprio_state->prioidx netprio_cgroup: reimplement priomap expansion netprio_cgroup: shorten variable names in extend_netdev_table() netprio_cgroup: simplify write_priomap() netcls_cgroup: move config inheritance to ->css_online() and remove .broken_hierarchy marking cgroup: remove obsolete guarantee from cgroup_task_migrate. cgroup: add cgroup->id cgroup, cpuset: remove cgroup_subsys->post_clone() cgroup: s/CGRP_CLONE_CHILDREN/CGRP_CPUSET_CLONE_CHILDREN/ cgroup: rename ->create/post_create/pre_destroy/destroy() to ->css_alloc/online/offline/free() ...	2012-12-12 08:18:24 -08:00
Cong Wang	6e73d71d84	rtnetlink: add missing message types to selinux perm table Rebased on the latest net-next tree. RTM_NEWNETCONF and RTM_GETNETCONF are missing in this table. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-10 14:09:01 -05:00
Cong Wang	ee07c6e7a6	bridge: export multicast database via netlink V5: fix two bugs pointed out by Thomas remove seq check for now, mark it as TODO V4: remove some useless #include some coding style fix V3: drop debugging printk's update selinux perm table as well V2: drop patch 1/2, export ifindex directly Redesign netlink attributes Improve netlink seq check Handle IPv6 addr as well This patch exports bridge multicast database via netlink message type RTM_GETMDB. Similar to fdb, but currently bridge-specific. We may need to support modify multicast database too (RTM_{ADD,DEL}MDB). (Thanks to Thomas for patient reviews) Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Graf <tgraf@suug.ch> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-07 14:32:52 -05:00
Dave Jones	88a693b5c1	selinux: fix sel_netnode_insert() suspicious rcu dereference =============================== [ INFO: suspicious RCU usage. ] 3.5.0-rc1+ #63 Not tainted ------------------------------- security/selinux/netnode.c:178 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 1 lock held by trinity-child1/8750: #0: (sel_netnode_lock){+.....}, at: [<ffffffff812d8f8a>] sel_netnode_sid+0x16a/0x3e0 stack backtrace: Pid: 8750, comm: trinity-child1 Not tainted 3.5.0-rc1+ #63 Call Trace: [<ffffffff810cec2d>] lockdep_rcu_suspicious+0xfd/0x130 [<ffffffff812d91d1>] sel_netnode_sid+0x3b1/0x3e0 [<ffffffff812d8e20>] ? sel_netnode_find+0x1a0/0x1a0 [<ffffffff812d24a6>] selinux_socket_bind+0xf6/0x2c0 [<ffffffff810cd1dd>] ? trace_hardirqs_off+0xd/0x10 [<ffffffff810cdb55>] ? lock_release_holdtime.part.9+0x15/0x1a0 [<ffffffff81093841>] ? lock_hrtimer_base+0x31/0x60 [<ffffffff812c9536>] security_socket_bind+0x16/0x20 [<ffffffff815550ca>] sys_bind+0x7a/0x100 [<ffffffff816c03d5>] ? sysret_check+0x22/0x5d [<ffffffff810d392d>] ? trace_hardirqs_on_caller+0x10d/0x1a0 [<ffffffff8133b09e>] ? trace_hardirqs_on_thunk+0x3a/0x3f [<ffffffff816c03a9>] system_call_fastpath+0x16/0x1b This patch below does what Paul McKenney suggested in the previous thread. Signed-off-by: Dave Jones <davej@redhat.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Paul Moore <paul@paul-moore.com> Cc: Eric Paris <eparis@parisplace.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-11-21 21:55:32 +11:00
Kees Cook	235e752789	Yama: remove locking from delete path Instead of locking the list during a delete, mark entries as invalid and trigger a workqueue to clean them up. This lets us easily handle task_free from interrupt context. Signed-off-by: Kees Cook <keescook@chromium.org>	2012-11-20 10:32:08 -08:00
Kees Cook	93b69d437e	Yama: add RCU to drop read locking Stop using spinlocks in the read path. Add RCU list to handle the readers. Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Acked-by: John Johansen <john.johansen@canonical.com>	2012-11-20 10:32:07 -08:00
Eric W. Biederman	4c44aaafa8	userns: Kill task_user_ns The task_user_ns function hides the fact that it is getting the user namespace from struct cred on the task. struct cred may go away as soon as the rcu lock is released. This leads to a race where we can dereference a stale user namespace pointer. To make it obvious a struct cred is involved kill task_user_ns. To kill the race modify the users of task_user_ns to only reference the user namespace while the rcu lock is held. Cc: Kees Cook <keescook@chromium.org> Cc: James Morris <james.l.morris@oracle.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2012-11-20 04:17:44 -08:00
Tejun Heo	92fb97487a	cgroup: rename ->create/post_create/pre_destroy/destroy() to ->css_alloc/online/offline/free() Rename cgroup_subsys css lifetime related callbacks to better describe what their roles are. Also, update documentation. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com>	2012-11-19 08:13:38 -08:00
Tejun Heo	4b1c7840b7	device_cgroup: add lockdep asserts device_cgroup uses RCU safe ->exceptions list which is write-protected by devcgroup_mutex and has had some issues using locking correctly. Add lockdep asserts to utility functions so that future errors can be easily detected. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Cc: Aristeu Rozanski <aris@redhat.com> Cc: Li Zefan <lizefan@huawei.com>	2012-11-06 12:28:04 -08:00
Tejun Heo	201e72acb2	device_cgroup: fix RCU usage dev_cgroup->exceptions is protected with devcgroup_mutex for writes and RCU for reads; however, RCU usage isn't correct. * dev_exception_clean() doesn't use RCU variant of list_del() and kfree(). The function can race with may_access() and may_access() may end up dereferencing already freed memory. Use list_del_rcu() and kfree_rcu() instead. * may_access() may be called only with RCU read locked but doesn't use RCU safe traversal over ->exceptions. Use list_for_each_entry_rcu(). Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Cc: stable@vger.kernel.org Cc: Aristeu Rozanski <aris@redhat.com> Cc: Li Zefan <lizefan@huawei.com>	2012-11-06 12:25:51 -08:00
Aristeu Rozanski	64e1047713	device_cgroup: fix unchecked cgroup parent usage In `4cef7299b4` ("device_cgroup: add proper checking when changing default behavior") the cgroup parent usage is unchecked. root will not have a parent and trying to use device.{allow,deny} will cause problems. For some reason my stressing scripts didn't test the root directory so I didn't catch it on my regular tests. Signed-off-by: Aristeu Rozanski <aris@redhat.com> Cc: Li Zefan <lizefan@huawei.com> Cc: James Morris <jmorris@namei.org> Cc: Pavel Emelyanov <xemul@openvz.org> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Tejun Heo <tj@kernel.org>	2012-11-06 07:25:20 -08:00
Jiri Kosina	3bd7bf1f0f	Merge branch 'master' into for-next Sync up with Linus' tree to be able to apply Cesar's patch against newer version of the code. Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2012-10-28 19:29:19 +01:00
Aristeu Rozanski	4cef7299b4	device_cgroup: add proper checking when changing default behavior Before changing a group's default behavior to ALLOW, we must check if its parent's behavior is also ALLOW. Signed-off-by: Aristeu Rozanski <aris@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: James Morris <jmorris@namei.org> Cc: Pavel Emelyanov <xemul@openvz.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Cc: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-25 14:37:52 -07:00
Aristeu Rozanski	26fd8405dd	device_cgroup: stop using simple_strtoul() Convert the code to use kstrtou32() instead of simple_strtoul() which is deprecated. The real size of the variables are u32, so use kstrtou32 instead of kstrtoul Signed-off-by: Aristeu Rozanski <aris@redhat.com> Cc: Dave Jones <davej@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: James Morris <jmorris@namei.org> Cc: Pavel Emelyanov <xemul@openvz.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Cc: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-25 14:37:52 -07:00
Aristeu Rozanski	5b7aa7d5bb	device_cgroup: rename deny_all to behavior This was done in a v2 patch but v1 ended up being committed. The variable name is less confusing and stores the default behavior when no matching exception exists. Signed-off-by: Aristeu Rozanski <aris@redhat.com> Cc: Dave Jones <davej@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: James Morris <jmorris@namei.org> Cc: Pavel Emelyanov <xemul@openvz.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Cc: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-25 14:37:52 -07:00
Jiri Slaby	8c9506d169	cgroup: fix invalid rcu dereference Commit `ad676077a2` ("device_cgroup: convert device_cgroup internally to policy + exceptions") removed rcu locks which are needed in task_devcgroup called in this chain: devcgroup_inode_mknod OR __devcgroup_inode_permission -> __devcgroup_inode_permission -> task_devcgroup -> task_subsys_state -> task_subsys_state_check. Change the code so that task_devcgroup is safely called with rcu read lock held. =============================== [ INFO: suspicious RCU usage. ] 3.6.0-rc5-next-20120913+ #42 Not tainted ------------------------------- include/linux/cgroup.h:553 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by kdevtmpfs/23: #0: (sb_writers){.+.+.+}, at: [<ffffffff8116873f>] mnt_want_write+0x1f/0x50 #1: (&sb->s_type->i_mutex_key#3/1){+.+.+.}, at: [<ffffffff811558af>] kern_path_create+0x7f/0x170 stack backtrace: Pid: 23, comm: kdevtmpfs Not tainted 3.6.0-rc5-next-20120913+ #42 Call Trace: lockdep_rcu_suspicious+0xfd/0x130 devcgroup_inode_mknod+0x19d/0x240 vfs_mknod+0x71/0xf0 handle_create.isra.2+0x72/0x200 devtmpfsd+0x114/0x140 ? handle_create.isra.2+0x200/0x200 kthread+0xd6/0xe0 kernel_thread_helper+0x4/0x10 Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Dave Jones <davej@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: James Morris <jmorris@namei.org> Cc: Pavel Emelyanov <xemul@openvz.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-25 14:37:52 -07:00
Alan Cox	b010520ab3	keys: Fix unreachable code We set ret to NULL then test it. Remove the bogus test Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2012-10-25 18:00:27 +02:00
John Johansen	2e680dd61e	apparmor: fix IRQ stack overflow during free_profile BugLink: http://bugs.launchpad.net/bugs/1056078 Profile replacement can cause long chains of profiles to build up when the profile being replaced is pinned. When the pinned profile is finally freed, it puts the reference to its replacement, which may in turn nest another call to free_profile on the stack. Because this may happen for each profile in the replacedby chain this can result in a recusion that causes the stack to overflow. Break this nesting by directly walking the chain of replacedby profiles (ie. use iteration instead of recursion to free the list). This results in at most 2 levels of free_profile being called, while freeing a replacedby chain. Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-10-25 02:12:50 +11:00
John Johansen	43c422eda9	apparmor: fix apparmor OOPS in audit_log_untrustedstring+0x1c/0x40 The capability defines have moved causing the auto generated names of capabilities that apparmor uses in logging to be incorrect. Fix the autogenerated table source to uapi/linux/capability.h Reported-by: YanHong <clouds.yan@gmail.com> Reported-by: Krzysztof Kolasa <kkolasa@winsoft.pl> Analyzed-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: John Johansen <john.johansen@canonical.com> Acked-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-17 16:29:46 -07:00
Al Viro	45525b26a4	fix a leak in replace_fd() users replace_fd() began with "eats a reference, tries to insert into descriptor table" semantics; at some point I'd switched it to much saner current behaviour ("try to insert into descriptor table, grabbing a new reference if inserted; caller should do fput() in any case"), but forgot to update the callers. Mea culpa... [Spotted by Pavel Roskin, who has really weird system with pipe-fed coredumps as part of what he considers a normal boot ;-)] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-16 13:36:50 -04:00
Linus Torvalds	d25282d1c9	Merge branch 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module signing support from Rusty Russell: "module signing is the highlight, but it's an all-over David Howells frenzy..." Hmm "Magrathea: Glacier signing key". Somebody has been reading too much HHGTTG. * 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (37 commits) X.509: Fix indefinite length element skip error handling X.509: Convert some printk calls to pr_devel asymmetric keys: fix printk format warning MODSIGN: Fix 32-bit overflow in X.509 certificate validity date checking MODSIGN: Make mrproper should remove generated files. MODSIGN: Use utf8 strings in signer's name in autogenerated X.509 certs MODSIGN: Use the same digest for the autogen key sig as for the module sig MODSIGN: Sign modules during the build process MODSIGN: Provide a script for generating a key ID from an X.509 cert MODSIGN: Implement module signature checking MODSIGN: Provide module signing public keys to the kernel MODSIGN: Automatically generate module signing keys if missing MODSIGN: Provide Kconfig options MODSIGN: Provide gitignore and make clean rules for extra files MODSIGN: Add FIPS policy module: signature checking hook X.509: Add a crypto key parser for binary (DER) X.509 certificates MPILIB: Provide a function to read raw data into an MPI X.509: Add an ASN.1 decoder X.509: Add simple ASN.1 grammar compiler ...	2012-10-14 13:39:34 -07:00
Al Viro	808d4e3cfd	consitify do_mount() arguments Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-11 20:02:04 -04:00
Linus Torvalds	9e2d8656f5	Merge branch 'akpm' (Andrew's patch-bomb) Merge patches from Andrew Morton: "A few misc things and very nearly all of the MM tree. A tremendous amount of stuff (again), including a significant rbtree library rework." * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (160 commits) sparc64: Support transparent huge pages. mm: thp: Use more portable PMD clearing sequenece in zap_huge_pmd(). mm: Add and use update_mmu_cache_pmd() in transparent huge page code. sparc64: Document PGD and PMD layout. sparc64: Eliminate PTE table memory wastage. sparc64: Halve the size of PTE tables sparc64: Only support 4MB huge pages and 8KB base pages. memory-hotplug: suppress "Trying to free nonexistent resource <XXXXXXXXXXXXXXXX-YYYYYYYYYYYYYYYY>" warning mm: memcg: clean up mm_match_cgroup() signature mm: document PageHuge somewhat mm: use %pK for /proc/vmallocinfo mm, thp: fix mlock statistics mm, thp: fix mapped pages avoiding unevictable list on mlock memory-hotplug: update memory block's state and notify userspace memory-hotplug: preparation to notify memory block's state at memory hot remove mm: avoid section mismatch warning for memblock_type_name make GFP_NOTRACK definition unconditional cma: decrease cc.nr_migratepages after reclaiming pagelist CMA: migrate mlocked pages kpageflags: fix wrong KPF_THP on non-huge compound pages ...	2012-10-09 16:23:15 +09:00
Konstantin Khlebnikov	314e51b985	mm: kill vma flag VM_RESERVED and mm->reserved_vm counter A long time ago, in v2.4, VM_RESERVED kept swapout process off VMA, currently it lost original meaning but still has some effects: \| effect \| alternative flags -+------------------------+--------------------------------------------- 1\| account as reserved_vm \| VM_IO 2\| skip in core dump \| VM_IO, VM_DONTDUMP 3\| do not merge or expand \| VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP 4\| do not mlock \| VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP This patch removes reserved_vm counter from mm_struct. Seems like nobody cares about it, it does not exported into userspace directly, it only reduces total_vm showed in proc. Thus VM_RESERVED can be replaced with VM_IO or pair VM_DONTEXPAND \| VM_DONTDUMP. remap_pfn_range() and io_remap_pfn_range() set VM_IO\|VM_DONTEXPAND\|VM_DONTDUMP. remap_vmalloc_range() set VM_DONTEXPAND \| VM_DONTDUMP. [akpm@linux-foundation.org: drivers/vfio/pci/vfio_pci.c fixup] Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Carsten Otte <cotte@de.ibm.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Eric Paris <eparis@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Morris <james.l.morris@oracle.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Kentaro Takeda <takedakn@nttdata.co.jp> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Venkatesh Pallipadi <venki@google.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-09 16:22:19 +09:00
Konstantin Khlebnikov	2dd8ad81e3	mm: use mm->exe_file instead of first VM_EXECUTABLE vma->vm_file Some security modules and oprofile still uses VM_EXECUTABLE for retrieving a task's executable file. After this patch they will use mm->exe_file directly. mm->exe_file is protected with mm->mmap_sem, so locking stays the same. Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Acked-by: Chris Metcalf <cmetcalf@tilera.com> [arch/tile] Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> [tomoyo] Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Carsten Otte <cotte@de.ibm.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Eric Paris <eparis@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Acked-by: James Morris <james.l.morris@oracle.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Kentaro Takeda <takedakn@nttdata.co.jp> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Venkatesh Pallipadi <venki@google.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-09 16:22:18 +09:00
Linus Torvalds	50e0d10232	This has three changes for asm-generic that did not really fit into any other branch as normal asm-generic changes do. One is a fix for a build warning, the other two are more interesting: * A patch from Mark Brown to allow using the common clock infrastructure on all architectures, so we can use the clock API in architecture independent device drivers. * The UAPI split patches from David Howells for the asm-generic files. There are other architecture specific series that are going through the arch maintainer tree and that depend on this one. There may be a few small merge conflicts between Mark's patch and the following arch header file split patches. In each case the solution will be to keep the new "generic-y += clkdev.h" line, even if it ends up being the only line in the Kbuild file. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIVAwUAUHLuO2CrR//JCVInAQLsKxAAoa+oSP3KGuQbLHq2wvUxAdXWDFcZgKo+ qMRejSJPI0sreJ9GJHpUjHtJ7W2gujeo9upmUIJzoWY9vrmjkhCDkaWliaQI8SmY CKB9zI2xCB9iFzHtWxocfnJzU7NvzjJm+jnIYrqkaO9HGMxL99tsv9TsBYXK/08j QmlGP5fHdGU3zZxVt5r1GL8/nfX4zn3/YEll9nJ7vqXZltIBbaksxmgPoa0QkkH8 LMeMAlgRR2DHWt58gXHyGB7Afx3QEnZBDaQpYxA446P+2gtvIhFYOnpuX14pZb7t m4IM0vOO6WzARQR6DJlRHfYJevojgGHu4Y8wkEzuWE+Hr2BqmiVct7UKqGJdqTY5 7+I7wwaJmdd3zE61LxRS9UOjJDwMh1gmsNU4+42RArQ5eLcikNR5zfYzDRLCTmnk qKZvbiaxgme2YvWazxbBT6EqmIVU6lfHHIoMLr8U0j40Cl0GCmN7EBbe7/r2Jhjs 6VnCOJ6vb4RCOJGGAcLRMQu7xEtqcCe0Zht839wl13QXewxS3QRgwg6Bjy/fwA9r jij5gf+R25J/fQW7yZv4LwcMowRE1xvpu0ebwkK3LLR8jcon71scd6f3PW/bUUpj j4tgFuJbXzOxQ4LFgBzvdVgx3wDzsQhqb/6p2l6ROdcw7xXFDdFZ4zq3h0A25wXZ J6WDO387tpg= =Aaki -----END PGP SIGNATURE----- Merge tag 'asm-generic' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic updates from Arnd Bergmann: "This has three changes for asm-generic that did not really fit into any other branch as normal asm-generic changes do. One is a fix for a build warning, the other two are more interesting: * A patch from Mark Brown to allow using the common clock infrastructure on all architectures, so we can use the clock API in architecture independent device drivers. * The UAPI split patches from David Howells for the asm-generic files. There are other architecture specific series that are going through the arch maintainer tree and that depend on this one. There may be a few small merge conflicts between Mark's patch and the following arch header file split patches. In each case the solution will be to keep the new "generic-y += clkdev.h" line, even if it ends up being the only line in the Kbuild file." * tag 'asm-generic' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: UAPI: (Scripted) Disintegrate include/asm-generic asm-generic: Add default clkdev.h asm-generic: xor: mark static functions as __maybe_unused	2012-10-09 15:58:38 +09:00
David Howells	cf7f601c06	KEYS: Add payload preparsing opportunity prior to key instantiate or update Give the key type the opportunity to preparse the payload prior to the instantiation and update routines being called. This is done with the provision of two new key type operations: int (preparse)(struct key_preparsed_payload prep); void (free_preparse)(struct key_preparsed_payload prep); If the first operation is present, then it is called before key creation (in the add/update case) or before the key semaphore is taken (in the update and instantiate cases). The second operation is called to clean up if the first was called. preparse() is given the opportunity to fill in the following structure: struct key_preparsed_payload { char description; void type_data[2]; void payload; const void data; size_t datalen; size_t quotalen; }; Before the preparser is called, the first three fields will have been cleared, the payload pointer and size will be stored in data and datalen and the default quota size from the key_type struct will be stored into quotalen. The preparser may parse the payload in any way it likes and may store data in the type_data[] and payload fields for use by the instantiate() and update() ops. The preparser may also propose a description for the key by attaching it as a string to the description field. This can be used by passing a NULL or "" description to the add_key() system call or the key_create_or_update() function. This cannot work with request_key() as that required the description to tell the upcall about the key to be created. This, for example permits keys that store PGP public keys to generate their own name from the user ID and public key fingerprint in the key. The instantiate() and update() operations are then modified to look like this: int (instantiate)(struct key key, struct key_preparsed_payload prep); int (update)(struct key key, struct key_preparsed_payload prep); and the new payload data is passed in *prep, whether or not it was preparsed. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-10-08 13:49:48 +10:30
Linus Torvalds	638c87a916	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull IMA bugfix (security subsystem) from James Morris. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: ima: fix bug in argument order	2012-10-07 21:07:21 +09:00
Aristeu Rozanski	db9aeca97a	device_cgroup: rename whitelist to exception list This patch replaces the "whitelist" usage in the code and comments and replace them by exception list related information. Signed-off-by: Aristeu Rozanski <aris@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: James Morris <jmorris@namei.org> Cc: Pavel Emelyanov <xemul@openvz.org> Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-06 03:05:14 +09:00
Aristeu Rozanski	ad676077a2	device_cgroup: convert device_cgroup internally to policy + exceptions The original model of device_cgroup is having a whitelist where all the allowed devices are listed. The problem with this approach is that is impossible to have the case of allowing everything but few devices. The reason for that lies in the way the whitelist is handled internally: since there's only a whitelist, the "all devices" entry would have to be removed and replaced by the entire list of possible devices but the ones that are being denied. Since dev_t is 32 bits long, representing the allowed devices as a bitfield is not memory efficient. This patch replaces the "whitelist" by a "exceptions" list and the default policy is kept as "deny_all" variable in dev_cgroup structure. The current interface determines that whenever "a" is written to devices.allow or devices.deny, the entry masking all devices will be added or removed, respectively. This behavior is kept and it's what will determine the default policy: # cat devices.list a : rwm # echo a >devices.deny # cat devices.list # echo a >devices.allow # cat devices.list a : rwm The interface is also preserved. For example, if one wants to block only access to /dev/null: # ls -l /dev/null crw-rw-rw- 1 root root 1, 3 Jul 24 16:17 /dev/null # echo a >devices.allow # echo "c 1:3 rwm" >devices.deny # cat /dev/null cat: /dev/null: Operation not permitted # echo >/dev/null bash: /dev/null: Operation not permitted mknod /tmp/null c 1 3 mknod: `/tmp/null': Operation not permitted # echo "c 1:3 r" >devices.allow # cat /dev/null # echo >/dev/null bash: /dev/null: Operation not permitted mknod /tmp/null c 1 3 mknod: `/tmp/null': Operation not permitted # echo "c 1:3 rw" >devices.allow # echo >/dev/null # cat /dev/null # mknod /tmp/null c 1 3 mknod: `/tmp/null': Operation not permitted # echo "c 1:3 rwm" >devices.allow # echo >/dev/null # cat /dev/null # mknod /tmp/null c 1 3 # Note that I didn't rename the functions/variables in this patch, but in the next one to make reviewing easier. Signed-off-by: Aristeu Rozanski <aris@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: James Morris <jmorris@namei.org> Cc: Pavel Emelyanov <xemul@openvz.org> Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-06 03:05:14 +09:00
Aristeu Rozanski	868539a3b6	device_cgroup: introduce dev_whitelist_clean() This function cleans all the items in a whitelist and will be used by the next patches. Signed-off-by: Aristeu Rozanski <aris@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: James Morris <jmorris@namei.org> Cc: Pavel Emelyanov <xemul@openvz.org> Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-06 03:05:14 +09:00
Aristeu Rozanski	66b8ef6775	device_cgroup: add "deny_all" in dev_cgroup structure deny_all will determine if the default policy is to deny all device access unless for the ones in the exception list. This variable will be used in the next patches to convert device_cgroup internally into a default policy + rules. Signed-off-by: Aristeu Rozanski <aris@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: James Morris <jmorris@namei.org> Cc: Pavel Emelyanov <xemul@openvz.org> Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-06 03:05:13 +09:00
Dmitry Kasatkin	d26e193622	ima: fix bug in argument order mask argument goes first, then func, like ima_must_measure and ima_get_action. ima_inode_post_setattr() assumes that. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-10-05 22:32:16 +10:00
David Howells	8a1ab3155c	UAPI: (Scripted) Disintegrate include/asm-generic Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Michael Kerrisk <mtk.manpages@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>	2012-10-04 18:20:15 +01:00
Linus Torvalds	88265322c1	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "Highlights: - Integrity: add local fs integrity verification to detect offline attacks - Integrity: add digital signature verification - Simple stacking of Yama with other LSMs (per LSS discussions) - IBM vTPM support on ppc64 - Add new driver for Infineon I2C TIS TPM - Smack: add rule revocation for subject labels" Fixed conflicts with the user namespace support in kernel/auditsc.c and security/integrity/ima/ima_policy.c. * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (39 commits) Documentation: Update git repository URL for Smack userland tools ima: change flags container data type Smack: setprocattr memory leak fix Smack: implement revoking all rules for a subject label Smack: remove task_wait() hook. ima: audit log hashes ima: generic IMA action flag handling ima: rename ima_must_appraise_or_measure audit: export audit_log_task_info tpm: fix tpm_acpi sparse warning on different address spaces samples/seccomp: fix 31 bit build on s390 ima: digital signature verification support ima: add support for different security.ima data types ima: add ima_inode_setxattr/removexattr function and calls ima: add inode_post_setattr call ima: replace iint spinblock with rwlock/read_lock ima: allocating iint improvements ima: add appraise action keywords and default rules ima: integrity appraisal extension vfs: move ima_file_free before releasing the file ...	2012-10-02 21:38:48 -07:00
Linus Torvalds	aab174f0df	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs update from Al Viro: - big one - consolidation of descriptor-related logics; almost all of that is moved to fs/file.c (BTW, I'm seriously tempted to rename the result to fd.c. As it is, we have a situation when file_table.c is about handling of struct file and file.c is about handling of descriptor tables; the reasons are historical - file_table.c used to be about a static array of struct file we used to have way back). A lot of stray ends got cleaned up and converted to saner primitives, disgusting mess in android/binder.c is still disgusting, but at least doesn't poke so much in descriptor table guts anymore. A bunch of relatively minor races got fixed in process, plus an ext4 struct file leak. - related thing - fget_light() partially unuglified; see fdget() in there (and yes, it generates the code as good as we used to have). - also related - bits of Cyrill's procfs stuff that got entangled into that work; _not_ all of it, just the initial move to fs/proc/fd.c and switch of fdinfo to seq_file. - Alex's fs/coredump.c spiltoff - the same story, had been easier to take that commit than mess with conflicts. The rest is a separate pile, this was just a mechanical code movement. - a few misc patches all over the place. Not all for this cycle, there'll be more (and quite a few currently sit in akpm's tree)." Fix up trivial conflicts in the android binder driver, and some fairly simple conflicts due to two different changes to the sock_alloc_file() interface ("take descriptor handling from sock_alloc_file() to callers" vs "net: Providing protocol type via system.sockprotoname xattr of /proc/PID/fd entries" adding a dentry name to the socket) * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits) MAX_LFS_FILESIZE should be a loff_t compat: fs: Generic compat_sys_sendfile implementation fs: push rcu_barrier() from deactivate_locked_super() to filesystems btrfs: reada_extent doesn't need kref for refcount coredump: move core dump functionality into its own file coredump: prevent double-free on an error path in core dumper usb/gadget: fix misannotations fcntl: fix misannotations ceph: don't abuse d_delete() on failure exits hypfs: ->d_parent is never NULL or negative vfs: delete surplus inode NULL check switch simple cases of fget_light to fdget new helpers: fdget()/fdput() switch o2hb_region_dev_write() to fget_light() proc_map_files_readdir(): don't bother with grabbing files make get_file() return its argument vhost_set_vring(): turn pollstart/pollstop into bool switch prctl_set_mm_exe_file() to fget_light() switch xfs_find_handle() to fget_light() switch xfs_swapext() to fget_light() ...	2012-10-02 20:25:04 -07:00
Linus Torvalds	aecdc33e11	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking changes from David Miller: 1) GRE now works over ipv6, from Dmitry Kozlov. 2) Make SCTP more network namespace aware, from Eric Biederman. 3) TEAM driver now works with non-ethernet devices, from Jiri Pirko. 4) Make openvswitch network namespace aware, from Pravin B Shelar. 5) IPV6 NAT implementation, from Patrick McHardy. 6) Server side support for TCP Fast Open, from Jerry Chu and others. 7) Packet BPF filter supports MOD and XOR, from Eric Dumazet and Daniel Borkmann. 8) Increate the loopback default MTU to 64K, from Eric Dumazet. 9) Use a per-task rather than per-socket page fragment allocator for outgoing networking traffic. This benefits processes that have very many mostly idle sockets, which is quite common. From Eric Dumazet. 10) Use up to 32K for page fragment allocations, with fallbacks to smaller sizes when higher order page allocations fail. Benefits are a) less segments for driver to process b) less calls to page allocator c) less waste of space. From Eric Dumazet. 11) Allow GRO to be used on GRE tunnels, from Eric Dumazet. 12) VXLAN device driver, one way to handle VLAN issues such as the limitation of 4096 VLAN IDs yet still have some level of isolation. From Stephen Hemminger. 13) As usual there is a large boatload of driver changes, with the scale perhaps tilted towards the wireless side this time around. Fix up various fairly trivial conflicts, mostly caused by the user namespace changes. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1012 commits) hyperv: Add buffer for extended info after the RNDIS response message. hyperv: Report actual status in receive completion packet hyperv: Remove extra allocated space for recv_pkt_list elements hyperv: Fix page buffer handling in rndis_filter_send_request() hyperv: Fix the missing return value in rndis_filter_set_packet_filter() hyperv: Fix the max_xfer_size in RNDIS initialization vxlan: put UDP socket in correct namespace vxlan: Depend on CONFIG_INET sfc: Fix the reported priorities of different filter types sfc: Remove EFX_FILTER_FLAG_RX_OVERRIDE_IP sfc: Fix loopback self-test with separate_tx_channels=1 sfc: Fix MCDI structure field lookup sfc: Add parentheses around use of bitfield macro arguments sfc: Fix null function pointer in efx_sriov_channel_type vxlan: virtual extensible lan igmp: export symbol ip_mc_leave_group netlink: add attributes to fdb interface tg3: unconditionally select HWMON support when tg3 is enabled. Revert "net: ti cpsw ethernet: allow reading phy interface mode from DT" gre: fix sparse warning ...	2012-10-02 13:38:27 -07:00
David Howells	4442d7704c	Merge branch 'modsign-keys-devel' into security-next-keys Signed-off-by: David Howells <dhowells@redhat.com>	2012-10-02 19:30:19 +01:00
David Howells	f8aa23a55f	KEYS: Use keyring_alloc() to create special keyrings Use keyring_alloc() to create special keyrings now that it has a permissions parameter rather than using key_alloc() + key_instantiate_and_link(). Also document and export keyring_alloc() so that modules can use it too. Signed-off-by: David Howells <dhowells@redhat.com>	2012-10-02 19:24:56 +01:00
David Howells	96b5c8fea6	KEYS: Reduce initial permissions on keys Reduce the initial permissions on new keys to grant the possessor everything, view permission only to the user (so the keys can be seen in /proc/keys) and nothing else. This gives the creator a chance to adjust the permissions mask before other processes can access the new key or create a link to it. To aid with this, keyring_alloc() now takes a permission argument rather than setting the permissions itself. The following permissions are now set: (1) The user and user-session keyrings grant the user that owns them full permissions and grant a possessor everything bar SETATTR. (2) The process and thread keyrings grant the possessor full permissions but only grant the user VIEW. This permits the user to see them in /proc/keys, but not to do anything with them. (3) Anonymous session keyrings grant the possessor full permissions, but only grant the user VIEW and READ. This means that the user can see them in /proc/keys and can list them, but nothing else. Possibly READ shouldn't be provided either. (4) Named session keyrings grant everything an anonymous session keyring does, plus they grant the user LINK permission. The whole point of named session keyrings is that others can also subscribe to them. Possibly this should be a separate permission to LINK. (5) The temporary session keyring created by call_sbin_request_key() gets the same permissions as an anonymous session keyring. (6) Keys created by add_key() get VIEW, SEARCH, LINK and SETATTR for the possessor, plus READ and/or WRITE if the key type supports them. The used only gets VIEW now. (7) Keys created by request_key() now get the same as those created by add_key(). Reported-by: Lennart Poettering <lennart@poettering.net> Reported-by: Stef Walter <stefw@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com>	2012-10-02 19:24:56 +01:00
David Howells	3a50597de8	KEYS: Make the session and process keyrings per-thread Make the session keyring per-thread rather than per-process, but still inherited from the parent thread to solve a problem with PAM and gdm. The problem is that join_session_keyring() will reject attempts to change the session keyring of a multithreaded program but gdm is now multithreaded before it gets to the point of starting PAM and running pam_keyinit to create the session keyring. See: https://bugs.freedesktop.org/show_bug.cgi?id=49211 The reason that join_session_keyring() will only change the session keyring under a single-threaded environment is that it's hard to alter the other thread's credentials to effect the change in a multi-threaded program. The problems are such as: (1) How to prevent two threads both running join_session_keyring() from racing. (2) Another thread's credentials may not be modified directly by this process. (3) The number of threads is uncertain whilst we're not holding the appropriate spinlock, making preallocation slightly tricky. (4) We could use TIF_NOTIFY_RESUME and key_replace_session_keyring() to get another thread to replace its keyring, but that means preallocating for each thread. A reasonable way around this is to make the session keyring per-thread rather than per-process and just document that if you want a common session keyring, you must get it before you spawn any threads - which is the current situation anyway. Whilst we're at it, we can the process keyring behave in the same way. This means we can clean up some of the ickyness in the creds code. Basically, after this patch, the session, process and thread keyrings are about inheritance rules only and not about sharing changes of keyring. Reported-by: Mantas M. <grawity@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Ray Strode <rstrode@redhat.com>	2012-10-02 19:24:29 +01:00
Linus Torvalds	437589a74b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace changes from Eric Biederman: "This is a mostly modest set of changes to enable basic user namespace support. This allows the code to code to compile with user namespaces enabled and removes the assumption there is only the initial user namespace. Everything is converted except for the most complex of the filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs, nfs, ocfs2 and xfs as those patches need a bit more review. The strategy is to push kuid_t and kgid_t values are far down into subsystems and filesystems as reasonable. Leaving the make_kuid and from_kuid operations to happen at the edge of userspace, as the values come off the disk, and as the values come in from the network. Letting compile type incompatible compile errors (present when user namespaces are enabled) guide me to find the issues. The most tricky areas have been the places where we had an implicit union of uid and gid values and were storing them in an unsigned int. Those places were converted into explicit unions. I made certain to handle those places with simple trivial patches. Out of that work I discovered we have generic interfaces for storing quota by projid. I had never heard of the project identifiers before. Adding full user namespace support for project identifiers accounts for most of the code size growth in my git tree. Ultimately there will be work to relax privlige checks from "capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe allowing root in a user names to do those things that today we only forbid to non-root users because it will confuse suid root applications. While I was pushing kuid_t and kgid_t changes deep into the audit code I made a few other cleanups. I capitalized on the fact we process netlink messages in the context of the message sender. I removed usage of NETLINK_CRED, and started directly using current->tty. Some of these patches have also made it into maintainer trees, with no problems from identical code from different trees showing up in linux-next. After reading through all of this code I feel like I might be able to win a game of kernel trivial pursuit." Fix up some fairly trivial conflicts in netfilter uid/git logging code. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits) userns: Convert the ufs filesystem to use kuid/kgid where appropriate userns: Convert the udf filesystem to use kuid/kgid where appropriate userns: Convert ubifs to use kuid/kgid userns: Convert squashfs to use kuid/kgid where appropriate userns: Convert reiserfs to use kuid and kgid where appropriate userns: Convert jfs to use kuid/kgid where appropriate userns: Convert jffs2 to use kuid and kgid where appropriate userns: Convert hpfs to use kuid and kgid where appropriate userns: Convert btrfs to use kuid/kgid where appropriate userns: Convert bfs to use kuid/kgid where appropriate userns: Convert affs to use kuid/kgid wherwe appropriate userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids userns: On ia64 deal with current_uid and current_gid being kuid and kgid userns: On ppc convert current_uid from a kuid before printing. userns: Convert s390 getting uid and gid system calls to use kuid and kgid userns: Convert s390 hypfs to use kuid and kgid where appropriate userns: Convert binder ipc to use kuids userns: Teach security_path_chown to take kuids and kgids userns: Add user namespace support to IMA userns: Convert EVM to deal with kuids and kgids in it's hmac computation ...	2012-10-02 11:11:09 -07:00
Linus Torvalds	68d47a137c	Merge branch 'for-3.7-hierarchy' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup hierarchy update from Tejun Heo: "Currently, different cgroup subsystems handle nested cgroups completely differently. There's no consistency among subsystems and the behaviors often are outright broken. People at least seem to agree that the broken hierarhcy behaviors need to be weeded out if any progress is gonna be made on this front and that the fallouts from deprecating the broken behaviors should be acceptable especially given that the current behaviors don't make much sense when nested. This patch makes cgroup emit warning messages if cgroups for subsystems with broken hierarchy behavior are nested to prepare for fixing them in the future. This was put in a separate branch because more related changes were expected (didn't make it this round) and the memory cgroup wanted to pull in this and make changes on top." * 'for-3.7-hierarchy' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: mark subsystems with broken hierarchy support and whine if cgroups are nested for them	2012-10-02 10:52:28 -07:00
Linus Torvalds	033d9959ed	Merge branch 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue changes from Tejun Heo: "This is workqueue updates for v3.7-rc1. A lot of activities this round including considerable API and behavior cleanups. * delayed_work combines a timer and a work item. The handling of the timer part has always been a bit clunky leading to confusing cancelation API with weird corner-case behaviors. delayed_work is updated to use new IRQ safe timer and cancelation now works as expected. * Another deficiency of delayed_work was lack of the counterpart of mod_timer() which led to cancel+queue combinations or open-coded timer+work usages. mod_delayed_work[_on]() are added. These two delayed_work changes make delayed_work provide interface and behave like timer which is executed with process context. * A work item could be executed concurrently on multiple CPUs, which is rather unintuitive and made flush_work() behavior confusing and half-broken under certain circumstances. This problem doesn't exist for non-reentrant workqueues. While non-reentrancy check isn't free, the overhead is incurred only when a work item bounces across different CPUs and even in simulated pathological scenario the overhead isn't too high. All workqueues are made non-reentrant. This removes the distinction between flush_[delayed_]work() and flush_[delayed_]_work_sync(). The former is now as strong as the latter and the specified work item is guaranteed to have finished execution of any previous queueing on return. * In addition to the various bug fixes, Lai redid and simplified CPU hotplug handling significantly. * Joonsoo introduced system_highpri_wq and used it during CPU hotplug. There are two merge commits - one to pull in IRQ safe timer from tip/timers/core and the other to pull in CPU hotplug fixes from wq/for-3.6-fixes as Lai's hotplug restructuring depended on them." Fixed a number of trivial conflicts, but the more interesting conflicts were silent ones where the deprecated interfaces had been used by new code in the merge window, and thus didn't cause any real data conflicts. Tejun pointed out a few of them, I fixed a couple more. * 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (46 commits) workqueue: remove spurious WARN_ON_ONCE(in_irq()) from try_to_grab_pending() workqueue: use cwq_set_max_active() helper for workqueue_set_max_active() workqueue: introduce cwq_set_max_active() helper for thaw_workqueues() workqueue: remove @delayed from cwq_dec_nr_in_flight() workqueue: fix possible stall on try_to_grab_pending() of a delayed work item workqueue: use hotcpu_notifier() for workqueue_cpu_down_callback() workqueue: use __cpuinit instead of __devinit for cpu callbacks workqueue: rename manager_mutex to assoc_mutex workqueue: WORKER_REBIND is no longer necessary for idle rebinding workqueue: WORKER_REBIND is no longer necessary for busy rebinding workqueue: reimplement idle worker rebinding workqueue: deprecate __cancel_delayed_work() workqueue: reimplement cancel_delayed_work() using try_to_grab_pending() workqueue: use mod_delayed_work() instead of __cancel + queue workqueue: use irqsafe timer for delayed_work workqueue: clean up delayed_work initializers and add missing one workqueue: make deferrable delayed_work initializer names consistent workqueue: cosmetic whitespace updates for macro definitions workqueue: deprecate system_nrt[_freezable]_wq workqueue: deprecate flush[_delayed]_work_sync() ...	2012-10-02 09:54:49 -07:00
Linus Torvalds	94095a1fff	Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core kernel fixes from Ingo Molnar: "This is a complex task_work series from Oleg that fixes the bug that this VFS commit tried to fix: `d35abdb288` hold task_lock around checks in keyctl but solves the problem without the lockup regression that `d35abdb288` introduced in v3.6. This series came late in v3.6 and I did not feel confident about it so late in the cycle. Might be worth backporting to -stable if it proves itself upstream." * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: task_work: Simplify the usage in ptrace_notify() and get_signal_to_deliver() task_work: Revert "hold task_lock around checks in keyctl" task_work: task_work_add() should not succeed after exit_task_work() task_work: Make task_work_add() lockless	2012-10-01 10:25:54 -07:00
Linus Torvalds	99dbb1632f	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull the trivial tree from Jiri Kosina: "Tiny usual fixes all over the place" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits) doc: fix old config name of kprobetrace fs/fs-writeback.c: cleanup riteback_sb_inodes kerneldoc btrfs: fix the commment for the action flags in delayed-ref.h btrfs: fix trivial typo for the comment of BTRFS_FREE_INO_OBJECTID vfs: fix kerneldoc for generic_fh_to_parent() treewide: fix comment/printk/variable typos ipr: fix small coding style issues doc: fix broken utf8 encoding nfs: comment fix platform/x86: fix asus_laptop.wled_type module parameter mfd: printk/comment fixes doc: getdelays.c: remember to close() socket on error in create_nl_socket() doc: aliasing-test: close fd on write error mmc: fix comment typos dma: fix comments spi: fix comment/printk typos in spi Coccinelle: fix typo in memdup_user.cocci tmiofb: missing NULL pointer checks tools: perf: Fix typo in tools/perf tools/testing: fix comment / output typos ...	2012-10-01 09:06:36 -07:00
David S. Miller	6a06e5e1bb	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/team/team.c drivers/net/usb/qmi_wwan.c net/batman-adv/bat_iv_ogm.c net/ipv4/fib_frontend.c net/ipv4/route.c net/l2tp/l2tp_netlink.c The team, fib_frontend, route, and l2tp_netlink conflicts were simply overlapping changes. qmi_wwan and bat_iv_ogm were of the "use HEAD" variety. With help from Antonio Quartulli. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-28 14:40:49 -04:00
Alan Cox	a84a921978	key: Fix resource leak On an error iov may still have been reallocated and need freeing Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: David Howells <dhowells@redhat.com>	2012-09-28 12:20:02 +01:00
Alan Cox	631527703d	keys: Fix unreachable code We set ret to NULL then test it. Remove the bogus test Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: David Howells <dhowells@redhat.com>	2012-09-28 12:20:02 +01:00
James Morris	bf53083445	Linux 3.6-rc7 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQEcBAABAgAGBQJQX7MuAAoJEHm+PkMAQRiG0h0IAJURkrMCAQUxA+Ik66ReH89s LQcVd0U9uL4UUOi7f5WR64Vf9Cfu6VVGX9ZKSvjpNskvlQaUQPMIt4pMe6g4X4dI u0bApEy4XZz3nGabUAghIU8jJ8cDmhCG6kPpSiS7pi7KHc0yIa4WFtJRrIpGaIWT xuK38YOiOHcSDRlLyWZzainMncQp/ixJdxnqVMTonkVLk0q0b84XzOr4/qlLE5lU i+TsK3PRKdQXgvZ4CebL+srPBwWX1dmgP3VkeBloQbSSenSeELICbFWavn2ml+sF GXi4dO93oNquL/Oy5SwI666T4uNcrRPaS+5X+xSZgBW/y2aQVJVJuNZg6ZP/uWk= =0v2l -----END PGP SIGNATURE----- Merge tag 'v3.6-rc7' into next Linux 3.6-rc7 Requested by David Howells so he can merge his key susbsystem work into my tree with requisite -linus changesets.	2012-09-28 13:37:32 +10:00
Al Viro	cb0942b812	make get_file() return its argument simplifies a bunch of callers... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-26 21:10:25 -04:00
Al Viro	c3c073f808	new helper: iterate_fd() iterates through the opened files in given descriptor table, calling a supplied function; we stop once non-zero is returned. Callback gets struct file , descriptor number and const void argument passed to iterator. It is called with files->file_lock held, so it is not allowed to block. tty_io, netprio_cgroup and selinux flush_unauthorized_files() converted to its use. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-26 21:09:59 -04:00
Al Viro	ee97cd872d	switch flush_unauthorized_files() to replace_fd() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-26 21:09:58 -04:00
Eric W. Biederman	d2b31ca644	userns: Teach security_path_chown to take kuids and kgids Don't make the security modules deal with raw user space uid and gids instead pass in a kuid_t and a kgid_t so that security modules only have to deal with internal kernel uids and gids. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: James Morris <james.l.morris@oracle.com> Cc: John Johansen <john.johansen@canonical.com> Cc: Kentaro Takeda <takedakn@nttdata.co.jp> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:25 -07:00
Eric W. Biederman	8b94eea4bf	userns: Add user namespace support to IMA Use kuid's in the IMA rules. When reporting the current uid in audit logs use from_kuid to get a usable value. Cc: Mimi Zohar <zohar@us.ibm.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:24 -07:00
Eric W. Biederman	cf9c93526f	userns: Convert EVM to deal with kuids and kgids in it's hmac computation Cc: Mimi Zohar <zohar@us.ibm.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:24 -07:00
Eric W. Biederman	581abc09c2	userns: Convert selinux to use kuid and kgid where appropriate Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <james.l.morris@oracle.com> Cc: Eric Paris <eparis@parisplace.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2012-09-21 03:13:22 -07:00
Eric W. Biederman	609fcd1b3a	userns: Convert tomoyo to use kuid and kgid where appropriate Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:22 -07:00
Eric W. Biederman	2db8145293	userns: Convert apparmor to use kuid and kgid where appropriate Cc: John Johansen <john.johansen@canonical.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:21 -07:00
Dmitry Kasatkin	0a72ba7aff	ima: change flags container data type IMA audit hashes patches introduced new IMA flags and required space went beyond 8 bits. Currently the only flag is IMA_DIGSIG. This patch use 16 bit short instead of 8 bit char. Without this fix IMA signature will be replaced with hash, which should not happen. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2012-09-19 08:55:20 -04:00
Nicolas Dichtel	ee8372dd19	xfrm: invalidate dst on policy insertion/deletion When a policy is inserted or deleted, all dst should be recalculated. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 15:57:03 -04:00
Casey Schaufler	46a2f3b9e9	Smack: setprocattr memory leak fix The data structure allocations being done in prepare_creds are duplicated in smack_setprocattr. This results in the structure allocated in prepare_creds being orphaned and never freed. The duplicate code is removed from smack_setprocattr. Targeted for git://git.gitorious.org/smack-next/kernel.git Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2012-09-18 09:51:06 -07:00
Rafal Krypa	449543b043	Smack: implement revoking all rules for a subject label Add /smack/revoke-subject special file. Writing a SMACK label to this file will set the access to '-' for all access rules with that subject label. Targeted for git://git.gitorious.org/smack-next/kernel.git Signed-off-by: Rafal Krypa <r.krypa@samsung.com>	2012-09-18 09:50:52 -07:00
Casey Schaufler	c00bedb368	Smack: remove task_wait() hook. On 12/20/2011 11:20 PM, Jarkko Sakkinen wrote: > Allow SIGCHLD to be passed to child process without > explicit policy. This will help to keep the access > control policy simple and easily maintainable with > complex applications that require use of multiple > security contexts. It will also help to keep them > as isolated as possible. > > Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com> I have a slightly different version that applies to the current smack-next tree. Allow SIGCHLD to be passed to child process without explicit policy. This will help to keep the access control policy simple and easily maintainable with complex applications that require use of multiple security contexts. It will also help to keep them as isolated as possible. Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> security/smack/smack_lsm.c \| 37 ++++++++----------------------------- 1 files changed, 8 insertions(+), 29 deletions(-)	2012-09-18 09:50:37 -07:00
Tejun Heo	8c7f6edbda	cgroup: mark subsystems with broken hierarchy support and whine if cgroups are nested for them Currently, cgroup hierarchy support is a mess. cpu related subsystems behave correctly - configuration, accounting and control on a parent properly cover its children. blkio and freezer completely ignore hierarchy and treat all cgroups as if they're directly under the root cgroup. Others show yet different behaviors. These differing interpretations of cgroup hierarchy make using cgroup confusing and it impossible to co-mount controllers into the same hierarchy and obtain sane behavior. Eventually, we want full hierarchy support from all subsystems and probably a unified hierarchy. Users using separate hierarchies expecting completely different behaviors depending on the mounted subsystem is deterimental to making any progress on this front. This patch adds cgroup_subsys.broken_hierarchy and sets it to %true for controllers which are lacking in hierarchy support. The goal of this patch is two-fold. * Move users away from using hierarchy on currently non-hierarchical subsystems, so that implementing proper hierarchy support on those doesn't surprise them. * Keep track of which controllers are broken how and nudge the subsystems to implement proper hierarchy support. For now, start with a single warning message. We can whine louder later on. v2: Fixed a typo spotted by Michal. Warning message updated. v3: Updated memcg part so that it doesn't generate warning in the cases where .use_hierarchy=false doesn't make the behavior different from root.use_hierarchy=true. Fixed a typo spotted by Glauber. v4: Check ->broken_hierarchy after cgroup creation is complete so that ->create() can affect the result per Michal. Dropped unnecessary memcg root handling per Michal. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Glauber Costa <glommer@parallels.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Turner <pjt@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Thomas Graf <tgraf@suug.ch> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2012-09-14 12:01:16 -07:00
Eric W. Biederman	9a56c2db49	userns: Convert security/keys to the new userns infrastructure - Replace key_user ->user_ns equality checks with kuid_has_mapping checks. - Use from_kuid to generate key descriptions - Use kuid_t and kgid_t and the associated helpers instead of uid_t and gid_t - Avoid potential problems with file descriptor passing by displaying keys in the user namespace of the opener of key status proc files. Cc: linux-security-module@vger.kernel.org Cc: keyrings@linux-nfs.org Cc: David Howells <dhowells@redhat.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-13 18:28:02 -07:00
Peter Moody	e7c568e0fd	ima: audit log hashes This adds an 'audit' policy action which audit logs file measurements. Changelog v6: - use new action flag handling (Dmitry Kasatkin). - removed whitespace (Mimi) Changelog v5: - use audit_log_untrustedstring. Changelog v4: - cleanup digest -> hash conversion. - use filename rather than d_path in ima_audit_measurement. Changelog v3: - Use newly exported audit_log_task_info for logging pid/ppid/uid/etc. - Update the ima_policy ABI documentation. Changelog v2: - Use 'audit' action rather than 'measure_and_audit' to permit auditing in the absence of measuring.. Changelog v1: - Initial posting. Signed-off-by: Peter Moody <pmoody@google.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2012-09-13 14:48:44 -04:00
Dmitry Kasatkin	45e2472e67	ima: generic IMA action flag handling Make the IMA action flag handling generic in order to support additional new actions, without requiring changes to the base implementation. New actions, like audit logging, will only need to modify the define statements. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2012-09-13 14:23:57 -04:00
Oleg Nesterov	b3f68f16db	task_work: Revert "hold task_lock around checks in keyctl" This reverts commit `d35abdb288`. task_lock() was added to ensure exit_mm() and thus exit_task_work() is not possible before task_work_add(). This is wrong, task_lock() must not be nested with write_lock(tasklist). And this is no longer needed, task_work_add() now fails if it is called after exit_task_work(). Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20120826191214.GA4231@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-09-13 16:47:36 +02:00
David Howells	d4f65b5d24	KEYS: Add payload preparsing opportunity prior to key instantiate or update Give the key type the opportunity to preparse the payload prior to the instantiation and update routines being called. This is done with the provision of two new key type operations: int (preparse)(struct key_preparsed_payload prep); void (free_preparse)(struct key_preparsed_payload prep); If the first operation is present, then it is called before key creation (in the add/update case) or before the key semaphore is taken (in the update and instantiate cases). The second operation is called to clean up if the first was called. preparse() is given the opportunity to fill in the following structure: struct key_preparsed_payload { char description; void type_data[2]; void payload; const void data; size_t datalen; size_t quotalen; }; Before the preparser is called, the first three fields will have been cleared, the payload pointer and size will be stored in data and datalen and the default quota size from the key_type struct will be stored into quotalen. The preparser may parse the payload in any way it likes and may store data in the type_data[] and payload fields for use by the instantiate() and update() ops. The preparser may also propose a description for the key by attaching it as a string to the description field. This can be used by passing a NULL or "" description to the add_key() system call or the key_create_or_update() function. This cannot work with request_key() as that required the description to tell the upcall about the key to be created. This, for example permits keys that store PGP public keys to generate their own name from the user ID and public key fingerprint in the key. The instantiate() and update() operations are then modified to look like this: int (instantiate)(struct key key, struct key_preparsed_payload prep); int (update)(struct key key, struct key_preparsed_payload prep); and the new payload data is passed in *prep, whether or not it was preparsed. Signed-off-by: David Howells <dhowells@redhat.com>	2012-09-13 13:06:29 +01:00
Dmitry Kasatkin	d9d300cdb6	ima: rename ima_must_appraise_or_measure When AUDIT action support is added to the IMA, ima_must_appraise_or_measure() does not reflect the real meaning anymore. Rename it to ima_get_action(). Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2012-09-12 07:28:05 -04:00
Pablo Neira Ayuso	9f00d9776b	netlink: hide struct module parameter in netlink_kernel_create This patch defines netlink_kernel_create as a wrapper function of __netlink_kernel_create to hide the struct module *me parameter (which seems to be THIS_MODULE in all existing netlink subsystems). Suggested by David S. Miller. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-08 18:46:30 -04:00
Pablo Neira Ayuso	9785e10aed	netlink: kill netlink_set_nonroot Replace netlink_set_nonroot by one new field `flags' in struct netlink_kernel_cfg that is passed to netlink_kernel_create. This patch also renames NL_NONROOT_* to NL_CFG_F_NONROOT_* since now the flags field in nl_table is generic (so we can add more flags if needed in the future). Also adjust all callers in the net-next tree to use these flags instead of netlink_set_nonroot. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-08 18:45:27 -04:00
Dmitry Kasatkin	8606404fa5	ima: digital signature verification support This patch adds support for digital signature based integrity appraisal. With this patch, 'security.ima' contains either the file data hash or a digital signature of the file data hash. The file data hash provides the security attribute of file integrity. In addition to file integrity, a digital signature provides the security attribute of authenticity. Unlike EVM, when the file metadata changes, the digital signature is replaced with an HMAC, modification of the file data does not cause the 'security.ima' digital signature to be replaced with a hash. As a result, after any modification, subsequent file integrity appraisals would fail. Although digitally signed files can be modified, but by not updating 'security.ima' to reflect these modifications, in essence digitally signed files could be considered 'immutable'. IMA uses a different keyring than EVM. While the EVM keyring should not be updated after initialization and locked, the IMA keyring should allow updating or adding new keys when upgrading or installing packages. Changelog v4: - Change IMA_DIGSIG to hex equivalent Changelog v3: - Permit files without any 'security.ima' xattr to be labeled properly. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2012-09-07 14:57:48 -04:00
Mimi Zohar	5a44b41207	ima: add support for different security.ima data types IMA-appraisal currently verifies the integrity of a file based on a known 'good' measurement value. This patch reserves the first byte of 'security.ima' as a place holder for the type of method used for verifying file data integrity. Changelog v1: - Use the newly defined 'struct evm_ima_xattr_data' Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@nokia.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2012-09-07 14:57:47 -04:00
Mimi Zohar	42c63330f2	ima: add ima_inode_setxattr/removexattr function and calls Based on xattr_permission comments, the restriction to modify 'security' xattr is left up to the underlying fs or lsm. Ensure that not just anyone can modify or remove 'security.ima'. Changelog v1: - Unless IMA-APPRAISE is configured, use stub ima_inode_removexattr()/setxattr() functions. (Moved ima_inode_removexattr()/setxattr() to ima_appraise.c) Changelog: - take i_mutex to fix locking (Dmitry Kasatkin) - ima_reset_appraise_flags should only be called when modifying or removing the 'security.ima' xattr. Requires CAP_SYS_ADMIN privilege. (Incorporated fix from Roberto Sassu) - Even if allowed to update security.ima, reset the appraisal flags, forcing re-appraisal. - Replace CAP_MAC_ADMIN with CAP_SYS_ADMIN - static inline ima_inode_setxattr()/ima_inode_removexattr() stubs - ima_protect_xattr should be static Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>	2012-09-07 14:57:47 -04:00
Dmitry Kasatkin	a10bf26b2f	ima: replace iint spinblock with rwlock/read_lock For performance, replace the iint spinlock with rwlock/read_lock. Eric Paris questioned this change, from spinlocks to rwlocks, saying "rwlocks have been shown to actually be slower on multi processor systems in a number of cases due to the cache line bouncing required." Based on performance measurements compiling the kernel on a cold boot with multiple jobs with/without this patch, Dmitry Kasatkin and I found that rwlocks performed better than spinlocks, but very insignificantly. For example with total compilation time around 6 minutes, with rwlocks time was 1 - 3 seconds shorter... but always like that. Changelog v2: - new patch taken from the 'allocating iint improvements' patch Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@us.ibm.com>	2012-09-07 14:57:46 -04:00
Dmitry Kasatkin	bf2276d10c	ima: allocating iint improvements With IMA-appraisal's removal of the iint mutex and taking the i_mutex instead, allocating the iint becomes a lot simplier, as we don't need to be concerned with two processes racing to allocate the iint. This patch cleans up and improves performance for allocating the iint. - removed redundant double i_mutex locking - combined iint allocation with tree search Changelog v2: - removed the rwlock/read_lock changes from this patch Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@us.ibm.com>	2012-09-07 14:57:45 -04:00
Mimi Zohar	07f6a79415	ima: add appraise action keywords and default rules Unlike the IMA measurement policy, the appraise policy can not be dependent on runtime process information, such as the task uid, as the 'security.ima' xattr is written on file close and must be updated each time the file changes, regardless of the current task uid. This patch extends the policy language with 'fowner', defines an appraise policy, which appraises all files owned by root, and defines 'ima_appraise_tcb', a new boot command line option, to enable the appraise policy. Changelog v3: - separate the measure from the appraise rules in order to support measuring without appraising and appraising without measuring. - change appraisal default for filesystems without xattr support to fail - update default appraise policy for cgroups Changelog v1: - don't appraise RAMFS (Dmitry Kasatkin) - merged rest of "ima: ima_must_appraise_or_measure API change" commit (Dmtiry Kasatkin) ima_must_appraise_or_measure() called ima_match_policy twice, which searched the policy for a matching rule. Once for a matching measurement rule and subsequently for an appraisal rule. Searching the policy twice is unnecessary overhead, which could be noticeable with a large policy. The new version of ima_must_appraise_or_measure() does everything in a single iteration using a new version of ima_match_policy(). It returns IMA_MEASURE, IMA_APPRAISE mask. With the use of action mask only one efficient matching function is enough. Removed other specific versions of matching functions. Changelog: - change 'owner' to 'fowner' to conform to the new LSM conditions posted by Roberto Sassu. - fix calls to ima_log_string() Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>	2012-09-07 14:57:45 -04:00
Mimi Zohar	2fe5d6def1	ima: integrity appraisal extension IMA currently maintains an integrity measurement list used to assert the integrity of the running system to a third party. The IMA-appraisal extension adds local integrity validation and enforcement of the measurement against a "good" value stored as an extended attribute 'security.ima'. The initial methods for validating 'security.ima' are hashed based, which provides file data integrity, and digital signature based, which in addition to providing file data integrity, provides authenticity. This patch creates and maintains the 'security.ima' xattr, containing the file data hash measurement. Protection of the xattr is provided by EVM, if enabled and configured. Based on policy, IMA calls evm_verifyxattr() to verify a file's metadata integrity and, assuming success, compares the file's current hash value with the one stored as an extended attribute in 'security.ima'. Changelov v4: - changed iint cache flags to hex values Changelog v3: - change appraisal default for filesystems without xattr support to fail Changelog v2: - fix audit msg 'res' value - removed unused 'ima_appraise=' values Changelog v1: - removed unused iint mutex (Dmitry Kasatkin) - setattr hook must not reset appraised (Dmitry Kasatkin) - evm_verifyxattr() now differentiates between no 'security.evm' xattr (INTEGRITY_NOLABEL) and no EVM 'protected' xattrs included in the 'security.evm' (INTEGRITY_NOXATTRS). - replace hash_status with ima_status (Dmitry Kasatkin) - re-initialize slab element ima_status on free (Dmitry Kasatkin) - include 'security.ima' in EVM if CONFIG_IMA_APPRAISE, not CONFIG_IMA - merged half "ima: ima_must_appraise_or_measure API change" (Dmitry Kasatkin) - removed unnecessary error variable in process_measurement() (Dmitry Kasatkin) - use ima_inode_post_setattr() stub function, if IMA_APPRAISE not configured (moved ima_inode_post_setattr() to ima_appraise.c) - make sure ima_collect_measurement() can read file Changelog: - add 'iint' to evm_verifyxattr() call (Dimitry Kasatkin) - fix the race condition between chmod, which takes the i_mutex and then iint->mutex, and ima_file_free() and process_measurement(), which take the locks in the reverse order, by eliminating iint->mutex. (Dmitry Kasatkin) - cleanup of ima_appraise_measurement() (Dmitry Kasatkin) - changes as a result of the iint not allocated for all regular files, but only for those measured/appraised. - don't try to appraise new/empty files - expanded ima_appraisal description in ima/Kconfig - IMA appraise definitions required even if IMA_APPRAISE not enabled - add return value to ima_must_appraise() stub - unconditionally set status = INTEGRITY_PASS after testing status, not before. (Found by Joe Perches) Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>	2012-09-07 14:57:44 -04:00
Kees Cook	2e4930eb7c	Yama: handle 32-bit userspace prctl When running a 64-bit kernel and receiving prctls from a 32-bit userspace, the "-1" used as an unsigned long will end up being misdetected. The kernel is looking for 0xffffffffffffffff instead of 0xffffffff. Since prctl lacks a distinct compat interface, Yama needs to handle this translation itself. As such, support either value as meaning PR_SET_PTRACER_ANY, to avoid breaking the ABI for 64-bit. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: John Johansen <john.johansen@canonical.com> Cc: stable@vger.kernel.org Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-09-08 01:06:14 +10:00
Kees Cook	c6993e4ac0	security: allow Yama to be unconditionally stacked Unconditionally call Yama when CONFIG_SECURITY_YAMA_STACKED is selected, no matter what LSM module is primary. Ubuntu and Chrome OS already carry patches to do this, and Fedora has voiced interest in doing this as well. Instead of having multiple distributions (or LSM authors) carrying these patches, just allow Yama to be called unconditionally when selected by the new CONFIG. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com> Acked-by: Eric Paris <eparis@redhat.com> Acked-by: John Johansen <john.johansen@canonical.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-09-05 14:12:31 -07:00
Paul Bolle	ec2e1ed2d7	AppArmor: remove af_names.h from .gitignore Commit `4fdef2183e` ("AppArmor: Cleanup make file to remove cruft and make it easier to read") removed all traces of af_names.h from the tree. Remove its entry in AppArmor's .gitignore file too. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2012-09-01 08:35:34 -07:00
Kent Yoder	20328b56cd	ima: enable the IBM vTPM as the default TPM in the PPC64 case Enable tpm_ibmvtpm driver by default when IMA is enabled on PPC64 Signed-off-by: Kent Yoder <key@linux.vnet.ibm.com>	2012-08-22 16:23:23 -05:00
Kent Yoder	41ab999c80	tpm: Move tpm_get_random api into the TPM device driver Move the tpm_get_random api from the trusted keys code into the TPM device driver itself so that other callers can make use of it. Also, change the api slightly so that the number of bytes read is returned in the call, since the TPM command can potentially return fewer bytes than requested. Acked-by: David Safford <safford@linux.vnet.ibm.com> Reviewed-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Kent Yoder <key@linux.vnet.ibm.com>	2012-08-22 11:11:33 -05:00
Tejun Heo	3b07e9ca26	workqueue: deprecate system_nrt[_freezable]_wq system_nrt[_freezable]_wq are now spurious. Mark them deprecated and convert all users to system[_freezable]_wq. If you're cc'd and wondering what's going on: Now all workqueues are non-reentrant, so there's no reason to use system_nrt[_freezable]_wq. Please use system[_freezable]_wq instead. This patch doesn't make any functional difference. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-By: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: David Airlie <airlied@linux.ie> Cc: Jiri Kosina <jkosina@suse.cz> Cc: "David S. Miller" <davem@davemloft.net> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: David Howells <dhowells@redhat.com>	2012-08-20 14:51:24 -07:00
Kees Cook	7612bfeecc	Yama: access task_struct->comm directly The core ptrace access checking routine holds a task lock, and when reporting a failure, Yama takes a separate task lock. To avoid a potential deadlock with two ptracers taking the opposite locks, do not use get_task_comm() and just use ->comm directly since accuracy is not important for the report. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Suggested-by: Oleg Nesterov <oleg@redhat.com> CC: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: John Johansen <john.johansen@canonical.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-08-17 20:40:38 +10:00
Kees Cook	9d8dad742a	Yama: higher restrictions should block PTRACE_TRACEME The higher ptrace restriction levels should be blocking even PTRACE_TRACEME requests. The comments in the LSM documentation are misleading about when the checks happen (the parent does not go through security_ptrace_access_check() on a PTRACE_TRACEME call). Signed-off-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org # 3.5.x and later Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-08-10 19:58:07 +10:00
Mel Gorman	6290c2c439	selinux: tag avc cache alloc as non-critical Failing to allocate a cache entry will only harm performance not correctness. Do not consume valuable reserve pages for something like that. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Eric Paris <eparis@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: James Morris <jmorris@namei.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: David S. Miller <davem@davemloft.net> Cc: Eric B Munson <emunson@mgebm.net> Cc: Mel Gorman <mgorman@suse.de> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: Neil Brown <neilb@suse.de> Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Xiaotian Feng <dfeng@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-07-31 18:42:47 -07:00
Linus Torvalds	27c1ee3f92	Merge branch 'akpm' (Andrew's patch-bomb) Merge Andrew's first set of patches: "Non-MM patches: - lots of misc bits - tree-wide have_clk() cleanups - quite a lot of printk tweaks. I draw your attention to "printk: convert the format for KERN_<LEVEL> to a 2 byte pattern" which looks a bit scary. But afaict it's solid. - backlight updates - lib/ feature work (notably the addition and use of memweight()) - checkpatch updates - rtc updates - nilfs updates - fatfs updates (partial, still waiting for acks) - kdump, proc, fork, IPC, sysctl, taskstats, pps, etc - new fault-injection feature work" * Merge emailed patches from Andrew Morton <akpm@linux-foundation.org>: (128 commits) drivers/misc/lkdtm.c: fix missing allocation failure check lib/scatterlist: do not re-write gfp_flags in __sg_alloc_table() fault-injection: add tool to run command with failslab or fail_page_alloc fault-injection: add selftests for cpu and memory hotplug powerpc: pSeries reconfig notifier error injection module memory: memory notifier error injection module PM: PM notifier error injection module cpu: rewrite cpu-notifier-error-inject module fault-injection: notifier error injection c/r: fcntl: add F_GETOWNER_UIDS option resource: make sure requested range is included in the root range include/linux/aio.h: cpp->C conversions fs: cachefiles: add support for large files in filesystem caching pps: return PTR_ERR on error in device_create taskstats: check nla_reserve() return sysctl: suppress kmemleak messages ipc: use Kconfig options for __ARCH_WANT_[COMPAT_]IPC_PARSE_VERSION ipc: compat: use signed size_t types for msgsnd and msgrcv ipc: allow compat IPC version field parsing if !ARCH_WANT_OLD_COMPAT_IPC ipc: add COMPAT_SHMLBA support ...	2012-07-30 17:25:34 -07:00
Cyrill Gorcunov	1d151c337d	c/r: fcntl: add F_GETOWNER_UIDS option When we restore file descriptors we would like them to look exactly as they were at dumping time. With help of fcntl it's almost possible, the missing snippet is file owners UIDs. To be able to read their values the F_GETOWNER_UIDS is introduced. This option is valid iif CONFIG_CHECKPOINT_RESTORE is turned on, otherwise returning -EINVAL. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-07-30 17:25:21 -07:00
Al Viro	e3fea3f70f	selinux: fix selinux_inode_setxattr oops OK, what we have so far is e.g. setxattr(path, name, whatever, 0, XATTR_REPLACE) with name being good enough to get through xattr_permission(). Then we reach security_inode_setxattr() with the desired value and size. Aha. name should begin with "security.selinux", or we won't get that far in selinux_inode_setxattr(). Suppose we got there and have enough permissions to relabel that sucker. We call security_context_to_sid() with value == NULL, size == 0. OK, we want ss_initialized to be non-zero. I.e. after everything had been set up and running. No problem... We do 1-byte kmalloc(), zero-length memcpy() (which doesn't oops, even thought the source is NULL) and put a NUL there. I.e. form an empty string. string_to_context_struct() is called and looks for the first ':' in there. Not found, -EINVAL we get. OK, security_context_to_sid_core() has rc == -EINVAL, force == 0, so it silently returns -EINVAL. All it takes now is not having CAP_MAC_ADMIN and we are fucked. All right, it might be a different bug (modulo strange code quoted in the report), but it's real. Easily fixed, AFAICS: Deal with size == 0, value == NULL case in selinux_inode_setxattr() Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Dave Jones <davej@redhat.com> Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-07-30 15:36:50 +10:00
Alan Cox	3b9fc37280	smack: off by one error Consider the input case of a rule that consists entirely of non space symbols followed by a \0. Say 64 + \0 In this case strlen(data) = 64 kzalloc of subject and object are 64 byte objects sscanfdata, "%s %s %s", subject, ...) will put 65 bytes into subject. Signed-off-by: Alan Cox <alan@linux.intel.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Cc: stable@vger.kernel.org Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-07-30 15:04:17 +10:00
Josh Boyer	8ded2bbc18	posix_types.h: Cleanup stale __NFDBITS and related definitions Recently, glibc made a change to suppress sign-conversion warnings in FD_SET (glibc commit ceb9e56b3d1). This uncovered an issue with the kernel's definition of __NFDBITS if applications #include <linux/types.h> after including <sys/select.h>. A build failure would be seen when passing the -Werror=sign-compare and -D_FORTIFY_SOURCE=2 flags to gcc. It was suggested that the kernel should either match the glibc definition of __NFDBITS or remove that entirely. The current in-kernel uses of __NFDBITS can be replaced with BITS_PER_LONG, and there are no uses of the related __FDELT and __FDMASK defines. Given that, we'll continue the cleanup that was started with commit `8b3d1cda4f` ("posix_types: Remove fd_set macros") and drop the remaining unused macros. Additionally, linux/time.h has similar macros defined that expand to nothing so we'll remove those at the same time. Reported-by: Jeff Law <law@redhat.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> CC: <stable@vger.kernel.org> Signed-off-by: Josh Boyer <jwboyer@redhat.com> [ .. and fix up whitespace as per akpm ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-07-26 13:36:43 -07:00
Linus Torvalds	3c4cfadef6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking changes from David S Miller: 1) Remove the ipv4 routing cache. Now lookups go directly into the FIB trie and use prebuilt routes cached there. No more garbage collection, no more rDOS attacks on the routing cache. Instead we now get predictable and consistent performance, no matter what the pattern of traffic we service. This has been almost 2 years in the making. Special thanks to Julian Anastasov, Eric Dumazet, Steffen Klassert, and others who have helped along the way. I'm sure that with a change of this magnitude there will be some kind of fallout, but such things ought the be simple to fix at this point. Luckily I'm not European so I'll be around all of August to fix things :-) The major stages of this work here are each fronted by a forced merge commit whose commit message contains a top-level description of the motivations and implementation issues. 2) Pre-demux of established ipv4 TCP sockets, saves a route demux on input. 3) TCP SYN/ACK performance tweaks from Eric Dumazet. 4) Add namespace support for netfilter L4 conntrack helpers, from Gao Feng. 5) Add config mechanism for Energy Efficient Ethernet to ethtool, from Yuval Mintz. 6) Remove quadratic behavior from /proc/net/unix, from Eric Dumazet. 7) Support for connection tracker helpers in userspace, from Pablo Neira Ayuso. 8) Allow userspace driven TX load balancing functions in TEAM driver, from Jiri Pirko. 9) Kill off NLMSG_PUT and RTA_PUT macros, more gross stuff with embedded gotos. 10) TCP Small Queues, essentially minimize the amount of TCP data queued up in the packet scheduler layer. Whereas the existing BQL (Byte Queue Limits) limits the pkt_sched --> netdevice queuing levels, this controls the TCP --> pkt_sched queueing levels. From Eric Dumazet. 11) Reduce the number of get_page/put_page ops done on SKB fragments, from Alexander Duyck. 12) Implement protection against blind resets in TCP (RFC 5961), from Eric Dumazet. 13) Support the client side of TCP Fast Open, basically the ability to send data in the SYN exchange, from Yuchung Cheng. Basically, the sender queues up data with a sendmsg() call using MSG_FASTOPEN, then they do the connect() which emits the queued up fastopen data. 14) Avoid all the problems we get into in TCP when timers or PMTU events hit a locked socket. The TCP Small Queues changes added a tcp_release_cb() that allows us to queue work up to the release_sock() caller, and that's what we use here too. From Eric Dumazet. 15) Zero copy on TX support for TUN driver, from Michael S. Tsirkin. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1870 commits) genetlink: define lockdep_genl_is_held() when CONFIG_LOCKDEP r8169: revert "add byte queue limit support". ipv4: Change rt->rt_iif encoding. net: Make skb->skb_iif always track skb->dev ipv4: Prepare for change of rt->rt_iif encoding. ipv4: Remove all RTCF_DIRECTSRC handliing. ipv4: Really ignore ICMP address requests/replies. decnet: Don't set RTCF_DIRECTSRC. net/ipv4/ip_vti.c: Fix __rcu warnings detected by sparse. ipv4: Remove redundant assignment rds: set correct msg_namelen openvswitch: potential NULL deref in sample() tcp: dont drop MTU reduction indications bnx2x: Add new 57840 device IDs tcp: avoid oops in tcp_metrics and reset tcpm_stamp niu: Change niu_rbr_fill() to use unlikely() to check niu_rbr_add_page() return value niu: Fix to check for dma mapping errors. net: Fix references to out-of-scope variables in put_cmsg_compat() net: ethernet: davinci_emac: add pm_runtime support net: ethernet: davinci_emac: Remove unnecessary #include ...	2012-07-24 10:01:50 -07:00
Linus Torvalds	e05644e17e	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "Nothing groundbreaking for this kernel, just cleanups and fixes, and a couple of Smack enhancements." * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (21 commits) Smack: Maintainer Record Smack: don't show empty rules when /smack/load or /smack/load2 is read Smack: user access check bounds Smack: onlycap limits on CAP_MAC_ADMIN Smack: fix smack_new_inode bogosities ima: audit is compiled only when enabled ima: ima_initialized is set only if successful ima: add policy for pseudo fs ima: remove unused cleanup functions ima: free securityfs violations file ima: use full pathnames in measurement list security: Fix nommu build. samples: seccomp: add .gitignore for untracked executables tpm: check the chip reference before using it TPM: fix memleak when register hardware fails TPM: chip disabled state erronously being reported as error MAINTAINERS: TPM maintainers' contacts update Merge branches 'next-queue' and 'next' into next Remove unused code from MPI library Revert "crypto: GnuPG based MPI lib - additional sources (part 4)" ...	2012-07-23 18:49:06 -07:00
Linus Torvalds	a66d2c8f7e	Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull the big VFS changes from Al Viro: "This one is big and changes quite a few things around VFS. What's in there: - the first of two really major architecture changes - death to open intents. The former is finally there; it was very long in making, but with Miklos getting through really hard and messy final push in fs/namei.c, we finally have it. Unlike his variant, this one doesn't introduce struct opendata; what we have instead is ->atomic_open() taking preallocated struct file * and passing everything via its fields. Instead of returning struct file , it returns -E... on error, 0 on success and 1 in "deal with it yourself" case (e.g. symlink found on server, etc.). See comments before fs/namei.c:atomic_open(). That made a lot of goodies finally possible and quite a few are in that pile: ->lookup(), ->d_revalidate() and ->create() do not get struct nameidata anymore; ->lookup() and ->d_revalidate() get lookup flags instead, ->create() gets "do we want it exclusive" flag. With the introduction of new helper (kern_path_locked()) we are rid of all struct nameidata instances outside of fs/namei.c; it's still visible in namei.h, but not for long. Come the next cycle, declaration will move either to fs/internal.h or to fs/namei.c itself. [me, miklos, hch] - The second major change: behaviour of final fput(). Now we have __fput() done without any locks held by caller and not from deep in call stack. That obviously lifts a lot of constraints on the locking in there. Moreover, it's legal now to call fput() from atomic contexts (which has immediately simplified life for aio.c). We also don't need anti-recursion logics in __scm_destroy() anymore. There is a price, though - the damn thing has become partially asynchronous. For fput() from normal process we are guaranteed that pending __fput() will be done before the caller returns to userland, exits or gets stopped for ptrace. For kernel threads and atomic contexts it's done via schedule_work(), so theoretically we might need a way to make sure it's finished; so far only one such place had been found, but there might be more. There's flush_delayed_fput() (do all pending __fput()) and there's __fput_sync() (fput() analog doing __fput() immediately). I hope we won't need them often; see warnings in fs/file_table.c for details. [me, based on task_work series from Oleg merged last cycle] - sync series from Jan - large part of "death to sync_supers()" work from Artem; the only bits missing here are exofs and ext4 ones. As far as I understand, those are going via the exofs and ext4 trees resp.; once they are in, we can put ->write_super() to the rest, along with the thread calling it. - preparatory bits from unionmount series (from dhowells). - assorted cleanups and fixes all over the place, as usual. This is not the last pile for this cycle; there's at least jlayton's ESTALE work and fsfreeze series (the latter - in dire need of fixes, so I'm not sure it'll make the cut this cycle). I'll probably throw symlink/hardlink restrictions stuff from Kees into the next pile, too. Plus there's a lot of misc patches I hadn't thrown into that one - it's large enough as it is..." * 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (127 commits) ext4: switch EXT4_IOC_RESIZE_FS to mnt_want_write_file() btrfs: switch btrfs_ioctl_balance() to mnt_want_write_file() switch dentry_open() to struct path, make it grab references itself spufs: shift dget/mntget towards dentry_open() zoran: don't bother with struct file * in zoran_map ecryptfs: don't reinvent the wheels, please - use struct completion don't expose I_NEW inodes via dentry->d_inode tidy up namei.c a bit unobfuscate follow_up() a bit ext3: pass custom EOF to generic_file_llseek_size() ext4: use core vfs llseek code for dir seeks vfs: allow custom EOF in generic_file_llseek code vfs: Avoid unnecessary WB_SYNC_NONE writeback during sys_sync and reorder sync passes vfs: Remove unnecessary flushing of block devices vfs: Make sys_sync writeout also block device inodes vfs: Create function for iterating over block devices vfs: Reorder operations during sys_sync quota: Move quota syncing to ->sync_fs method quota: Split dquot_quota_sync() to writeback and cache flushing part vfs: Move noop_backing_dev_info check from sync into writeback ...	2012-07-23 12:27:27 -07:00
Al Viro	765927b2d5	switch dentry_open() to struct path, make it grab references itself Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-23 00:01:29 +04:00
Al Viro	d35abdb288	hold task_lock around checks in keyctl Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-22 23:58:01 +04:00
Al Viro	67d1214551	merge task_work and rcu_head, get rid of separate allocation for keyring case task_work and rcu_head are identical now; merge them (calling the result struct callback_head, rcu_head #define'd to it), kill separate allocation in security/keys since we can just use cred->rcu now. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-22 23:57:56 +04:00
Al Viro	41f9d29f09	trimming task_work: kill ->data get rid of the only user of ->data; this is _not_ the final variant - in the end we'll have task_work and rcu_head identical and just use cred->rcu, at which point the separate allocation will be gone completely. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-22 23:57:54 +04:00
David S. Miller	abaa72d7fd	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c	2012-07-19 11:17:30 -07:00
Linus Torvalds	e2f3b78557	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull SELinux regression fixes from James Morris. Andrew Morton has a box that hit that open perms problem. I also renamed the "epollwakeup" selinux name for the new capability to be "block_suspend", to match the rename done by commit `d9914cf661` ("PM: Rename CAP_EPOLLWAKEUP to CAP_BLOCK_SUSPEND"). * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: SELinux: do not check open perms if they are not known to policy SELinux: include definition of new capabilities	2012-07-18 13:42:44 -07:00
Eric Paris	3d2195c332	SELinux: do not check open perms if they are not known to policy When I introduced open perms policy didn't understand them and I implemented them as a policycap. When I added the checking of open perm to truncate I forgot to conditionalize it on the userspace defined policy capability. Running an old policy with a new kernel will not check open on open(2) but will check it on truncate. Conditionalize the truncate check the same as the open check. Signed-off-by: Eric Paris <eparis@redhat.com> Cc: stable@vger.kernel.org # 3.4.x Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-07-16 11:41:47 +10:00
Eric Paris	64919e6091	SELinux: include definition of new capabilities The kernel has added CAP_WAKE_ALARM and CAP_EPOLLWAKEUP. We need to define these in SELinux so they can be mediated by policy. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-07-16 11:40:31 +10:00
Rafal Krypa	65ee7f45cf	Smack: don't show empty rules when /smack/load or /smack/load2 is read This patch removes empty rules (i.e. with access set to '-') from the rule list presented to user space. Smack by design never removes labels nor rules from its lists. Access for a rule may be set to '-' to effectively disable it. Such rules would show up in the listing generated when /smack/load or /smack/load2 is read. This may cause clutter if many rules were disabled. As a rule with access set to '-' is equivalent to no rule at all, they may be safely hidden from the listing. Targeted for git://git.gitorious.org/smack-next/kernel.git Signed-off-by: Rafal Krypa <r.krypa@samsung.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2012-07-13 15:49:24 -07:00
Casey Schaufler	3518721a89	Smack: user access check bounds Some of the bounds checking used on the /smack/access interface was lost when support for long labels was added. No kernel access checks are affected, however this is a case where /smack/access could be used incorrectly and fail to detect the error. This patch reintroduces the original checks. Targeted for git://git.gitorious.org/smack-next/kernel.git Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2012-07-13 15:49:24 -07:00
Casey Schaufler	1880eff77e	Smack: onlycap limits on CAP_MAC_ADMIN Smack is integrated with the POSIX capabilities scheme, using the capabilities CAP_MAC_OVERRIDE and CAP_MAC_ADMIN to determine if a process is allowed to ignore Smack checks or change Smack related data respectively. Smack provides an additional restriction that if an onlycap value is set by writing to /smack/onlycap only tasks with that Smack label are allowed to use CAP_MAC_OVERRIDE. This change adds CAP_MAC_ADMIN as a capability that is affected by the onlycap mechanism. Targeted for git://git.gitorious.org/smack-next/kernel.git Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2012-07-13 15:49:23 -07:00
Casey Schaufler	eb982cb4cf	Smack: fix smack_new_inode bogosities In January of 2012 Al Viro pointed out three bits of code that he titled "new_inode_smack bogosities". This patch repairs these errors. 1. smack_sb_kern_mount() included a NULL check that is impossible. The check and NULL case are removed. 2. smack_kb_kern_mount() included pointless locking. The locking is removed. Since this is the only place that lock was used the lock is removed from the superblock_smack structure. 3. smk_fill_super() incorrectly and unnecessarily set the Smack label for the smackfs root inode. The assignment has been removed. Targeted for git://gitorious.org/smack-next/kernel.git Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2012-07-13 15:49:23 -07:00
Dmitry Kasatkin	417c6c8ee2	ima: audit is compiled only when enabled IMA auditing code was compiled even when CONFIG_AUDIT was not enabled. This patch compiles auditing code only when possible and enabled. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2012-07-05 16:43:59 -04:00
Dmitry Kasatkin	7ff2267af5	ima: ima_initialized is set only if successful Set ima_initialized only if initialization was successful. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2012-07-05 16:43:57 -04:00
Dmitry Kasatkin	8445d64dd7	ima: add policy for pseudo fs Exclude DEVPTS and BINFMT filesystems from the measurement policy. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2012-07-05 16:42:33 -04:00
David S. Miller	c90a9bb907	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2012-07-05 03:44:25 -07:00
Paul Mundt	75331a597c	security: Fix nommu build. The security + nommu configuration presently blows up with an undefined reference to BDI_CAP_EXEC_MAP: security/security.c: In function 'mmap_prot': security/security.c:687:36: error: dereferencing pointer to incomplete type security/security.c:688:16: error: 'BDI_CAP_EXEC_MAP' undeclared (first use in this function) security/security.c:688:16: note: each undeclared identifier is reported only once for each function it appears in include backing-dev.h directly to fix it up. Signed-off-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-07-03 21:41:03 +10:00
Dmitry Kasatkin	c7de7adc18	ima: remove unused cleanup functions IMA cannot be used as module and does not need __exit functions. Removed them. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2012-07-02 16:43:30 -04:00
Dmitry Kasatkin	0ea4f8ae41	ima: free securityfs violations file On ima_fs_init() error, free securityfs violations file. Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@us.ibm.com>	2012-07-02 16:43:30 -04:00
Mimi Zohar	08e1b76ae3	ima: use full pathnames in measurement list The IMA measurement list contains filename hints, which can be ambigious without the full pathname. This patch replaces the filename hint with the full pathname, simplifying for userspace the correlating of file hash measurements with files. Change log v1: - Revert to short filenames, when full pathname is longer than IMA measurement buffer size. (Based on Dmitry's review) Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2012-07-02 16:43:29 -04:00
Paul Mundt	659b5e7652	security: Fix nommu build. The security + nommu configuration presently blows up with an undefined reference to BDI_CAP_EXEC_MAP: security/security.c: In function 'mmap_prot': security/security.c:687:36: error: dereferencing pointer to incomplete type security/security.c:688:16: error: 'BDI_CAP_EXEC_MAP' undeclared (first use in this function) security/security.c:688:16: note: each undeclared identifier is reported only once for each function it appears in include backing-dev.h directly to fix it up. Signed-off-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-07-02 23:56:04 +10:00
Pablo Neira Ayuso	a31f2d17b3	netlink: add netlink_kernel_cfg parameter to netlink_kernel_create This patch adds the following structure: struct netlink_kernel_cfg { unsigned int groups; void (input)(struct sk_buff skb); struct mutex *cb_mutex; }; That can be passed to netlink_kernel_create to set optional configurations for netlink kernel sockets. I've populated this structure by looking for NULL and zero parameters at the existing code. The remaining parameters that always need to be set are still left in the original interface. That includes optional parameters for the netlink socket creation. This allows easy extensibility of this interface in the future. This patch also adapts all callers to use this new interface. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-29 16:46:02 -07:00
David S. Miller	01f534d0ae	selinux: netlink: Move away from NLMSG_PUT(). And use nlmsg_data() while we're here too. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-26 21:54:06 -07:00
James Morris	66dd07b88a	Merge commit 'v3.5-rc2' into next	2012-06-10 22:52:10 +10:00
Alban Crequy	2597a8344c	netfilter: selinux: switch hook PFs to nfproto This patch is a cleanup. Use NFPROTO_* for consistency with other netfilter code. Signed-off-by: Alban Crequy <alban.crequy@collabora.co.uk> Reviewed-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk> Reviewed-by: Vincent Sanders <vincent.sanders@collabora.co.uk> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-07 14:58:43 +02:00
Linus Torvalds	1193755ac6	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs changes from Al Viro. "A lot of misc stuff. The obvious groups: * Miklos' atomic_open series; kills the damn abuse of ->d_revalidate() by NFS, which was the major stumbling block for all work in that area. * ripping security_file_mmap() and dealing with deadlocks in the area; sanitizing the neighborhood of vm_mmap()/vm_munmap() in general. * ->encode_fh() switched to saner API; insane fake dentry in mm/cleancache.c gone. * assorted annotations in fs (endianness, __user) * parts of Artem's ->s_dirty work (jff2 and reiserfs parts) * ->update_time() work from Josef. * other bits and pieces all over the place. Normally it would've been in two or three pull requests, but signal.git stuff had eaten a lot of time during this cycle ;-/" Fix up trivial conflicts in Documentation/filesystems/vfs.txt (the 'truncate_range' inode method was removed by the VM changes, the VFS update adds an 'update_time()' method), and in fs/btrfs/ulist.[ch] (due to sparse fix added twice, with other changes nearby). * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (95 commits) nfs: don't open in ->d_revalidate vfs: retry last component if opening stale dentry vfs: nameidata_to_filp(): don't throw away file on error vfs: nameidata_to_filp(): inline __dentry_open() vfs: do_dentry_open(): don't put filp vfs: split __dentry_open() vfs: do_last() common post lookup vfs: do_last(): add audit_inode before open vfs: do_last(): only return EISDIR for O_CREAT vfs: do_last(): check LOOKUP_DIRECTORY vfs: do_last(): make ENOENT exit RCU safe vfs: make follow_link check RCU safe vfs: do_last(): use inode variable vfs: do_last(): inline walk_component() vfs: do_last(): make exit RCU safe vfs: split do_lookup() Btrfs: move over to use ->update_time fs: introduce inode operation ->update_time reiserfs: get rid of resierfs_sync_super reiserfs: mark the superblock as dirty a bit later ...	2012-06-01 10:34:35 -07:00
Al Viro	98de59bfe4	take calculation of final prot in security_mmap_file() into a helper Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 10:37:17 -04:00
Al Viro	8b3ec6814c	take security_mmap_file() outside of ->mmap_sem Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-06-01 10:37:01 -04:00
Linus Torvalds	fb21affa49	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull second pile of signal handling patches from Al Viro: "This one is just task_work_add() series + remaining prereqs for it. There probably will be another pull request from that tree this cycle - at least for helpers, to get them out of the way for per-arch fixes remaining in the tree." Fix trivial conflict in kernel/irq/manage.c: the merge of Andrew's pile had brought in commit `97fd75b7b8` ("kernel/irq/manage.c: use the pr_foo() infrastructure to prefix printks") which changed one of the pr_err() calls that this merge moves around. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: keys: kill task_struct->replacement_session_keyring keys: kill the dummy key_replace_session_keyring() keys: change keyctl_session_to_parent() to use task_work_add() genirq: reimplement exit_irq_thread() hook via task_work_add() task_work_add: generic process-context callbacks avr32: missed _TIF_NOTIFY_RESUME on one of do_notify_resume callers parisc: need to check NOTIFY_RESUME when exiting from syscall move key_repace_session_keyring() into tracehook_notify_resume() TIF_NOTIFY_RESUME is defined on all targets now	2012-05-31 18:47:30 -07:00
Christopher Yeoh	ac34ebb3a6	aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector() A cleanup of rw_copy_check_uvector and compat_rw_copy_check_uvector after changes made to support CMA in an earlier patch. Rather than having an additional check_access parameter to these functions, the first paramater type is overloaded to allow the caller to specify CHECK_IOVEC_ONLY which means check that the contents of the iovec are valid, but do not check the memory that they point to. This is used by process_vm_readv/writev where we need to validate that a iovec passed to the syscall is valid but do not want to check the memory that it points to at this point because it refers to an address space in another process. Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:32 -07:00
Boaz Harrosh	81ab6e7b26	kmod: convert two call sites to call_usermodehelper_fns() Both kernel/sys.c && security/keys/request_key.c where inlining the exact same code as call_usermodehelper_fns(); So simply convert these sites to directly use call_usermodehelper_fns(). Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:28 -07:00
Andrew Morton	4f1c28d241	security/keys/keyctl.c: suppress memory allocation failure warning This allocation may be large. The code is probing to see if it will succeed and if not, it falls back to vmalloc(). We should suppress any page-allocation failure messages when the fallback happens. Reported-by: Dave Jones <davej@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:26 -07:00
Al Viro	e5467859f7	split ->file_mmap() into ->mmap_addr()/->mmap_file() ... i.e. file-dependent and address-dependent checks. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-31 13:11:54 -04:00
Al Viro	d007794a18	split cap_mmap_addr() out of cap_file_mmap() ... switch callers. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-31 13:10:54 -04:00
Al Viro	cc1dad7183	selinuxfs snprintf() misuses a) %d does _not_ produce a page worth of output b) snprintf() doesn't return negatives - it used to in old glibc, but that's the kernel... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-29 23:28:33 -04:00
David Howells	423b978802	KEYS: Fix some sparse warnings Fix some sparse warnings in the keyrings code: (1) compat_keyctl_instantiate_key_iov() should be static. (2) There were a couple of places where a pointer was being compared against integer 0 rather than NULL. (3) keyctl_instantiate_key_common() should not take a __user-labelled iovec pointer as the caller must have copied the iovec to kernel space. (4) __key_link_begin() takes and __key_link_end() releases keyring_serialise_link_sem under some circumstances and so this should be declared. Note that adding __acquires() and __releases() for this doesn't help cure the warnings messages - something only commenting out both helps. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-05-25 20:51:42 +10:00
Oleg Nesterov	413cd3d9ab	keys: change keyctl_session_to_parent() to use task_work_add() Change keyctl_session_to_parent() to use task_work_add() and move key_replace_session_keyring() logic into task_work->func(). Note that we do task_work_cancel() before task_work_add() to ensure that only one work can be pending at any time. This is important, we must not allow user-space to abuse the parent's ->task_works list. The callback, replace_session_keyring(), checks PF_EXITING. I guess this is not really needed but looks better. As a side effect, this fixes the (unlikely) race. The callers of key_replace_session_keyring() and keyctl_session_to_parent() lack the necessary barriers, the parent can miss the request. Now we can remove task_struct->replacement_session_keyring and related code. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Alexander Gordeev <agordeev@redhat.com> Cc: Chris Zankel <chris@zankel.net> Cc: David Smith <dsmith@redhat.com> Cc: "Frank Ch. Eigler" <fche@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-23 22:11:23 -04:00
Al Viro	1227dd773d	TIF_NOTIFY_RESUME is defined on all targets now Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-23 22:09:19 -04:00
Linus Torvalds	644473e9c6	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace enhancements from Eric Biederman: "This is a course correction for the user namespace, so that we can reach an inexpensive, maintainable, and reasonably complete implementation. Highlights: - Config guards make it impossible to enable the user namespace and code that has not been converted to be user namespace safe. - Use of the new kuid_t type ensures the if you somehow get past the config guards the kernel will encounter type errors if you enable user namespaces and attempt to compile in code whose permission checks have not been updated to be user namespace safe. - All uids from child user namespaces are mapped into the initial user namespace before they are processed. Removing the need to add an additional check to see if the user namespace of the compared uids remains the same. - With the user namespaces compiled out the performance is as good or better than it is today. - For most operations absolutely nothing changes performance or operationally with the user namespace enabled. - The worst case performance I could come up with was timing 1 billion cache cold stat operations with the user namespace code enabled. This went from 156s to 164s on my laptop (or 156ns to 164ns per stat operation). - (uid_t)-1 and (gid_t)-1 are reserved as an internal error value. Most uid/gid setting system calls treat these value specially anyway so attempting to use -1 as a uid would likely cause entertaining failures in userspace. - If setuid is called with a uid that can not be mapped setuid fails. I have looked at sendmail, login, ssh and every other program I could think of that would call setuid and they all check for and handle the case where setuid fails. - If stat or a similar system call is called from a context in which we can not map a uid we lie and return overflowuid. The LFS experience suggests not lying and returning an error code might be better, but the historical precedent with uids is different and I can not think of anything that would break by lying about a uid we can't map. - Capabilities are localized to the current user namespace making it safe to give the initial user in a user namespace all capabilities. My git tree covers all of the modifications needed to convert the core kernel and enough changes to make a system bootable to runlevel 1." Fix up trivial conflicts due to nearby independent changes in fs/stat.c * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits) userns: Silence silly gcc warning. cred: use correct cred accessor with regards to rcu read lock userns: Convert the move_pages, and migrate_pages permission checks to use uid_eq userns: Convert cgroup permission checks to use uid_eq userns: Convert tmpfs to use kuid and kgid where appropriate userns: Convert sysfs to use kgid/kuid where appropriate userns: Convert sysctl permission checks to use kuid and kgids. userns: Convert proc to use kuid/kgid where appropriate userns: Convert ext4 to user kuid/kgid where appropriate userns: Convert ext3 to use kuid/kgid where appropriate userns: Convert ext2 to use kuid/kgid where appropriate. userns: Convert devpts to use kuid/kgid where appropriate userns: Convert binary formats to use kuid/kgid where appropriate userns: Add negative depends on entries to avoid building code that is userns unsafe userns: signal remove unnecessary map_cred_ns userns: Teach inode_capable to understand inodes whose uids map to other namespaces. userns: Fail exec for suid and sgid binaries with ids outside our user namespace. userns: Convert stat to return values mapped from kuids and kgids userns: Convert user specfied uids and gids in chown into kuids and kgid userns: Use uid_eq gid_eq helpers when comparing kuids and kgids in the vfs ...	2012-05-23 17:42:39 -07:00
Linus Torvalds	88d6ae8dc3	Merge branch 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "cgroup file type addition / removal is updated so that file types are added and removed instead of individual files so that dynamic file type addition / removal can be implemented by cgroup and used by controllers. blkio controller changes which will come through block tree are dependent on this. Other changes include res_counter cleanup and disallowing kthread / PF_THREAD_BOUND threads to be attached to non-root cgroups. There's a reported bug with the file type addition / removal handling which can lead to oops on cgroup umount. The issue is being looked into. It shouldn't cause problems for most setups and isn't a security concern." Fix up trivial conflict in Documentation/feature-removal-schedule.txt * 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (21 commits) res_counter: Account max_usage when calling res_counter_charge_nofail() res_counter: Merge res_counter_charge and res_counter_charge_nofail cgroups: disallow attaching kthreadd or PF_THREAD_BOUND threads cgroup: remove cgroup_subsys->populate() cgroup: get rid of populate for memcg cgroup: pass struct mem_cgroup instead of struct cgroup to socket memcg cgroup: make css->refcnt clearing on cgroup removal optional cgroup: use negative bias on css->refcnt to block css_tryget() cgroup: implement cgroup_rm_cftypes() cgroup: introduce struct cfent cgroup: relocate __d_cgrp() and __d_cft() cgroup: remove cgroup_add_file[s]() cgroup: convert memcg controller to the new cftype interface memcg: always create memsw files if CONFIG_CGROUP_MEM_RES_CTLR_SWAP cgroup: convert all non-memcg controllers to the new cftype interface cgroup: relocate cftype and cgroup_subsys definitions in controllers cgroup: merge cft_release_agent cftype array into the base files array cgroup: implement cgroup_add_cftypes() and friends cgroup: build list of all cgroups under a given cgroupfs_root cgroup: move cgroup_clear_directory() call out of cgroup_populate_dir() ...	2012-05-22 17:40:19 -07:00
Linus Torvalds	cb60e3e65c	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "New notable features: - The seccomp work from Will Drewry - PR_{GET,SET}_NO_NEW_PRIVS from Andy Lutomirski - Longer security labels for Smack from Casey Schaufler - Additional ptrace restriction modes for Yama by Kees Cook" Fix up trivial context conflicts in arch/x86/Kconfig and include/linux/filter.h * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (65 commits) apparmor: fix long path failure due to disconnected path apparmor: fix profile lookup for unconfined ima: fix filename hint to reflect script interpreter name KEYS: Don't check for NULL key pointer in key_validate() Smack: allow for significantly longer Smack labels v4 gfp flags for security_inode_alloc()? Smack: recursive tramsmute Yama: replace capable() with ns_capable() TOMOYO: Accept manager programs which do not start with / . KEYS: Add invalidation support KEYS: Do LRU discard in full keyrings KEYS: Permit in-place link replacement in keyring list KEYS: Perform RCU synchronisation on keys prior to key destruction KEYS: Announce key type (un)registration KEYS: Reorganise keys Makefile KEYS: Move the key config into security/keys/Kconfig KEYS: Use the compat keyctl() syscall wrapper on Sparc64 for Sparc32 compat Yama: remove an unused variable samples/seccomp: fix dependencies on arch macros Yama: add additional ptrace scopes ...	2012-05-21 20:27:36 -07:00
James Morris	ff2bb047c4	Merge branch 'master' of git://git.infradead.org/users/eparis/selinux into next Per pull request, for 3.5.	2012-05-22 11:21:06 +10:00
John Johansen	cffee16e8b	apparmor: fix long path failure due to disconnected path BugLink: http://bugs.launchpad.net/bugs/955892 All failures from __d_path where being treated as disconnected paths, however __d_path can also fail when the generated pathname is too long. The initial ENAMETOOLONG error was being lost, and ENAMETOOLONG was only returned if the subsequent dentry_path call resulted in that error. Other wise if the path was split across a mount point such that the dentry_path fit within the buffer when the __d_path did not the failure was treated as a disconnected path. Signed-off-by: John Johansen <john.johansen@canonical.com>	2012-05-18 11:09:52 -07:00
John Johansen	bf83208e0b	apparmor: fix profile lookup for unconfined BugLink: http://bugs.launchpad.net/bugs/978038 also affects apparmor portion of BugLink: http://bugs.launchpad.net/bugs/987371 The unconfined profile is not stored in the regular profile list, but change_profile and exec transitions may want access to it when setting up specialized transitions like switch to the unconfined profile of a new policy namespace. Signed-off-by: John Johansen <john.johansen@canonical.com>	2012-05-18 11:09:28 -07:00
Mimi Zohar	fbbb456347	ima: fix filename hint to reflect script interpreter name When IMA was first upstreamed, the bprm filename and interp were always the same. Currently, the bprm->filename and bprm->interp are the same, except for when only bprm->interp contains the interpreter name. So instead of using the bprm->filename as the IMA filename hint in the measurement list, we could replace it with bprm->interp, but this feels too fragil. The following patch is not much better, but at least there is some indication that sometimes we're passing the filename and other times the interpreter name. Reported-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-05-16 10:36:41 +10:00
James Morris	12fa8a2732	Merge branch 'for-1205' of http://git.gitorious.org/smack-next/kernel into next Pull request from Casey.	2012-05-16 01:11:29 +10:00
David Howells	b404aef72f	KEYS: Don't check for NULL key pointer in key_validate() Don't bother checking for NULL key pointer in key_validate() as all of the places that call it will crash anyway if the relevant key pointer is NULL by the time they call key_validate(). Therefore, the checking must be done prior to calling here. Whilst we're at it, simplify the key_validate() function a bit and mark its argument const. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-05-16 00:54:33 +10:00
Casey Schaufler	f7112e6c9a	Smack: allow for significantly longer Smack labels v4 V4 updated to current linux-security#next Targeted for git://gitorious.org/smack-next/kernel.git Modern application runtime environments like to use naming schemes that are structured and generated without human intervention. Even though the Smack limit of 23 characters for a label name is perfectly rational for human use there have been complaints that the limit is a problem in environments where names are composed from a set or sources, including vendor, author, distribution channel and application name. Names like softwarehouse-pgwodehouse-coolappstore-mellowmuskrats are becoming harder to avoid. This patch introduces long label support in Smack. Labels are now limited to 255 characters instead of the old 23. The primary reason for limiting the labels to 23 characters was so they could be directly contained in CIPSO category sets. This is still done were possible, but for labels that are too large a mapping is required. This is perfectly safe for communication that stays "on the box" and doesn't require much coordination between boxes beyond what would have been required to keep label names consistent. The bulk of this patch is in smackfs, adding and updating administrative interfaces. Because existing APIs can't be changed new ones that do much the same things as old ones have been introduced. The Smack specific CIPSO data representation has been removed and replaced with the data format used by netlabel. The CIPSO header is now computed when a label is imported rather than on use. This results in improved IP performance. The smack label is now allocated separately from the containing structure, allowing for larger strings. Four new /smack interfaces have been introduced as four of the old interfaces strictly required labels be specified in fixed length arrays. The access interface is supplemented with the check interface: access "Subject Object rwxat" access2 "Subject Object rwaxt" The load interface is supplemented with the rules interface: load "Subject Object rwxat" load2 "Subject Object rwaxt" The load-self interface is supplemented with the self-rules interface: load-self "Subject Object rwxat" load-self2 "Subject Object rwaxt" The cipso interface is supplemented with the wire interface: cipso "Subject lvl cnt c1 c2 ..." cipso2 "Subject lvl cnt c1 c2 ..." The old interfaces are maintained for compatibility. Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2012-05-14 22:48:38 -07:00
Tetsuo Handa	ceffec5541	gfp flags for security_inode_alloc()? Dave Chinner wrote: > Yes, because you have no idea what the calling context is except > for the fact that is from somewhere inside filesystem code and the > filesystem could be holding locks. Therefore, GFP_NOFS is really the > only really safe way to allocate memory here. I see. Thank you. I'm not sure, but can call trace happen where somewhere inside network filesystem or stackable filesystem code with locks held invokes operations that involves GFP_KENREL memory allocation outside that filesystem? ---------- [PATCH] SMACK: Fix incorrect GFP_KERNEL usage. new_inode_smack() which can be called from smack_inode_alloc_security() needs to use GFP_NOFS like SELinux's inode_alloc_security() does, for security_inode_alloc() is called from inode_init_always() and inode_init_always() is called from xfs_inode_alloc() which is using GFP_NOFS. smack_inode_init_security() needs to use GFP_NOFS like selinux_inode_init_security() does, for initxattrs() callback function (e.g. btrfs_initxattrs()) which is called from security_inode_init_security() is using GFP_NOFS. smack_audit_rule_match() needs to use GFP_ATOMIC, for security_audit_rule_match() can be called from audit_filter_user_rules() and audit_filter_user_rules() is called from audit_filter_user() with RCU read lock held. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Casey Schaufler <cschaufler@cschaufler-intel.(none)>	2012-05-14 22:47:44 -07:00
Casey Schaufler	2267b13a7c	Smack: recursive tramsmute The transmuting directory feature of Smack requires that the transmuting attribute be explicitly set in all cases. It seems the users of this facility would expect that the transmuting attribute be inherited by subdirectories that are created in a transmuting directory. This does not seem to add any additional complexity to the understanding of how the system works. Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>	2012-05-14 22:45:17 -07:00
Kees Cook	2cc8a71641	Yama: replace capable() with ns_capable() When checking capabilities, the question we want to be asking is "does current() have the capability in the child's namespace?" Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-05-15 10:27:57 +10:00
Tetsuo Handa	77b513dda9	TOMOYO: Accept manager programs which do not start with / . The pathname of /usr/sbin/tomoyo-editpolicy seen from Ubuntu 12.04 Live CD is squashfs:/usr/sbin/tomoyo-editpolicy rather than /usr/sbin/tomoyo-editpolicy . Therefore, we need to accept manager programs which do not start with / . Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-05-15 10:24:29 +10:00
David Howells	fd75815f72	KEYS: Add invalidation support Add support for invalidating a key - which renders it immediately invisible to further searches and causes the garbage collector to immediately wake up, remove it from keyrings and then destroy it when it's no longer referenced. It's better not to do this with keyctl_revoke() as that marks the key to start returning -EKEYREVOKED to searches when what is actually desired is to have the key refetched. To invalidate a key the caller must be granted SEARCH permission by the key. This may be too strict. It may be better to also permit invalidation if the caller has any of READ, WRITE or SETATTR permission. The primary use for this is to evict keys that are cached in special keyrings, such as the DNS resolver or an ID mapper. Signed-off-by: David Howells <dhowells@redhat.com>	2012-05-11 10:56:56 +01:00
David Howells	31d5a79d7f	KEYS: Do LRU discard in full keyrings Do an LRU discard in keyrings that are full rather than returning ENFILE. To perform this, a time_t is added to the key struct and updated by the creation of a link to a key and by a key being found as the result of a search. At the completion of a successful search, the keyrings in the path between the root of the search and the first found link to it also have their last-used times updated. Note that discarding a link to a key from a keyring does not necessarily destroy the key as there may be references held by other places. An alternate discard method that might suffice is to perform FIFO discard from the keyring, using the spare 2-byte hole in the keylist header as the index of the next link to be discarded. This is useful when using a keyring as a cache for DNS results or foreign filesystem IDs. This can be tested by the following. As root do: echo 1000 >/proc/sys/kernel/keys/root_maxkeys kr=`keyctl newring foo @s` for ((i=0; i<2000; i++)); do keyctl add user a$i a $kr; done Without this patch ENFILE should be reported when the keyring fills up. With this patch, the keyring discards keys in an LRU fashion. Note that the stored LRU time has a granularity of 1s. After doing this, /proc/key-users can be observed and should show that most of the 2000 keys have been discarded: [root@andromeda ~]# cat /proc/key-users 0: 517 516/516 513/1000 5249/20000 The "513/1000" here is the number of quota-accounted keys present for this user out of the maximum permitted. In /proc/keys, the keyring shows the number of keys it has and the number of slots it has allocated: [root@andromeda ~]# grep foo /proc/keys 200c64c4 I--Q-- 1 perm 3b3f0000 0 0 keyring foo: 509/509 The maximum is (PAGE_SIZE - header) / key pointer size. That's typically 509 on a 64-bit system and 1020 on a 32-bit system. Signed-off-by: David Howells <dhowells@redhat.com>	2012-05-11 10:56:56 +01:00
David Howells	233e4735f2	KEYS: Permit in-place link replacement in keyring list Make use of the previous patch that makes the garbage collector perform RCU synchronisation before destroying defunct keys. Key pointers can now be replaced in-place without creating a new keyring payload and replacing the whole thing as the discarded keys will not be destroyed until all currently held RCU read locks are released. If the keyring payload space needs to be expanded or contracted, then a replacement will still need allocating, and the original will still have to be freed by RCU. Signed-off-by: David Howells <dhowells@redhat.com>	2012-05-11 10:56:56 +01:00
David Howells	65d87fe68a	KEYS: Perform RCU synchronisation on keys prior to key destruction Make the keys garbage collector invoke synchronize_rcu() prior to destroying keys with a zero usage count. This means that a key can be examined under the RCU read lock in the safe knowledge that it won't get deallocated until after the lock is released - even if its usage count becomes zero whilst we're looking at it. This is useful in keyring search vs key link. Consider a keyring containing a link to a key. That link can be replaced in-place in the keyring without requiring an RCU copy-and-replace on the keyring contents without breaking a search underway on that keyring when the displaced key is released, provided the key is actually destroyed only after the RCU read lock held by the search algorithm is released. This permits __key_link() to replace a key without having to reallocate the key payload. A key gets replaced if a new key being linked into a keyring has the same type and description. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Jeff Layton <jlayton@redhat.com>	2012-05-11 10:56:56 +01:00
David Howells	1eb1bcf5bf	KEYS: Announce key type (un)registration Announce the (un)registration of a key type in the core key code rather than in the callers. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Mimi Zohar <zohar@us.ibm.com>	2012-05-11 10:56:56 +01:00
David Howells	9f7ce8e249	KEYS: Reorganise keys Makefile Reorganise the keys directory Makefile to put all the core bits together and the type-specific bits after. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Mimi Zohar <zohar@us.ibm.com>	2012-05-11 10:56:56 +01:00
David Howells	f0894940ae	KEYS: Move the key config into security/keys/Kconfig Move the key config into security/keys/Kconfig as there are going to be a lot of key-related options. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Mimi Zohar <zohar@us.ibm.com>	2012-05-11 10:56:56 +01:00
Pablo Neira Ayuso	d16cf20e2f	netfilter: remove ip_queue support This patch removes ip_queue support which was marked as obsolete years ago. The nfnetlink_queue modules provides more advanced user-space packet queueing mechanism. This patch also removes capability code included in SELinux that refers to ip_queue. Otherwise, we break compilation. Several warning has been sent regarding this to the mailing list in the past month without anyone rising the hand to stop this with some strong argument. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-05-08 20:25:42 +02:00
James Morris	898bfc1d46	Linux 3.4-rc5 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQEcBAABAgAGBQJPnb50AAoJEHm+PkMAQRiGAE0H/A4zFZIUGmF3miKPDYmejmrZ oVDYxVAu6JHjHWhu8E3VsinvyVscowjV8dr15eSaQzmDmRkSHAnUQ+dB7Di7jLC2 MNopxsWjwyZ8zvvr3rFR76kjbWKk/1GYytnf7GPZLbJQzd51om2V/TY/6qkwiDSX U8Tt7ihSgHAezefqEmWp2X/1pxDCEt+VFyn9vWpkhgdfM1iuzF39MbxSZAgqDQ/9 JJrBHFXhArqJguhENwL7OdDzkYqkdzlGtS0xgeY7qio2CzSXxZXK4svT6FFGA8Za xlAaIvzslDniv3vR2ZKd6wzUwFHuynX222hNim3QMaYdXm012M+Nn1ufKYGFxI0= =4d4w -----END PGP SIGNATURE----- Merge tag 'v3.4-rc5' into next Linux 3.4-rc5 Merge to pull in prerequisite change for Smack: `86812bb0de` Requested by Casey.	2012-05-04 12:46:40 +10:00
Eric W. Biederman	18815a1808	userns: Convert capabilities related permsion checks - Use uid_eq when comparing kuids Use gid_eq when comparing kgids - Use make_kuid(user_ns, 0) to talk about the user_namespace root uid Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-03 03:28:40 -07:00
Eric W. Biederman	078de5f706	userns: Store uid and gid values in struct cred with kuid_t and kgid_t types cred.h and a few trivial users of struct cred are changed. The rest of the users of struct cred are left for other patches as there are too many changes to make in one go and leave the change reviewable. If the user namespace is disabled and CONFIG_UIDGID_STRICT_TYPE_CHECKS are disabled the code will contiue to compile and behave correctly. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-03 03:28:38 -07:00
Eric W. Biederman	ae2975bc34	userns: Convert group_info values from gid_t to kgid_t. As a first step to converting struct cred to be all kuid_t and kgid_t values convert the group values stored in group_info to always be kgid_t values. Unless user namespaces are used this change should have no effect. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-03 03:27:21 -07:00
Eric W. Biederman	783291e690	userns: Simplify the user_namespace by making userns->creator a kuid. - Transform userns->creator from a user_struct reference to a simple kuid_t, kgid_t pair. In cap_capable this allows the check to see if we are the creator of a namespace to become the classic suser style euid permission check. This allows us to remove the need for a struct cred in the mapping functions and still be able to dispaly the user namespace creators uid and gid as 0. - Remove the now unnecessary delayed_work in free_user_ns. All that is left for free_user_ns to do is to call kmem_cache_free and put_user_ns. Those functions can be called in any context so call them directly from free_user_ns removing the need for delayed work. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-04-26 02:00:59 -07:00
Dan Carpenter	08162e6a23	Yama: remove an unused variable GCC complains that we don't use "one" any more after `389da25f93` "Yama: add additional ptrace scopes". security/yama/yama_lsm.c:322:12: warning: ?one? defined but not used [-Wunused-variable] Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-04-23 17:20:22 +10:00
Kees Cook	389da25f93	Yama: add additional ptrace scopes This expands the available Yama ptrace restrictions to include two more modes. Mode 2 requires CAP_SYS_PTRACE for PTRACE_ATTACH, and mode 3 completely disables PTRACE_ATTACH (and locks the sysctl). Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-04-19 13:39:56 +10:00
Jonghwan Choi	51b79bee62	security: fix compile error in commoncap.c Add missing "personality.h" security/commoncap.c: In function 'cap_bprm_set_creds': security/commoncap.c:510: error: 'PER_CLEAR_ON_SETID' undeclared (first use in this function) security/commoncap.c:510: error: (Each undeclared identifier is reported only once security/commoncap.c:510: error: for each function it appears in.) Signed-off-by: Jonghwan Choi <jhbird.choi@samsung.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-04-19 12:56:39 +10:00
Eric Paris	d52fc5dde1	fcaps: clear the same personality flags as suid when fcaps are used If a process increases permissions using fcaps all of the dangerous personality flags which are cleared for suid apps should also be cleared. Thus programs given priviledge with fcaps will continue to have address space randomization enabled even if the parent tried to disable it to make it easier to attack. Signed-off-by: Eric Paris <eparis@redhat.com> Reviewed-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-04-18 12:37:56 +10:00
Casey Schaufler	86812bb0de	Smack: move label list initialization A kernel with Smack enabled will fail if tmpfs has xattr support. Move the initialization of predefined Smack label list entries to the LSM initialization from the smackfs setup. This became an issue when tmpfs acquired xattr support, but was never correct. Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-04-18 12:02:28 +10:00
John Johansen	c29bceb396	Fix execve behavior apparmor for PR_{GET,SET}_NO_NEW_PRIVS Add support for AppArmor to explicitly fail requested domain transitions if NO_NEW_PRIVS is set and the task is not unconfined. Transitions from unconfined are still allowed because this always results in a reduction of privileges. Acked-by: Eric Paris <eparis@redhat.com> Signed-off-by: Will Drewry <wad@chromium.org> Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net> v18: new acked-by, new description Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-04-14 11:13:18 +10:00
Andy Lutomirski	259e5e6c75	Add PR_{GET,SET}_NO_NEW_PRIVS to prevent execve from granting privs With this change, calling prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) disables privilege granting operations at execve-time. For example, a process will not be able to execute a setuid binary to change their uid or gid if this bit is set. The same is true for file capabilities. Additionally, LSM_UNSAFE_NO_NEW_PRIVS is defined to ensure that LSMs respect the requested behavior. To determine if the NO_NEW_PRIVS bit is set, a task may call prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0); It returns 1 if set and 0 if it is not set. If any of the arguments are non-zero, it will return -1 and set errno to -EINVAL. (PR_SET_NO_NEW_PRIVS behaves similarly.) This functionality is desired for the proposed seccomp filter patch series. By using PR_SET_NO_NEW_PRIVS, it allows a task to modify the system call behavior for itself and its child tasks without being able to impact the behavior of a more privileged task. Another potential use is making certain privileged operations unprivileged. For example, chroot may be considered "safe" if it cannot affect privileged tasks. Note, this patch causes execve to fail when PR_SET_NO_NEW_PRIVS is set and AppArmor is in use. It is fixed in a subsequent patch. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Will Drewry <wad@chromium.org> Acked-by: Eric Paris <eparis@redhat.com> Acked-by: Kees Cook <keescook@chromium.org> v18: updated change desc v17: using new define values as per 3.4 Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-04-14 11:13:18 +10:00
Kees Cook	923e9a1399	Smack: build when CONFIG_AUDIT not defined This fixes builds where CONFIG_AUDIT is not defined and CONFIG_SECURITY_SMACK=y. This got introduced by the stack-usage reducation commit `48c62af68a` ("LSM: shrink the common_audit_data data union"). Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Eric Paris <eparis@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-04-10 16:14:40 -07:00
Eric Paris	c737f8284c	SELinux: remove unused common_audit_data in flush_unauthorized_files We don't need this variable and it just eats stack space. Remove it. Signed-off-by: Eric Paris <eparis@redhat.com>	2012-04-09 12:23:57 -04:00
Wanlong Gao	562c99f20d	SELinux: avc: remove the useless fields in avc_add_callback avc_add_callback now just used for registering reset functions in initcalls, and the callback functions just did reset operations. So, reducing the arguments to only one event is enough now. Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com> Signed-off-by: Eric Paris <eparis@redhat.com>	2012-04-09 12:23:44 -04:00
Wanlong Gao	0b36e44cc6	SELinux: replace weak GFP_ATOMIC to GFP_KERNEL in avc_add_callback avc_add_callback now only called from initcalls, so replace the weak GFP_ATOMIC to GFP_KERNEL, and mark this function __init to make a warning when not been called from initcalls. Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com> Signed-off-by: Eric Paris <eparis@redhat.com>	2012-04-09 12:23:07 -04:00
Eric Paris	899838b25f	SELinux: unify the selinux_audit_data and selinux_late_audit_data We no longer need the distinction. We only need data after we decide to do an audit. So turn the "late" audit data into just "data" and remove what we currently have as "data". Signed-off-by: Eric Paris <eparis@redhat.com>	2012-04-09 12:23:06 -04:00
Eric Paris	1d34929271	SELinux: remove auditdeny from selinux_audit_data It's just takin' up space. Signed-off-by: Eric Paris <eparis@redhat.com>	2012-04-09 12:23:05 -04:00
Eric Paris	50c205f5e5	LSM: do not initialize common_audit_data to 0 It isn't needed. If you don't set the type of the data associated with that type it is a pretty obvious programming bug. So why waste the cycles? Signed-off-by: Eric Paris <eparis@redhat.com>	2012-04-09 12:23:04 -04:00
Eric Paris	07f62eb66c	LSM: BUILD_BUG_ON if the common_audit_data union ever grows We did a lot of work to shrink the common_audit_data. Add a BUILD_BUG_ON so future programers (let's be honest, probably me) won't do something foolish like make it large again! Signed-off-by: Eric Paris <eparis@redhat.com>	2012-04-09 12:23:03 -04:00

... 3 4 5 6 7 ...

2167 Commits