OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
David Howells	d54e58b7f0	KEYS: Fix the keyring hash function The keyring hash function (used by the associative array) is supposed to clear the bottommost nibble of the index key (where the hash value resides) for keyrings and make sure it is non-zero for non-keyrings. This is done to make keyrings cluster together on one branch of the tree separately to other keys. Unfortunately, the wrong mask is used, so only the bottom two bits are examined and cleared and not the whole bottom nibble. This means that keys and keyrings can still be successfully searched for under most circumstances as the hash is consistent in its miscalculation, but if a keyring's associative array bottom node gets filled up then approx 75% of the keyrings will not be put into the 0 branch. The consequence of this is that a key in a keyring linked to by another keyring, ie. keyring A -> keyring B -> key may not be found if the search starts at keyring A and then descends into keyring B because search_nested_keyrings() only searches up the 0 branch (as it "knows" all keyrings must be there and not elsewhere in the tree). The fix is to use the right mask. This can be tested with: r=`keyctl newring sandbox @s` for ((i=0; i<=16; i++)); do keyctl newring ring$i $r; done for ((i=0; i<=16; i++)); do keyctl add user a$i a %:ring$i; done for ((i=0; i<=16; i++)); do keyctl search $r user a$i; done This creates a sandbox keyring, then creates 17 keyrings therein (labelled ring0..ring16). This causes the root node of the sandbox's associative array to overflow and for the tree to have extra nodes inserted. Each keyring then is given a user key (labelled aN for ringN) for us to search for. We then search for the user keys we added, starting from the sandbox. If working correctly, it should return the same ordered list of key IDs as for...keyctl add... did. Without this patch, it reports ENOKEY "Required key not available" for some of the keys. Just which keys get this depends as the kernel pointer to the key type forms part of the hash function. Reported-by: Nalin Dahyabhai <nalin@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Stephen Gallagher <sgallagh@redhat.com>	2013-12-02 11:24:18 +00:00
David Howells	2480f57fb3	KEYS: Pre-clear struct key on allocation The second word of key->payload does not get initialised in key_alloc(), but the big_key type is relying on it having been cleared. The problem comes when big_key fails to instantiate a large key and doesn't then set the payload. The big_key_destroy() op is called from the garbage collector and this assumes that the dentry pointer stored in the second word will be NULL if instantiation did not complete. Therefore just pre-clear the entire struct key on allocation rather than trying to be clever and only initialising to 0 only those bits that aren't otherwise initialised. The lack of initialisation can lead to a bug report like the following if big_key failed to initialise its file: general protection fault: 0000 [#1] SMP Modules linked in: ... CPU: 0 PID: 51 Comm: kworker/0:1 Not tainted 3.10.0-53.el7.x86_64 #1 Hardware name: Dell Inc. PowerEdge 1955/0HC513, BIOS 1.4.4 12/09/2008 Workqueue: events key_garbage_collector task: ffff8801294f5680 ti: ffff8801296e2000 task.ti: ffff8801296e2000 RIP: 0010:[<ffffffff811b4a51>] dput+0x21/0x2d0 ... Call Trace: [<ffffffff811a7b06>] path_put+0x16/0x30 [<ffffffff81235604>] big_key_destroy+0x44/0x60 [<ffffffff8122dc4b>] key_gc_unused_keys.constprop.2+0x5b/0xe0 [<ffffffff8122df2f>] key_garbage_collector+0x1df/0x3c0 [<ffffffff8107759b>] process_one_work+0x17b/0x460 [<ffffffff8107834b>] worker_thread+0x11b/0x400 [<ffffffff81078230>] ? rescuer_thread+0x3e0/0x3e0 [<ffffffff8107eb00>] kthread+0xc0/0xd0 [<ffffffff8107ea40>] ? kthread_create_on_node+0x110/0x110 [<ffffffff815c4bec>] ret_from_fork+0x7c/0xb0 [<ffffffff8107ea40>] ? kthread_create_on_node+0x110/0x110 Reported-by: Patrik Kis <pkis@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Stephen Gallagher <sgallagh@redhat.com>	2013-12-02 11:24:18 +00:00
David Howells	62fe318256	KEYS: Fix keyring content gc scanner Key pointers stored in the keyring are marked in bit 1 to indicate if they point to a keyring. We need to strip off this bit before using the pointer when iterating over the keyring for the purpose of looking for links to garbage collect. This means that expirable keyrings aren't correctly expiring because the checker is seeing their key pointer with 2 added to it. Since the fix for this involves knowing about the internals of the keyring, key_gc_keyring() is moved to keyring.c and merged into keyring_gc(). This can be tested by: echo 2 >/proc/sys/kernel/keys/gc_delay keyctl timeout `keyctl add keyring qwerty "" @s` 2 cat /proc/keys sleep 5; cat /proc/keys which should see a keyring called "qwerty" appear in the session keyring and then disappear after it expires, and: echo 2 >/proc/sys/kernel/keys/gc_delay a=`keyctl get_persistent @s` b=`keyctl add keyring 0 "" $a` keyctl add user a a $b keyctl timeout $b 2 cat /proc/keys sleep 5; cat /proc/keys which should see a keyring called "0" with a key called "a" in it appear in the user's persistent keyring (which will be attached to the session keyring) and then both the "0" keyring and the "a" key should disappear when the "0" keyring expires. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Simo Sorce <simo@redhat.com>	2013-11-14 14:09:53 +00:00
David Howells	97826c821e	KEYS: Fix error handling in big_key instantiation In the big_key_instantiate() function we return 0 if kernel_write() returns us an error rather than returning an error. This can potentially lead to dentry_open() giving a BUG when called from big_key_read() with an unset tmpfile path. ------------[ cut here ]------------ kernel BUG at fs/open.c:798! ... RIP: 0010:[<ffffffff8119bbd1>] dentry_open+0xd1/0xe0 ... Call Trace: [<ffffffff812350c5>] big_key_read+0x55/0x100 [<ffffffff81231084>] keyctl_read_key+0xb4/0xe0 [<ffffffff81231e58>] SyS_keyctl+0xf8/0x1d0 [<ffffffff815bb799>] system_call_fastpath+0x16/0x1b Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Stephen Gallagher <sgallagh@redhat.com>	2013-11-13 16:51:06 +00:00
David Howells	fbf8c53f1a	KEYS: Fix UID check in keyctl_get_persistent() If the UID is specified by userspace when calling the KEYCTL_GET_PERSISTENT function and the process does not have the CAP_SETUID capability, then the function will return -EPERM if the current process's uid, suid, euid and fsuid all match the requested UID. This is incorrect. Fix it such that when a non-privileged caller requests a persistent keyring by a specific UID they can only request their own (ie. the specified UID matches either then process's UID or the process's EUID). This can be tested by logging in as the user and doing: keyctl get_persistent @p keyctl get_persistent @p `id -u` keyctl get_persistent @p 0 The first two should successfully print the same key ID. The third should do the same if called by UID 0 or indicate Operation Not Permitted otherwise. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Stephen Gallagher <sgallagh@redhat.com>	2013-11-06 14:01:51 +00:00
Wei Yongjun	d2b8697024	KEYS: fix error return code in big_key_instantiate() Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David Howells <dhowells@redhat.com>	2013-10-30 12:54:29 +00:00
David Howells	034faeb9ef	KEYS: Fix keyring quota misaccounting on key replacement and unlink If a key is displaced from a keyring by a matching one, then four more bytes of quota are allocated to the keyring - despite the fact that the keyring does not change in size. Further, when a key is unlinked from a keyring, the four bytes of quota allocated the link isn't recovered and returned to the user's pool. The first can be tested by repeating: keyctl add big_key a fred @s cat /proc/key-users (Don't put it in a shell loop otherwise the garbage collector won't have time to clear the displaced keys, thus affecting the result). This was causing the kerberos keyring to run out of room fairly quickly. The second can be tested by: cat /proc/key-users a=`keyctl add user a a @s` cat /proc/key-users keyctl unlink $a sleep 1 # Give RCU a chance to delete the key cat /proc/key-users assuming no system activity that otherwise adds/removes keys, the amount of key data allocated should go up (say 40/20000 -> 47/20000) and then return to the original value at the end. Reported-by: Stephen Gallagher <sgallagh@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com>	2013-10-30 11:15:24 +00:00
David Howells	74792b0001	KEYS: Fix a race between negating a key and reading the error set key_reject_and_link() marking a key as negative and setting the error with which it was negated races with keyring searches and other things that read that error. The fix is to switch the order in which the assignments are done in key_reject_and_link() and to use memory barriers. Kudos to Dave Wysochanski <dwysocha@redhat.com> and Scott Mayhew <smayhew@redhat.com> for tracking this down. This may be the cause of: BUG: unable to handle kernel NULL pointer dereference at 0000000000000070 IP: [<ffffffff81219011>] wait_for_key_construction+0x31/0x80 PGD c6b2c3067 PUD c59879067 PMD 0 Oops: 0000 [#1] SMP last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map CPU 0 Modules linked in: ... Pid: 13359, comm: amqzxma0 Not tainted 2.6.32-358.20.1.el6.x86_64 #1 IBM System x3650 M3 -[7945PSJ]-/00J6159 RIP: 0010:[<ffffffff81219011>] wait_for_key_construction+0x31/0x80 RSP: 0018:ffff880c6ab33758 EFLAGS: 00010246 RAX: ffffffff81219080 RBX: 0000000000000000 RCX: 0000000000000002 RDX: ffffffff81219060 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff880c6ab33768 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: ffff880adfcbce40 R13: ffffffffa03afb84 R14: ffff880adfcbce40 R15: ffff880adfcbce43 FS: 00007f29b8042700(0000) GS:ffff880028200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000070 CR3: 0000000c613dc000 CR4: 00000000000007f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process amqzxma0 (pid: 13359, threadinfo ffff880c6ab32000, task ffff880c610deae0) Stack: ffff880adfcbce40 0000000000000000 ffff880c6ab337b8 ffffffff81219695 <d> 0000000000000000 ffff880a000000d0 ffff880c6ab337a8 000000000000000f <d> ffffffffa03afb93 000000000000000f ffff88186c7882c0 0000000000000014 Call Trace: [<ffffffff81219695>] request_key+0x65/0xa0 [<ffffffffa03a0885>] nfs_idmap_request_key+0xc5/0x170 [nfs] [<ffffffffa03a0eb4>] nfs_idmap_lookup_id+0x34/0x80 [nfs] [<ffffffffa03a1255>] nfs_map_group_to_gid+0x75/0xa0 [nfs] [<ffffffffa039a9ad>] decode_getfattr_attrs+0xbdd/0xfb0 [nfs] [<ffffffff81057310>] ? __dequeue_entity+0x30/0x50 [<ffffffff8100988e>] ? __switch_to+0x26e/0x320 [<ffffffffa039ae03>] decode_getfattr+0x83/0xe0 [nfs] [<ffffffffa039b610>] ? nfs4_xdr_dec_getattr+0x0/0xa0 [nfs] [<ffffffffa039b69f>] nfs4_xdr_dec_getattr+0x8f/0xa0 [nfs] [<ffffffffa02dada4>] rpcauth_unwrap_resp+0x84/0xb0 [sunrpc] [<ffffffffa039b610>] ? nfs4_xdr_dec_getattr+0x0/0xa0 [nfs] [<ffffffffa02cf923>] call_decode+0x1b3/0x800 [sunrpc] [<ffffffff81096de0>] ? wake_bit_function+0x0/0x50 [<ffffffffa02cf770>] ? call_decode+0x0/0x800 [sunrpc] [<ffffffffa02d99a7>] __rpc_execute+0x77/0x350 [sunrpc] [<ffffffff81096c67>] ? bit_waitqueue+0x17/0xd0 [<ffffffffa02d9ce1>] rpc_execute+0x61/0xa0 [sunrpc] [<ffffffffa02d03a5>] rpc_run_task+0x75/0x90 [sunrpc] [<ffffffffa02d04c2>] rpc_call_sync+0x42/0x70 [sunrpc] [<ffffffffa038ff80>] _nfs4_call_sync+0x30/0x40 [nfs] [<ffffffffa038836c>] _nfs4_proc_getattr+0xac/0xc0 [nfs] [<ffffffff810aac87>] ? futex_wait+0x227/0x380 [<ffffffffa038b856>] nfs4_proc_getattr+0x56/0x80 [nfs] [<ffffffffa0371403>] __nfs_revalidate_inode+0xe3/0x220 [nfs] [<ffffffffa037158e>] nfs_revalidate_mapping+0x4e/0x170 [nfs] [<ffffffffa036f147>] nfs_file_read+0x77/0x130 [nfs] [<ffffffff811811aa>] do_sync_read+0xfa/0x140 [<ffffffff81096da0>] ? autoremove_wake_function+0x0/0x40 [<ffffffff8100bb8e>] ? apic_timer_interrupt+0xe/0x20 [<ffffffff8100b9ce>] ? common_interrupt+0xe/0x13 [<ffffffff81228ffb>] ? selinux_file_permission+0xfb/0x150 [<ffffffff8121bed6>] ? security_file_permission+0x16/0x20 [<ffffffff81181a95>] vfs_read+0xb5/0x1a0 [<ffffffff81181bd1>] sys_read+0x51/0x90 [<ffffffff810dc685>] ? __audit_syscall_exit+0x265/0x290 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b Signed-off-by: David Howells <dhowells@redhat.com> cc: Dave Wysochanski <dwysocha@redhat.com> cc: Scott Mayhew <smayhew@redhat.com>	2013-10-30 11:15:24 +00:00
Josh Boyer	2eaf6b5dca	KEYS: Make BIG_KEYS boolean Having the big_keys functionality as a module is very marginally useful. The userspace code that would use this functionality will get odd error messages from the keys layer if the module isn't loaded. The code itself is fairly small, so just have this as a boolean option and not a tristate. Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org> Signed-off-by: David Howells <dhowells@redhat.com>	2013-10-30 11:15:23 +00:00
Mimi Zohar	c124bde28b	KEYS: initialize root uid and session keyrings early In order to create the integrity keyrings (eg. _evm, _ima), root's uid and session keyrings need to be initialized early. Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Signed-off-by: David Howells <dhowells@redhat.com>	2013-09-25 17:17:01 +01:00
David Howells	008643b86c	KEYS: Add a 'trusted' flag and a 'trusted only' flag Add KEY_FLAG_TRUSTED to indicate that a key either comes from a trusted source or had a cryptographic signature chain that led back to a trusted key the kernel already possessed. Add KEY_FLAGS_TRUSTED_ONLY to indicate that a keyring will only accept links to keys marked with KEY_FLAGS_TRUSTED. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org>	2013-09-25 17:17:01 +01:00
David Howells	f36f8c75ae	KEYS: Add per-user_namespace registers for persistent per-UID kerberos caches Add support for per-user_namespace registers of persistent per-UID kerberos caches held within the kernel. This allows the kerberos cache to be retained beyond the life of all a user's processes so that the user's cron jobs can work. The kerberos cache is envisioned as a keyring/key tree looking something like: struct user_namespace \___ .krb_cache keyring - The register \___ _krb.0 keyring - Root's Kerberos cache \___ _krb.5000 keyring - User 5000's Kerberos cache \___ _krb.5001 keyring - User 5001's Kerberos cache \___ tkt785 big_key - A ccache blob \___ tkt12345 big_key - Another ccache blob Or possibly: struct user_namespace \___ .krb_cache keyring - The register \___ _krb.0 keyring - Root's Kerberos cache \___ _krb.5000 keyring - User 5000's Kerberos cache \___ _krb.5001 keyring - User 5001's Kerberos cache \___ tkt785 keyring - A ccache \___ krbtgt/REDHAT.COM@REDHAT.COM big_key \___ http/REDHAT.COM@REDHAT.COM user \___ afs/REDHAT.COM@REDHAT.COM user \___ nfs/REDHAT.COM@REDHAT.COM user \___ krbtgt/KERNEL.ORG@KERNEL.ORG big_key \___ http/KERNEL.ORG@KERNEL.ORG big_key What goes into a particular Kerberos cache is entirely up to userspace. Kernel support is limited to giving you the Kerberos cache keyring that you want. The user asks for their Kerberos cache by: krb_cache = keyctl_get_krbcache(uid, dest_keyring); The uid is -1 or the user's own UID for the user's own cache or the uid of some other user's cache (requires CAP_SETUID). This permits rpc.gssd or whatever to mess with the cache. The cache returned is a keyring named "_krb.<uid>" that the possessor can read, search, clear, invalidate, unlink from and add links to. Active LSMs get a chance to rule on whether the caller is permitted to make a link. Each uid's cache keyring is created when it first accessed and is given a timeout that is extended each time this function is called so that the keyring goes away after a while. The timeout is configurable by sysctl but defaults to three days. Each user_namespace struct gets a lazily-created keyring that serves as the register. The cache keyrings are added to it. This means that standard key search and garbage collection facilities are available. The user_namespace struct's register goes away when it does and anything left in it is then automatically gc'd. Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Simo Sorce <simo@redhat.com> cc: Serge E. Hallyn <serge.hallyn@ubuntu.com> cc: Eric W. Biederman <ebiederm@xmission.com>	2013-09-24 10:35:19 +01:00
David Howells	ab3c3587f8	KEYS: Implement a big key type that can save to tmpfs Implement a big key type that can save its contents to tmpfs and thus swapspace when memory is tight. This is useful for Kerberos ticket caches. Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Simo Sorce <simo@redhat.com>	2013-09-24 10:35:18 +01:00
David Howells	b2a4df200d	KEYS: Expand the capacity of a keyring Expand the capacity of a keyring to be able to hold a lot more keys by using the previously added associative array implementation. Currently the maximum capacity is: (PAGE_SIZE - sizeof(header)) / sizeof(struct key *) which, on a 64-bit system, is a little more 500. However, since this is being used for the NFS uid mapper, we need more than that. The new implementation gives us effectively unlimited capacity. With some alterations, the keyutils testsuite runs successfully to completion after this patch is applied. The alterations are because (a) keyrings that are simply added to no longer appear ordered and (b) some of the errors have changed a bit. Signed-off-by: David Howells <dhowells@redhat.com>	2013-09-24 10:35:18 +01:00
David Howells	e57e8669f2	KEYS: Drop the permissions argument from __keyring_search_one() Drop the permissions argument from __keyring_search_one() as the only caller passes 0 here - which causes all checks to be skipped. Signed-off-by: David Howells <dhowells@redhat.com>	2013-09-24 10:35:17 +01:00
David Howells	ccc3e6d9c9	KEYS: Define a __key_get() wrapper to use rather than atomic_inc() Define a __key_get() wrapper to use rather than atomic_inc() on the key usage count as this makes it easier to hook in refcount error debugging. Signed-off-by: David Howells <dhowells@redhat.com>	2013-09-24 10:35:16 +01:00
David Howells	d0a059cac6	KEYS: Search for auth-key by name rather than target key ID Search for auth-key by name rather than by target key ID as, in a future patch, we'll by searching directly by index key in preference to iteration over all keys. Signed-off-by: David Howells <dhowells@redhat.com>	2013-09-24 10:35:16 +01:00
David Howells	4bdf0bc300	KEYS: Introduce a search context structure Search functions pass around a bunch of arguments, each of which gets copied with each call. Introduce a search context structure to hold these. Whilst we're at it, create a search flag that indicates whether the search should be directly to the description or whether it should iterate through all keys looking for a non-description match. This will be useful when keyrings use a generic data struct with generic routines to manage their content as the search terms can just be passed through to the iterator callback function. Also, for future use, the data to be supplied to the match function is separated from the description pointer in the search context. This makes it clear which is being supplied. Signed-off-by: David Howells <dhowells@redhat.com>	2013-09-24 10:35:15 +01:00
David Howells	16feef4340	KEYS: Consolidate the concept of an 'index key' for key access Consolidate the concept of an 'index key' for accessing keys. The index key is the search term needed to find a key directly - basically the key type and the key description. We can add to that the description length. This will be useful when turning a keyring into an associative array rather than just a pointer block. Signed-off-by: David Howells <dhowells@redhat.com>	2013-09-24 10:35:15 +01:00
David Howells	7e55ca6dcd	KEYS: key_is_dead() should take a const key pointer argument key_is_dead() should take a const key pointer argument as it doesn't modify what it points to. Signed-off-by: David Howells <dhowells@redhat.com>	2013-09-24 10:35:14 +01:00
David Howells	a5b4bd2874	KEYS: Use bool in make_key_ref() and is_key_possessed() Make make_key_ref() take a bool possession parameter and make is_key_possessed() return a bool. Signed-off-by: David Howells <dhowells@redhat.com>	2013-09-24 10:35:14 +01:00
David Howells	61ea0c0ba9	KEYS: Skip key state checks when checking for possession Skip key state checks (invalidation, revocation and expiration) when checking for possession. Without this, keys that have been marked invalid, revoked keys and expired keys are not given a possession attribute - which means the possessor is not granted any possession permits and cannot do anything with them unless they also have one a user, group or other permit. This causes failures in the keyutils test suite's revocation and expiration tests now that commit `96b5c8fea6` reduced the initial permissions granted to a key. The failures are due to accesses to revoked and expired keys being given EACCES instead of EKEYREVOKED or EKEYEXPIRED. Signed-off-by: David Howells <dhowells@redhat.com>	2013-09-24 10:35:13 +01:00
Kent Overstreet	a27bb332c0	aio: don't include aio.h in sched.h Faster kernel compiles by way of fewer unnecessary includes. [akpm@linux-foundation.org: fix fallout] [akpm@linux-foundation.org: fix build] Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Zach Brown <zab@redhat.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Reviewed-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-07 20:16:25 -07:00
Lucas De Marchi	93997f6ddb	KEYS: split call to call_usermodehelper_fns() Use call_usermodehelper_setup() + call_usermodehelper_exec() instead of calling call_usermodehelper_fns(). In case there's an OOM in this last function the cleanup function may not be called - in this case we would miss a call to key_put(). Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi> Cc: Oleg Nesterov <oleg@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <james.l.morris@oracle.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Tejun Heo <tj@kernel.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-30 17:04:06 -07:00
Mathieu Desnoyers	8aec0f5d41	Fix: compat_rw_copy_check_uvector() misuse in aio, readv, writev, and security keys Looking at mm/process_vm_access.c:process_vm_rw() and comparing it to compat_process_vm_rw() shows that the compatibility code requires an explicit "access_ok()" check before calling compat_rw_copy_check_uvector(). The same difference seems to appear when we compare fs/read_write.c:do_readv_writev() to fs/compat.c:compat_do_readv_writev(). This subtle difference between the compat and non-compat requirements should probably be debated, as it seems to be error-prone. In fact, there are two others sites that use this function in the Linux kernel, and they both seem to get it wrong: Now shifting our attention to fs/aio.c, we see that aio_setup_iocb() also ends up calling compat_rw_copy_check_uvector() through aio_setup_vectored_rw(). Unfortunately, the access_ok() check appears to be missing. Same situation for security/keys/compat.c:compat_keyctl_instantiate_key_iov(). I propose that we add the access_ok() check directly into compat_rw_copy_check_uvector(), so callers don't have to worry about it, and it therefore makes the compat call code similar to its non-compat counterpart. Place the access_ok() check in the same location where copy_from_user() can trigger a -EFAULT error in the non-compat code, so the ABI behaviors are alike on both compat and non-compat. While we are here, fix compat_do_readv_writev() so it checks for compat_rw_copy_check_uvector() negative return values. And also, fix a memory leak in compat_keyctl_instantiate_key_iov() error handling. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-03-12 11:05:45 -07:00
David Howells	0da9dfdd2c	keys: fix race with concurrent install_user_keyrings() This fixes CVE-2013-1792. There is a race in install_user_keyrings() that can cause a NULL pointer dereference when called concurrently for the same user if the uid and uid-session keyrings are not yet created. It might be possible for an unprivileged user to trigger this by calling keyctl() from userspace in parallel immediately after logging in. Assume that we have two threads both executing lookup_user_key(), both looking for KEY_SPEC_USER_SESSION_KEYRING. THREAD A THREAD B =============================== =============================== ==>call install_user_keyrings(); if (!cred->user->session_keyring) ==>call install_user_keyrings() ... user->uid_keyring = uid_keyring; if (user->uid_keyring) return 0; <== key = cred->user->session_keyring [== NULL] user->session_keyring = session_keyring; atomic_inc(&key->usage); [oops] At the point thread A dereferences cred->user->session_keyring, thread B hasn't updated user->session_keyring yet, but thread A assumes it is populated because install_user_keyrings() returned ok. The race window is really small but can be exploited if, for example, thread B is interrupted or preempted after initializing uid_keyring, but before doing setting session_keyring. This couldn't be reproduced on a stock kernel. However, after placing systemtap probe on 'user->session_keyring = session_keyring;' that introduced some delay, the kernel could be crashed reliably. Fix this by checking both pointers before deciding whether to return. Alternatively, the test could be done away with entirely as it is checked inside the mutex - but since the mutex is global, that may not be the best way. Signed-off-by: David Howells <dhowells@redhat.com> Reported-by: Mateusz Guzik <mguzik@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: James Morris <james.l.morris@oracle.com>	2013-03-12 16:44:31 +11:00
Eric W. Biederman	ba0e3427b0	userns: Stop oopsing in key_change_session_keyring Dave Jones <davej@redhat.com> writes: > Just hit this on Linus' current tree. > > [ 89.621770] BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8 > [ 89.623111] IP: [<ffffffff810784b0>] commit_creds+0x250/0x2f0 > [ 89.624062] PGD 122bfd067 PUD 122bfe067 PMD 0 > [ 89.624901] Oops: 0000 [#1] PREEMPT SMP > [ 89.625678] Modules linked in: caif_socket caif netrom bridge hidp 8021q garp stp mrp rose llc2 af_rxrpc phonet af_key binfmt_misc bnep l2tp_ppp can_bcm l2tp_core pppoe pppox can_raw scsi_transport_iscsi ppp_generic slhc nfnetlink can ipt_ULOG ax25 decnet irda nfc rds x25 crc_ccitt appletalk atm ipx p8023 psnap p8022 llc lockd sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables btusb bluetooth snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_pcm vhost_net snd_page_alloc snd_timer tun macvtap usb_debug snd rfkill microcode macvlan edac_core pcspkr serio_raw kvm_amd soundcore kvm r8169 mii > [ 89.637846] CPU 2 > [ 89.638175] Pid: 782, comm: trinity-main Not tainted 3.8.0+ #63 Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H > [ 89.639850] RIP: 0010:[<ffffffff810784b0>] [<ffffffff810784b0>] commit_creds+0x250/0x2f0 > [ 89.641161] RSP: 0018:ffff880115657eb8 EFLAGS: 00010207 > [ 89.641984] RAX: 00000000000003e8 RBX: ffff88012688b000 RCX: 0000000000000000 > [ 89.643069] RDX: 0000000000000000 RSI: ffffffff81c32960 RDI: ffff880105839600 > [ 89.644167] RBP: ffff880115657ed8 R08: 0000000000000000 R09: 0000000000000000 > [ 89.645254] R10: 0000000000000001 R11: 0000000000000246 R12: ffff880105839600 > [ 89.646340] R13: ffff88011beea490 R14: ffff88011beea490 R15: 0000000000000000 > [ 89.647431] FS: 00007f3ac063b740(0000) GS:ffff88012b200000(0000) knlGS:0000000000000000 > [ 89.648660] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > [ 89.649548] CR2: 00000000000000c8 CR3: 0000000122bfc000 CR4: 00000000000007e0 > [ 89.650635] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 89.651723] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > [ 89.652812] Process trinity-main (pid: 782, threadinfo ffff880115656000, task ffff88011beea490) > [ 89.654128] Stack: > [ 89.654433] 0000000000000000 ffff8801058396a0 ffff880105839600 ffff88011beeaa78 > [ 89.655769] ffff880115657ef8 ffffffff812c7d9b ffffffff82079be0 0000000000000000 > [ 89.657073] ffff880115657f28 ffffffff8106c665 0000000000000002 ffff880115657f58 > [ 89.658399] Call Trace: > [ 89.658822] [<ffffffff812c7d9b>] key_change_session_keyring+0xfb/0x140 > [ 89.659845] [<ffffffff8106c665>] task_work_run+0xa5/0xd0 > [ 89.660698] [<ffffffff81002911>] do_notify_resume+0x71/0xb0 > [ 89.661581] [<ffffffff816c9a4a>] int_signal+0x12/0x17 > [ 89.662385] Code: 24 90 00 00 00 48 8b b3 90 00 00 00 49 8b 4c 24 40 48 39 f2 75 08 e9 83 00 00 00 48 89 ca 48 81 fa 60 29 c3 81 0f 84 41 fe ff ff <48> 8b 8a c8 00 00 00 48 39 ce 75 e4 3b 82 d0 00 00 00 0f 84 4b > [ 89.667778] RIP [<ffffffff810784b0>] commit_creds+0x250/0x2f0 > [ 89.668733] RSP <ffff880115657eb8> > [ 89.669301] CR2: 00000000000000c8 > > My fastest trinity induced oops yet! > > > Appears to be.. > > if ((set_ns == subset_ns->parent) && > 850: 48 8b 8a c8 00 00 00 mov 0xc8(%rdx),%rcx > > from the inlined cred_cap_issubset By historical accident we have been reading trying to set new->user_ns from new->user_ns. Which is totally silly as new->user_ns is NULL (as is every other field in new except session_keyring at that point). The intent is clearly to copy all of the fields from old to new so copy old->user_ns into into new->user_ns. Cc: stable@vger.kernel.org Reported-by: Dave Jones <davej@redhat.com> Tested-by: Dave Jones <davej@redhat.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-03-03 19:35:38 -08:00
David Howells	fe9453a1dc	KEYS: Revert one application of "Fix unreachable code" patch A patch to fix some unreachable code in search_my_process_keyrings() got applied twice by two different routes upstream as commits `e67eab39be` and `b010520ab3` (both "fix unreachable code"). Unfortunately, the second application removed something it shouldn't have and this wasn't detected by GIT. This is due to the patch not having sufficient lines of context to distinguish the two places of application. The effect of this is relatively minor: inside the kernel, the keyring search routines may search multiple keyrings and then prioritise the errors if no keys or negative keys are found in any of them. With the extra deletion, the presence of a negative key in the thread keyring (causing ENOKEY) is incorrectly overridden by an error searching the process keyring. So revert the second application of the patch. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-21 07:56:25 -08:00
Alan Cox	e67eab39be	keys: fix unreachable code We set ret to NULL then test it. Remove the bogus test Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-20 17:40:21 -08:00
Linus Torvalds	2a74dbb9a8	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "A quiet cycle for the security subsystem with just a few maintenance updates." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: Smack: create a sysfs mount point for smackfs Smack: use select not depends in Kconfig Yama: remove locking from delete path Yama: add RCU to drop read locking drivers/char/tpm: remove tasklet and cleanup KEYS: Use keyring_alloc() to create special keyrings KEYS: Reduce initial permissions on keys KEYS: Make the session and process keyrings per-thread seccomp: Make syscall skipping and nr changes more consistent key: Fix resource leak keys: Fix unreachable code KEYS: Add payload preparsing opportunity prior to key instantiate or update	2012-12-16 15:40:50 -08:00
Jiri Kosina	3bd7bf1f0f	Merge branch 'master' into for-next Sync up with Linus' tree to be able to apply Cesar's patch against newer version of the code. Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2012-10-28 19:29:19 +01:00
Alan Cox	b010520ab3	keys: Fix unreachable code We set ret to NULL then test it. Remove the bogus test Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2012-10-25 18:00:27 +02:00
Linus Torvalds	d25282d1c9	Merge branch 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module signing support from Rusty Russell: "module signing is the highlight, but it's an all-over David Howells frenzy..." Hmm "Magrathea: Glacier signing key". Somebody has been reading too much HHGTTG. * 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (37 commits) X.509: Fix indefinite length element skip error handling X.509: Convert some printk calls to pr_devel asymmetric keys: fix printk format warning MODSIGN: Fix 32-bit overflow in X.509 certificate validity date checking MODSIGN: Make mrproper should remove generated files. MODSIGN: Use utf8 strings in signer's name in autogenerated X.509 certs MODSIGN: Use the same digest for the autogen key sig as for the module sig MODSIGN: Sign modules during the build process MODSIGN: Provide a script for generating a key ID from an X.509 cert MODSIGN: Implement module signature checking MODSIGN: Provide module signing public keys to the kernel MODSIGN: Automatically generate module signing keys if missing MODSIGN: Provide Kconfig options MODSIGN: Provide gitignore and make clean rules for extra files MODSIGN: Add FIPS policy module: signature checking hook X.509: Add a crypto key parser for binary (DER) X.509 certificates MPILIB: Provide a function to read raw data into an MPI X.509: Add an ASN.1 decoder X.509: Add simple ASN.1 grammar compiler ...	2012-10-14 13:39:34 -07:00
David Howells	cf7f601c06	KEYS: Add payload preparsing opportunity prior to key instantiate or update Give the key type the opportunity to preparse the payload prior to the instantiation and update routines being called. This is done with the provision of two new key type operations: int (preparse)(struct key_preparsed_payload prep); void (free_preparse)(struct key_preparsed_payload prep); If the first operation is present, then it is called before key creation (in the add/update case) or before the key semaphore is taken (in the update and instantiate cases). The second operation is called to clean up if the first was called. preparse() is given the opportunity to fill in the following structure: struct key_preparsed_payload { char description; void type_data[2]; void payload; const void data; size_t datalen; size_t quotalen; }; Before the preparser is called, the first three fields will have been cleared, the payload pointer and size will be stored in data and datalen and the default quota size from the key_type struct will be stored into quotalen. The preparser may parse the payload in any way it likes and may store data in the type_data[] and payload fields for use by the instantiate() and update() ops. The preparser may also propose a description for the key by attaching it as a string to the description field. This can be used by passing a NULL or "" description to the add_key() system call or the key_create_or_update() function. This cannot work with request_key() as that required the description to tell the upcall about the key to be created. This, for example permits keys that store PGP public keys to generate their own name from the user ID and public key fingerprint in the key. The instantiate() and update() operations are then modified to look like this: int (instantiate)(struct key key, struct key_preparsed_payload prep); int (update)(struct key key, struct key_preparsed_payload prep); and the new payload data is passed in *prep, whether or not it was preparsed. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-10-08 13:49:48 +10:30
Linus Torvalds	88265322c1	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "Highlights: - Integrity: add local fs integrity verification to detect offline attacks - Integrity: add digital signature verification - Simple stacking of Yama with other LSMs (per LSS discussions) - IBM vTPM support on ppc64 - Add new driver for Infineon I2C TIS TPM - Smack: add rule revocation for subject labels" Fixed conflicts with the user namespace support in kernel/auditsc.c and security/integrity/ima/ima_policy.c. * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (39 commits) Documentation: Update git repository URL for Smack userland tools ima: change flags container data type Smack: setprocattr memory leak fix Smack: implement revoking all rules for a subject label Smack: remove task_wait() hook. ima: audit log hashes ima: generic IMA action flag handling ima: rename ima_must_appraise_or_measure audit: export audit_log_task_info tpm: fix tpm_acpi sparse warning on different address spaces samples/seccomp: fix 31 bit build on s390 ima: digital signature verification support ima: add support for different security.ima data types ima: add ima_inode_setxattr/removexattr function and calls ima: add inode_post_setattr call ima: replace iint spinblock with rwlock/read_lock ima: allocating iint improvements ima: add appraise action keywords and default rules ima: integrity appraisal extension vfs: move ima_file_free before releasing the file ...	2012-10-02 21:38:48 -07:00
David Howells	4442d7704c	Merge branch 'modsign-keys-devel' into security-next-keys Signed-off-by: David Howells <dhowells@redhat.com>	2012-10-02 19:30:19 +01:00
David Howells	f8aa23a55f	KEYS: Use keyring_alloc() to create special keyrings Use keyring_alloc() to create special keyrings now that it has a permissions parameter rather than using key_alloc() + key_instantiate_and_link(). Also document and export keyring_alloc() so that modules can use it too. Signed-off-by: David Howells <dhowells@redhat.com>	2012-10-02 19:24:56 +01:00
David Howells	96b5c8fea6	KEYS: Reduce initial permissions on keys Reduce the initial permissions on new keys to grant the possessor everything, view permission only to the user (so the keys can be seen in /proc/keys) and nothing else. This gives the creator a chance to adjust the permissions mask before other processes can access the new key or create a link to it. To aid with this, keyring_alloc() now takes a permission argument rather than setting the permissions itself. The following permissions are now set: (1) The user and user-session keyrings grant the user that owns them full permissions and grant a possessor everything bar SETATTR. (2) The process and thread keyrings grant the possessor full permissions but only grant the user VIEW. This permits the user to see them in /proc/keys, but not to do anything with them. (3) Anonymous session keyrings grant the possessor full permissions, but only grant the user VIEW and READ. This means that the user can see them in /proc/keys and can list them, but nothing else. Possibly READ shouldn't be provided either. (4) Named session keyrings grant everything an anonymous session keyring does, plus they grant the user LINK permission. The whole point of named session keyrings is that others can also subscribe to them. Possibly this should be a separate permission to LINK. (5) The temporary session keyring created by call_sbin_request_key() gets the same permissions as an anonymous session keyring. (6) Keys created by add_key() get VIEW, SEARCH, LINK and SETATTR for the possessor, plus READ and/or WRITE if the key type supports them. The used only gets VIEW now. (7) Keys created by request_key() now get the same as those created by add_key(). Reported-by: Lennart Poettering <lennart@poettering.net> Reported-by: Stef Walter <stefw@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com>	2012-10-02 19:24:56 +01:00
David Howells	3a50597de8	KEYS: Make the session and process keyrings per-thread Make the session keyring per-thread rather than per-process, but still inherited from the parent thread to solve a problem with PAM and gdm. The problem is that join_session_keyring() will reject attempts to change the session keyring of a multithreaded program but gdm is now multithreaded before it gets to the point of starting PAM and running pam_keyinit to create the session keyring. See: https://bugs.freedesktop.org/show_bug.cgi?id=49211 The reason that join_session_keyring() will only change the session keyring under a single-threaded environment is that it's hard to alter the other thread's credentials to effect the change in a multi-threaded program. The problems are such as: (1) How to prevent two threads both running join_session_keyring() from racing. (2) Another thread's credentials may not be modified directly by this process. (3) The number of threads is uncertain whilst we're not holding the appropriate spinlock, making preallocation slightly tricky. (4) We could use TIF_NOTIFY_RESUME and key_replace_session_keyring() to get another thread to replace its keyring, but that means preallocating for each thread. A reasonable way around this is to make the session keyring per-thread rather than per-process and just document that if you want a common session keyring, you must get it before you spawn any threads - which is the current situation anyway. Whilst we're at it, we can the process keyring behave in the same way. This means we can clean up some of the ickyness in the creds code. Basically, after this patch, the session, process and thread keyrings are about inheritance rules only and not about sharing changes of keyring. Reported-by: Mantas M. <grawity@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Ray Strode <rstrode@redhat.com>	2012-10-02 19:24:29 +01:00
Linus Torvalds	437589a74b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace changes from Eric Biederman: "This is a mostly modest set of changes to enable basic user namespace support. This allows the code to code to compile with user namespaces enabled and removes the assumption there is only the initial user namespace. Everything is converted except for the most complex of the filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs, nfs, ocfs2 and xfs as those patches need a bit more review. The strategy is to push kuid_t and kgid_t values are far down into subsystems and filesystems as reasonable. Leaving the make_kuid and from_kuid operations to happen at the edge of userspace, as the values come off the disk, and as the values come in from the network. Letting compile type incompatible compile errors (present when user namespaces are enabled) guide me to find the issues. The most tricky areas have been the places where we had an implicit union of uid and gid values and were storing them in an unsigned int. Those places were converted into explicit unions. I made certain to handle those places with simple trivial patches. Out of that work I discovered we have generic interfaces for storing quota by projid. I had never heard of the project identifiers before. Adding full user namespace support for project identifiers accounts for most of the code size growth in my git tree. Ultimately there will be work to relax privlige checks from "capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe allowing root in a user names to do those things that today we only forbid to non-root users because it will confuse suid root applications. While I was pushing kuid_t and kgid_t changes deep into the audit code I made a few other cleanups. I capitalized on the fact we process netlink messages in the context of the message sender. I removed usage of NETLINK_CRED, and started directly using current->tty. Some of these patches have also made it into maintainer trees, with no problems from identical code from different trees showing up in linux-next. After reading through all of this code I feel like I might be able to win a game of kernel trivial pursuit." Fix up some fairly trivial conflicts in netfilter uid/git logging code. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits) userns: Convert the ufs filesystem to use kuid/kgid where appropriate userns: Convert the udf filesystem to use kuid/kgid where appropriate userns: Convert ubifs to use kuid/kgid userns: Convert squashfs to use kuid/kgid where appropriate userns: Convert reiserfs to use kuid and kgid where appropriate userns: Convert jfs to use kuid/kgid where appropriate userns: Convert jffs2 to use kuid and kgid where appropriate userns: Convert hpfs to use kuid and kgid where appropriate userns: Convert btrfs to use kuid/kgid where appropriate userns: Convert bfs to use kuid/kgid where appropriate userns: Convert affs to use kuid/kgid wherwe appropriate userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids userns: On ia64 deal with current_uid and current_gid being kuid and kgid userns: On ppc convert current_uid from a kuid before printing. userns: Convert s390 getting uid and gid system calls to use kuid and kgid userns: Convert s390 hypfs to use kuid and kgid where appropriate userns: Convert binder ipc to use kuids userns: Teach security_path_chown to take kuids and kgids userns: Add user namespace support to IMA userns: Convert EVM to deal with kuids and kgids in it's hmac computation ...	2012-10-02 11:11:09 -07:00
Linus Torvalds	033d9959ed	Merge branch 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue changes from Tejun Heo: "This is workqueue updates for v3.7-rc1. A lot of activities this round including considerable API and behavior cleanups. * delayed_work combines a timer and a work item. The handling of the timer part has always been a bit clunky leading to confusing cancelation API with weird corner-case behaviors. delayed_work is updated to use new IRQ safe timer and cancelation now works as expected. * Another deficiency of delayed_work was lack of the counterpart of mod_timer() which led to cancel+queue combinations or open-coded timer+work usages. mod_delayed_work[_on]() are added. These two delayed_work changes make delayed_work provide interface and behave like timer which is executed with process context. * A work item could be executed concurrently on multiple CPUs, which is rather unintuitive and made flush_work() behavior confusing and half-broken under certain circumstances. This problem doesn't exist for non-reentrant workqueues. While non-reentrancy check isn't free, the overhead is incurred only when a work item bounces across different CPUs and even in simulated pathological scenario the overhead isn't too high. All workqueues are made non-reentrant. This removes the distinction between flush_[delayed_]work() and flush_[delayed_]_work_sync(). The former is now as strong as the latter and the specified work item is guaranteed to have finished execution of any previous queueing on return. * In addition to the various bug fixes, Lai redid and simplified CPU hotplug handling significantly. * Joonsoo introduced system_highpri_wq and used it during CPU hotplug. There are two merge commits - one to pull in IRQ safe timer from tip/timers/core and the other to pull in CPU hotplug fixes from wq/for-3.6-fixes as Lai's hotplug restructuring depended on them." Fixed a number of trivial conflicts, but the more interesting conflicts were silent ones where the deprecated interfaces had been used by new code in the merge window, and thus didn't cause any real data conflicts. Tejun pointed out a few of them, I fixed a couple more. * 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (46 commits) workqueue: remove spurious WARN_ON_ONCE(in_irq()) from try_to_grab_pending() workqueue: use cwq_set_max_active() helper for workqueue_set_max_active() workqueue: introduce cwq_set_max_active() helper for thaw_workqueues() workqueue: remove @delayed from cwq_dec_nr_in_flight() workqueue: fix possible stall on try_to_grab_pending() of a delayed work item workqueue: use hotcpu_notifier() for workqueue_cpu_down_callback() workqueue: use __cpuinit instead of __devinit for cpu callbacks workqueue: rename manager_mutex to assoc_mutex workqueue: WORKER_REBIND is no longer necessary for idle rebinding workqueue: WORKER_REBIND is no longer necessary for busy rebinding workqueue: reimplement idle worker rebinding workqueue: deprecate __cancel_delayed_work() workqueue: reimplement cancel_delayed_work() using try_to_grab_pending() workqueue: use mod_delayed_work() instead of __cancel + queue workqueue: use irqsafe timer for delayed_work workqueue: clean up delayed_work initializers and add missing one workqueue: make deferrable delayed_work initializer names consistent workqueue: cosmetic whitespace updates for macro definitions workqueue: deprecate system_nrt[_freezable]_wq workqueue: deprecate flush[_delayed]_work_sync() ...	2012-10-02 09:54:49 -07:00
Alan Cox	a84a921978	key: Fix resource leak On an error iov may still have been reallocated and need freeing Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: David Howells <dhowells@redhat.com>	2012-09-28 12:20:02 +01:00
Alan Cox	631527703d	keys: Fix unreachable code We set ret to NULL then test it. Remove the bogus test Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: David Howells <dhowells@redhat.com>	2012-09-28 12:20:02 +01:00
Eric W. Biederman	9a56c2db49	userns: Convert security/keys to the new userns infrastructure - Replace key_user ->user_ns equality checks with kuid_has_mapping checks. - Use from_kuid to generate key descriptions - Use kuid_t and kgid_t and the associated helpers instead of uid_t and gid_t - Avoid potential problems with file descriptor passing by displaying keys in the user namespace of the opener of key status proc files. Cc: linux-security-module@vger.kernel.org Cc: keyrings@linux-nfs.org Cc: David Howells <dhowells@redhat.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-13 18:28:02 -07:00
Oleg Nesterov	b3f68f16db	task_work: Revert "hold task_lock around checks in keyctl" This reverts commit `d35abdb288`. task_lock() was added to ensure exit_mm() and thus exit_task_work() is not possible before task_work_add(). This is wrong, task_lock() must not be nested with write_lock(tasklist). And this is no longer needed, task_work_add() now fails if it is called after exit_task_work(). Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20120826191214.GA4231@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-09-13 16:47:36 +02:00
David Howells	d4f65b5d24	KEYS: Add payload preparsing opportunity prior to key instantiate or update Give the key type the opportunity to preparse the payload prior to the instantiation and update routines being called. This is done with the provision of two new key type operations: int (preparse)(struct key_preparsed_payload prep); void (free_preparse)(struct key_preparsed_payload prep); If the first operation is present, then it is called before key creation (in the add/update case) or before the key semaphore is taken (in the update and instantiate cases). The second operation is called to clean up if the first was called. preparse() is given the opportunity to fill in the following structure: struct key_preparsed_payload { char description; void type_data[2]; void payload; const void data; size_t datalen; size_t quotalen; }; Before the preparser is called, the first three fields will have been cleared, the payload pointer and size will be stored in data and datalen and the default quota size from the key_type struct will be stored into quotalen. The preparser may parse the payload in any way it likes and may store data in the type_data[] and payload fields for use by the instantiate() and update() ops. The preparser may also propose a description for the key by attaching it as a string to the description field. This can be used by passing a NULL or "" description to the add_key() system call or the key_create_or_update() function. This cannot work with request_key() as that required the description to tell the upcall about the key to be created. This, for example permits keys that store PGP public keys to generate their own name from the user ID and public key fingerprint in the key. The instantiate() and update() operations are then modified to look like this: int (instantiate)(struct key key, struct key_preparsed_payload prep); int (update)(struct key key, struct key_preparsed_payload prep); and the new payload data is passed in *prep, whether or not it was preparsed. Signed-off-by: David Howells <dhowells@redhat.com>	2012-09-13 13:06:29 +01:00
Kent Yoder	41ab999c80	tpm: Move tpm_get_random api into the TPM device driver Move the tpm_get_random api from the trusted keys code into the TPM device driver itself so that other callers can make use of it. Also, change the api slightly so that the number of bytes read is returned in the call, since the TPM command can potentially return fewer bytes than requested. Acked-by: David Safford <safford@linux.vnet.ibm.com> Reviewed-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Kent Yoder <key@linux.vnet.ibm.com>	2012-08-22 11:11:33 -05:00
Tejun Heo	3b07e9ca26	workqueue: deprecate system_nrt[_freezable]_wq system_nrt[_freezable]_wq are now spurious. Mark them deprecated and convert all users to system[_freezable]_wq. If you're cc'd and wondering what's going on: Now all workqueues are non-reentrant, so there's no reason to use system_nrt[_freezable]_wq. Please use system[_freezable]_wq instead. This patch doesn't make any functional difference. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-By: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: David Airlie <airlied@linux.ie> Cc: Jiri Kosina <jkosina@suse.cz> Cc: "David S. Miller" <davem@davemloft.net> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: David Howells <dhowells@redhat.com>	2012-08-20 14:51:24 -07:00
Linus Torvalds	e05644e17e	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "Nothing groundbreaking for this kernel, just cleanups and fixes, and a couple of Smack enhancements." * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (21 commits) Smack: Maintainer Record Smack: don't show empty rules when /smack/load or /smack/load2 is read Smack: user access check bounds Smack: onlycap limits on CAP_MAC_ADMIN Smack: fix smack_new_inode bogosities ima: audit is compiled only when enabled ima: ima_initialized is set only if successful ima: add policy for pseudo fs ima: remove unused cleanup functions ima: free securityfs violations file ima: use full pathnames in measurement list security: Fix nommu build. samples: seccomp: add .gitignore for untracked executables tpm: check the chip reference before using it TPM: fix memleak when register hardware fails TPM: chip disabled state erronously being reported as error MAINTAINERS: TPM maintainers' contacts update Merge branches 'next-queue' and 'next' into next Remove unused code from MPI library Revert "crypto: GnuPG based MPI lib - additional sources (part 4)" ...	2012-07-23 18:49:06 -07:00
Al Viro	d35abdb288	hold task_lock around checks in keyctl Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-22 23:58:01 +04:00
Al Viro	67d1214551	merge task_work and rcu_head, get rid of separate allocation for keyring case task_work and rcu_head are identical now; merge them (calling the result struct callback_head, rcu_head #define'd to it), kill separate allocation in security/keys since we can just use cred->rcu now. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-22 23:57:56 +04:00
Al Viro	41f9d29f09	trimming task_work: kill ->data get rid of the only user of ->data; this is _not_ the final variant - in the end we'll have task_work and rcu_head identical and just use cred->rcu, at which point the separate allocation will be gone completely. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-22 23:57:54 +04:00
James Morris	66dd07b88a	Merge commit 'v3.5-rc2' into next	2012-06-10 22:52:10 +10:00
Linus Torvalds	fb21affa49	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull second pile of signal handling patches from Al Viro: "This one is just task_work_add() series + remaining prereqs for it. There probably will be another pull request from that tree this cycle - at least for helpers, to get them out of the way for per-arch fixes remaining in the tree." Fix trivial conflict in kernel/irq/manage.c: the merge of Andrew's pile had brought in commit `97fd75b7b8` ("kernel/irq/manage.c: use the pr_foo() infrastructure to prefix printks") which changed one of the pr_err() calls that this merge moves around. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: keys: kill task_struct->replacement_session_keyring keys: kill the dummy key_replace_session_keyring() keys: change keyctl_session_to_parent() to use task_work_add() genirq: reimplement exit_irq_thread() hook via task_work_add() task_work_add: generic process-context callbacks avr32: missed _TIF_NOTIFY_RESUME on one of do_notify_resume callers parisc: need to check NOTIFY_RESUME when exiting from syscall move key_repace_session_keyring() into tracehook_notify_resume() TIF_NOTIFY_RESUME is defined on all targets now	2012-05-31 18:47:30 -07:00
Christopher Yeoh	ac34ebb3a6	aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector() A cleanup of rw_copy_check_uvector and compat_rw_copy_check_uvector after changes made to support CMA in an earlier patch. Rather than having an additional check_access parameter to these functions, the first paramater type is overloaded to allow the caller to specify CHECK_IOVEC_ONLY which means check that the contents of the iovec are valid, but do not check the memory that they point to. This is used by process_vm_readv/writev where we need to validate that a iovec passed to the syscall is valid but do not want to check the memory that it points to at this point because it refers to an address space in another process. Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:32 -07:00
Boaz Harrosh	81ab6e7b26	kmod: convert two call sites to call_usermodehelper_fns() Both kernel/sys.c && security/keys/request_key.c where inlining the exact same code as call_usermodehelper_fns(); So simply convert these sites to directly use call_usermodehelper_fns(). Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:28 -07:00
Andrew Morton	4f1c28d241	security/keys/keyctl.c: suppress memory allocation failure warning This allocation may be large. The code is probing to see if it will succeed and if not, it falls back to vmalloc(). We should suppress any page-allocation failure messages when the fallback happens. Reported-by: Dave Jones <davej@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:26 -07:00
David Howells	423b978802	KEYS: Fix some sparse warnings Fix some sparse warnings in the keyrings code: (1) compat_keyctl_instantiate_key_iov() should be static. (2) There were a couple of places where a pointer was being compared against integer 0 rather than NULL. (3) keyctl_instantiate_key_common() should not take a __user-labelled iovec pointer as the caller must have copied the iovec to kernel space. (4) __key_link_begin() takes and __key_link_end() releases keyring_serialise_link_sem under some circumstances and so this should be declared. Note that adding __acquires() and __releases() for this doesn't help cure the warnings messages - something only commenting out both helps. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-05-25 20:51:42 +10:00
Oleg Nesterov	413cd3d9ab	keys: change keyctl_session_to_parent() to use task_work_add() Change keyctl_session_to_parent() to use task_work_add() and move key_replace_session_keyring() logic into task_work->func(). Note that we do task_work_cancel() before task_work_add() to ensure that only one work can be pending at any time. This is important, we must not allow user-space to abuse the parent's ->task_works list. The callback, replace_session_keyring(), checks PF_EXITING. I guess this is not really needed but looks better. As a side effect, this fixes the (unlikely) race. The callers of key_replace_session_keyring() and keyctl_session_to_parent() lack the necessary barriers, the parent can miss the request. Now we can remove task_struct->replacement_session_keyring and related code. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Alexander Gordeev <agordeev@redhat.com> Cc: Chris Zankel <chris@zankel.net> Cc: David Smith <dsmith@redhat.com> Cc: "Frank Ch. Eigler" <fche@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-23 22:11:23 -04:00
Al Viro	1227dd773d	TIF_NOTIFY_RESUME is defined on all targets now Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-23 22:09:19 -04:00
Linus Torvalds	644473e9c6	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace enhancements from Eric Biederman: "This is a course correction for the user namespace, so that we can reach an inexpensive, maintainable, and reasonably complete implementation. Highlights: - Config guards make it impossible to enable the user namespace and code that has not been converted to be user namespace safe. - Use of the new kuid_t type ensures the if you somehow get past the config guards the kernel will encounter type errors if you enable user namespaces and attempt to compile in code whose permission checks have not been updated to be user namespace safe. - All uids from child user namespaces are mapped into the initial user namespace before they are processed. Removing the need to add an additional check to see if the user namespace of the compared uids remains the same. - With the user namespaces compiled out the performance is as good or better than it is today. - For most operations absolutely nothing changes performance or operationally with the user namespace enabled. - The worst case performance I could come up with was timing 1 billion cache cold stat operations with the user namespace code enabled. This went from 156s to 164s on my laptop (or 156ns to 164ns per stat operation). - (uid_t)-1 and (gid_t)-1 are reserved as an internal error value. Most uid/gid setting system calls treat these value specially anyway so attempting to use -1 as a uid would likely cause entertaining failures in userspace. - If setuid is called with a uid that can not be mapped setuid fails. I have looked at sendmail, login, ssh and every other program I could think of that would call setuid and they all check for and handle the case where setuid fails. - If stat or a similar system call is called from a context in which we can not map a uid we lie and return overflowuid. The LFS experience suggests not lying and returning an error code might be better, but the historical precedent with uids is different and I can not think of anything that would break by lying about a uid we can't map. - Capabilities are localized to the current user namespace making it safe to give the initial user in a user namespace all capabilities. My git tree covers all of the modifications needed to convert the core kernel and enough changes to make a system bootable to runlevel 1." Fix up trivial conflicts due to nearby independent changes in fs/stat.c * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits) userns: Silence silly gcc warning. cred: use correct cred accessor with regards to rcu read lock userns: Convert the move_pages, and migrate_pages permission checks to use uid_eq userns: Convert cgroup permission checks to use uid_eq userns: Convert tmpfs to use kuid and kgid where appropriate userns: Convert sysfs to use kgid/kuid where appropriate userns: Convert sysctl permission checks to use kuid and kgids. userns: Convert proc to use kuid/kgid where appropriate userns: Convert ext4 to user kuid/kgid where appropriate userns: Convert ext3 to use kuid/kgid where appropriate userns: Convert ext2 to use kuid/kgid where appropriate. userns: Convert devpts to use kuid/kgid where appropriate userns: Convert binary formats to use kuid/kgid where appropriate userns: Add negative depends on entries to avoid building code that is userns unsafe userns: signal remove unnecessary map_cred_ns userns: Teach inode_capable to understand inodes whose uids map to other namespaces. userns: Fail exec for suid and sgid binaries with ids outside our user namespace. userns: Convert stat to return values mapped from kuids and kgids userns: Convert user specfied uids and gids in chown into kuids and kgid userns: Use uid_eq gid_eq helpers when comparing kuids and kgids in the vfs ...	2012-05-23 17:42:39 -07:00
David Howells	b404aef72f	KEYS: Don't check for NULL key pointer in key_validate() Don't bother checking for NULL key pointer in key_validate() as all of the places that call it will crash anyway if the relevant key pointer is NULL by the time they call key_validate(). Therefore, the checking must be done prior to calling here. Whilst we're at it, simplify the key_validate() function a bit and mark its argument const. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-05-16 00:54:33 +10:00
David Howells	fd75815f72	KEYS: Add invalidation support Add support for invalidating a key - which renders it immediately invisible to further searches and causes the garbage collector to immediately wake up, remove it from keyrings and then destroy it when it's no longer referenced. It's better not to do this with keyctl_revoke() as that marks the key to start returning -EKEYREVOKED to searches when what is actually desired is to have the key refetched. To invalidate a key the caller must be granted SEARCH permission by the key. This may be too strict. It may be better to also permit invalidation if the caller has any of READ, WRITE or SETATTR permission. The primary use for this is to evict keys that are cached in special keyrings, such as the DNS resolver or an ID mapper. Signed-off-by: David Howells <dhowells@redhat.com>	2012-05-11 10:56:56 +01:00
David Howells	31d5a79d7f	KEYS: Do LRU discard in full keyrings Do an LRU discard in keyrings that are full rather than returning ENFILE. To perform this, a time_t is added to the key struct and updated by the creation of a link to a key and by a key being found as the result of a search. At the completion of a successful search, the keyrings in the path between the root of the search and the first found link to it also have their last-used times updated. Note that discarding a link to a key from a keyring does not necessarily destroy the key as there may be references held by other places. An alternate discard method that might suffice is to perform FIFO discard from the keyring, using the spare 2-byte hole in the keylist header as the index of the next link to be discarded. This is useful when using a keyring as a cache for DNS results or foreign filesystem IDs. This can be tested by the following. As root do: echo 1000 >/proc/sys/kernel/keys/root_maxkeys kr=`keyctl newring foo @s` for ((i=0; i<2000; i++)); do keyctl add user a$i a $kr; done Without this patch ENFILE should be reported when the keyring fills up. With this patch, the keyring discards keys in an LRU fashion. Note that the stored LRU time has a granularity of 1s. After doing this, /proc/key-users can be observed and should show that most of the 2000 keys have been discarded: [root@andromeda ~]# cat /proc/key-users 0: 517 516/516 513/1000 5249/20000 The "513/1000" here is the number of quota-accounted keys present for this user out of the maximum permitted. In /proc/keys, the keyring shows the number of keys it has and the number of slots it has allocated: [root@andromeda ~]# grep foo /proc/keys 200c64c4 I--Q-- 1 perm 3b3f0000 0 0 keyring foo: 509/509 The maximum is (PAGE_SIZE - header) / key pointer size. That's typically 509 on a 64-bit system and 1020 on a 32-bit system. Signed-off-by: David Howells <dhowells@redhat.com>	2012-05-11 10:56:56 +01:00
David Howells	233e4735f2	KEYS: Permit in-place link replacement in keyring list Make use of the previous patch that makes the garbage collector perform RCU synchronisation before destroying defunct keys. Key pointers can now be replaced in-place without creating a new keyring payload and replacing the whole thing as the discarded keys will not be destroyed until all currently held RCU read locks are released. If the keyring payload space needs to be expanded or contracted, then a replacement will still need allocating, and the original will still have to be freed by RCU. Signed-off-by: David Howells <dhowells@redhat.com>	2012-05-11 10:56:56 +01:00
David Howells	65d87fe68a	KEYS: Perform RCU synchronisation on keys prior to key destruction Make the keys garbage collector invoke synchronize_rcu() prior to destroying keys with a zero usage count. This means that a key can be examined under the RCU read lock in the safe knowledge that it won't get deallocated until after the lock is released - even if its usage count becomes zero whilst we're looking at it. This is useful in keyring search vs key link. Consider a keyring containing a link to a key. That link can be replaced in-place in the keyring without requiring an RCU copy-and-replace on the keyring contents without breaking a search underway on that keyring when the displaced key is released, provided the key is actually destroyed only after the RCU read lock held by the search algorithm is released. This permits __key_link() to replace a key without having to reallocate the key payload. A key gets replaced if a new key being linked into a keyring has the same type and description. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Jeff Layton <jlayton@redhat.com>	2012-05-11 10:56:56 +01:00
David Howells	1eb1bcf5bf	KEYS: Announce key type (un)registration Announce the (un)registration of a key type in the core key code rather than in the callers. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Mimi Zohar <zohar@us.ibm.com>	2012-05-11 10:56:56 +01:00
David Howells	9f7ce8e249	KEYS: Reorganise keys Makefile Reorganise the keys directory Makefile to put all the core bits together and the type-specific bits after. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Mimi Zohar <zohar@us.ibm.com>	2012-05-11 10:56:56 +01:00
David Howells	f0894940ae	KEYS: Move the key config into security/keys/Kconfig Move the key config into security/keys/Kconfig as there are going to be a lot of key-related options. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Mimi Zohar <zohar@us.ibm.com>	2012-05-11 10:56:56 +01:00
Eric W. Biederman	ae2975bc34	userns: Convert group_info values from gid_t to kgid_t. As a first step to converting struct cred to be all kuid_t and kgid_t values convert the group values stored in group_info to always be kgid_t values. Unless user namespaces are used this change should have no effect. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-03 03:27:21 -07:00
Eric W. Biederman	0093ccb68f	cred: Refcount the user_ns pointed to by the cred. struct user_struct will shortly loose it's user_ns reference so make the cred user_ns reference a proper reference complete with reference counting. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-04-07 16:55:52 -07:00
Eric W. Biederman	c4a4d60379	userns: Use cred->user_ns instead of cred->user->user_ns Optimize performance and prepare for the removal of the user_ns reference from user_struct. Remove the slow long walk through cred->user->user_ns and instead go straight to cred->user_ns. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-04-07 16:55:51 -07:00
Oleg Nesterov	9d944ef32e	usermodehelper: kill umh_wait, renumber UMH_* constants No functional changes. It is not sane to use UMH_KILLABLE with enum umh_wait, but obviously we do not want another argument in call_usermodehelper_* helpers. Kill this enum, use the plain int. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Tejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-23 16:58:41 -07:00
Linus Torvalds	f63d395d47	NFS client updates for Linux 3.4 New features include: - Add NFS client support for containers. This should enable most of the necessary functionality, including lockd support, and support for rpc.statd, NFSv4 idmapper and RPCSEC_GSS upcalls into the correct network namespace from which the mount system call was issued. - NFSv4 idmapper scalability improvements Base the idmapper cache on the keyring interface to allow concurrent access to idmapper entries. Start the process of migrating users from the single-threaded daemon-based approach to the multi-threaded request-key based approach. - NFSv4.1 implementation id. Allows the NFSv4.1 client and server to mutually identify each other for logging and debugging purposes. - Support the 'vers=4.1' mount option for mounting NFSv4.1 instead of having to use the more counterintuitive 'vers=4,minorversion=1'. - SUNRPC tracepoints. Start the process of adding tracepoints in order to improve debugging of the RPC layer. - pNFS object layout support for autologin. Important bugfixes include: - Fix a bug in rpc_wake_up/rpc_wake_up_status that caused them to fail to wake up all tasks when applied to priority waitqueues. - Ensure that we handle read delegations correctly, when we try to truncate a file. - A number of fixes for NFSv4 state manager loops (mostly to do with delegation recovery). -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJPalZbAAoJEGcL54qWCgDyCi4P+QHcmzQhJO7HWx3Pzjs67bFT xMSYaKHGWS4AJKUBVl5OKBxUExfrMHBNbElV3IKUIwBlDx8RVtnwfptKSe146iki dn4TrRO5es8nmI4hRDcGMlzJDZq4y0Qg//qiUFmojiNW/Avw0ljfMoVUejJJ09FV oeDk4EGtcxkEyH+g48ZjYbyspRnG8qtD3atf70Z3lYE0ELdG/B5Dyzw1RDrA5p73 xJX3lqy8p/4ROzw/dmNoxdAXOrr3Q4/T58Bvp/lUglPy/EHyPmWzFoH0MU0C/PFu 5VnAl6QDbNCTcIw9FvJlX/mIyErpNG9eKzUskUc9L9SA+B+J/i4rIap4KATRN3nH 7QhE5qUacPuJnvxml7MPmlQTuft3fkAQ7NhKIWrbRi1QS9FmJC5NxctIb8loqlFn yIXdKeLfMshB+NyuFS9uzStX7SmV3eMgVd+5ZxRjYxm+PKJLw2KXeudArL6M5mHK 3QeKZpqwaYQ3RfaTNpvAp0doiXHCO5UbWfI0Pe8xQs/QcMCNReffqV2G4IJKFAu6 WpoN2UDQC9LCBifLw2nS7kku8+ZVXLQU8OC1NVl3TG15xD9cNLXuk3/y5llPGq4O odo52uLFpJohbDaHMj5RTKOfchTQCm2iyuVmxZEeAySypMSiAXmW7COSKHs/HxI1 VBm+EI00Pvmm5+fUjIlp =LuHE -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client updates for Linux 3.4 from Trond Myklebust: "New features include: - Add NFS client support for containers. This should enable most of the necessary functionality, including lockd support, and support for rpc.statd, NFSv4 idmapper and RPCSEC_GSS upcalls into the correct network namespace from which the mount system call was issued. - NFSv4 idmapper scalability improvements Base the idmapper cache on the keyring interface to allow concurrent access to idmapper entries. Start the process of migrating users from the single-threaded daemon-based approach to the multi-threaded request-key based approach. - NFSv4.1 implementation id. Allows the NFSv4.1 client and server to mutually identify each other for logging and debugging purposes. - Support the 'vers=4.1' mount option for mounting NFSv4.1 instead of having to use the more counterintuitive 'vers=4,minorversion=1'. - SUNRPC tracepoints. Start the process of adding tracepoints in order to improve debugging of the RPC layer. - pNFS object layout support for autologin. Important bugfixes include: - Fix a bug in rpc_wake_up/rpc_wake_up_status that caused them to fail to wake up all tasks when applied to priority waitqueues. - Ensure that we handle read delegations correctly, when we try to truncate a file. - A number of fixes for NFSv4 state manager loops (mostly to do with delegation recovery)." * tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (224 commits) NFS: fix sb->s_id in nfs debug prints xprtrdma: Remove assumption that each segment is <= PAGE_SIZE xprtrdma: The transport should not bug-check when a dup reply is received pnfs-obj: autologin: Add support for protocol autologin NFS: Remove nfs4_setup_sequence from generic rename code NFS: Remove nfs4_setup_sequence from generic unlink code NFS: Remove nfs4_setup_sequence from generic read code NFS: Remove nfs4_setup_sequence from generic write code NFS: Fix more NFS debug related build warnings SUNRPC/LOCKD: Fix build warnings when CONFIG_SUNRPC_DEBUG is undefined nfs: non void functions must return a value SUNRPC: Kill compiler warning when RPC_DEBUG is unset SUNRPC/NFS: Add Kbuild dependencies for NFS_DEBUG/RPC_DEBUG NFS: Use cond_resched_lock() to reduce latencies in the commit scans NFSv4: It is not safe to dereference lsp->ls_state in release_lockowner NFS: ncommit count is being double decremented SUNRPC: We must not use list_for_each_entry_safe() in rpc_wake_up() Try using machine credentials for RENEW calls NFSv4.1: Fix a few issues in filelayout_commit_pagelist NFSv4.1: Clean ups and bugfixes for the pNFS read/writeback/commit code ...	2012-03-23 08:53:47 -07:00
Dan Carpenter	f67dabbdde	KEYS: testing wrong bit for KEY_FLAG_REVOKED The test for "if (cred->request_key_auth->flags & KEY_FLAG_REVOKED) {" should actually testing that the (1 << KEY_FLAG_REVOKED) bit is set. The current code actually checks for KEY_FLAG_DEAD. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2012-03-07 11:12:06 +11:00
Bryan Schumaker	59e6b9c113	Created a function for setting timeouts on keys The keyctl_set_timeout function isn't exported to other parts of the kernel, but I want to use it for the NFS idmapper. I already have the key, but I wanted a generic way to set the timeout. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-03-01 16:50:31 -05:00
James Morris	9e3ff38647	Merge branch 'next-queue' into next	2012-02-09 17:02:34 +11:00
Linus Torvalds	7908b3ef68	Merge git://git.samba.org/sfrench/cifs-2.6 * git://git.samba.org/sfrench/cifs-2.6: CIFS: Rename UCS functions to UTF16 [CIFS] ACL and FSCACHE support no longer EXPERIMENTAL [CIFS] Fix build break with multiuser patch when LANMAN disabled cifs: warn about impending deprecation of legacy MultiuserMount code cifs: fetch credentials out of keyring for non-krb5 auth multiuser mounts cifs: sanitize username handling keys: add a "logon" key type cifs: lower default wsize when unix extensions are not used cifs: better instrumentation for coalesce_t2 cifs: integer overflow in parse_dacl() cifs: Fix sparse warning when calling cifs_strtoUCS CIFS: Add descriptions to the brlock cache functions	2012-01-23 08:59:49 -08:00
Mimi Zohar	f6b24579d0	keys: fix user_defined key sparse messages Replace the rcu_assign_pointer() calls with rcu_assign_keypointer(). Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2012-01-19 16:16:29 +11:00
David Howells	700920eb5b	KEYS: Allow special keyrings to be cleared The kernel contains some special internal keyrings, for instance the DNS resolver keyring : 2a93faf1 I----- 1 perm 1f030000 0 0 keyring .dns_resolver: empty It would occasionally be useful to allow the contents of such keyrings to be flushed by root (cache invalidation). Allow a flag to be set on a keyring to mark that someone possessing the sysadmin capability can clear the keyring, even without normal write access to the keyring. Set this flag on the special keyrings created by the DNS resolver, the NFS identity mapper and the CIFS identity mapper. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Jeff Layton <jlayton@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2012-01-19 14:38:51 +11:00
Jeff Layton	9f6ed2ca25	keys: add a "logon" key type For CIFS, we want to be able to store NTLM credentials (aka username and password) in the keyring. We do not, however want to allow users to fetch those keys back out of the keyring since that would be a security risk. Unfortunately, due to the nuances of key permission bits, it's not possible to do this. We need to grant search permissions so the kernel can find these keys, but that also implies permissions to read the payload. Resolve this by adding a new key_type. This key type is essentially the same as key_type_user, but does not define a .read op. This prevents the payload from ever being visible from userspace. This key type also vets the description to ensure that it's "qualified" by checking to ensure that it has a ':' in it that is preceded by other characters. Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>	2012-01-17 22:39:40 -06:00
Mimi Zohar	6ac6172a93	encrypted-keys: fix rcu and sparse messages Enabling CONFIG_PROVE_RCU and CONFIG_SPARSE_RCU_POINTER resulted in "suspicious rcu_dereference_check() usage!" and "incompatible types in comparison expression (different address spaces)" messages. Access the masterkey directly when holding the rwsem. Changelog v1: - Use either rcu_read_lock()/rcu_derefence_key()/rcu_read_unlock() or remove the unnecessary rcu_derefence() - David Howells Reported-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2012-01-18 10:41:30 +11:00
Mimi Zohar	ee0b31a25a	keys: fix trusted/encrypted keys sparse rcu_assign_pointer messages Define rcu_assign_keypointer(), which uses the key payload.rcudata instead of payload.data, to resolve the CONFIG_SPARSE_RCU_POINTER message: "incompatible types in comparison expression (different address spaces)" Replace the rcu_assign_pointer() calls in encrypted/trusted keys with rcu_assign_keypointer(). Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2012-01-18 10:41:29 +11:00
David Howells	efde8b6e16	KEYS: Add missing smp_rmb() primitives to the keyring search code Add missing smp_rmb() primitives to the keyring search code. When keyring payloads are appended to without replacement (thus using up spare slots in the key pointer array), an smp_wmb() is issued between the pointer assignment and the increment of the key count (nkeys). There should be corresponding read barriers between the read of nkeys and dereferences of keys[n] when n is dependent on the value of nkeys. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2012-01-18 10:41:27 +11:00
James Morris	8fcc995495	Merge branch 'next' into for-linus Conflicts: security/integrity/evm/evm_crypto.c Resolved upstream fix vs. next conflict manually. Signed-off-by: James Morris <jmorris@namei.org>	2012-01-09 12:16:48 +11:00
David Howells	7845bc3964	KEYS: Give key types their own lockdep class for key->sem Give keys their own lockdep class to differentiate them from each other in case a key of one type has to refer to a key of another type. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Mimi Zohar <zohar@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2011-11-17 09:35:32 +11:00
Mimi Zohar	9c69898783	encrypted-keys: module build fixes Encrypted keys are encrypted/decrypted using either a trusted or user-defined key type, which is referred to as the 'master' key. The master key may be of type trusted iff the trusted key is builtin or both the trusted key and encrypted keys are built as modules. This patch resolves the build dependency problem. - Use "masterkey-$(CONFIG_TRUSTED_KEYS)-$(CONFIG_ENCRYPTED_KEYS)" construct to encapsulate the above logic. (Suggested by Dimtry Kasatkin.) - Fixing the encrypted-keys Makefile, results in a module name change from encrypted.ko to encrypted-keys.ko. - Add module dependency for request_trusted_key() definition Signed-off-by: Mimi Zohar <zohar@us.ibm.com>	2011-11-16 14:23:14 -05:00
Mimi Zohar	f4a0d5abef	encrypted-keys: fix error return code Fix request_master_key() error return code. Signed-off-by: Mimi Zohar <zohar@us.ibm.com>	2011-11-16 14:23:13 -05:00
David Howells	9f35a33b8d	KEYS: Fix a NULL pointer deref in the user-defined key type Fix a NULL pointer deref in the user-defined key type whereby updating a negative key into a fully instantiated key will cause an oops to occur when the code attempts to free the non-existent old payload. This results in an oops that looks something like the following: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: [<ffffffff81085fa1>] __call_rcu+0x11/0x13e PGD 3391d067 PUD 3894a067 PMD 0 Oops: 0002 [#1] SMP CPU 1 Pid: 4354, comm: keyctl Not tainted 3.1.0-fsdevel+ #1140 /DG965RY RIP: 0010:[<ffffffff81085fa1>] [<ffffffff81085fa1>] __call_rcu+0x11/0x13e RSP: 0018:ffff88003d591df8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000006e RDX: ffffffff8161d0c0 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff88003d591e18 R08: 0000000000000000 R09: ffffffff8152fa6c R10: 0000000000000000 R11: 0000000000000300 R12: ffff88003b8f9538 R13: ffffffff8161d0c0 R14: ffff88003b8f9d50 R15: ffff88003c69f908 FS: 00007f97eb18c720(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 000000003d47a000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process keyctl (pid: 4354, threadinfo ffff88003d590000, task ffff88003c78a040) Stack: ffff88003e0ffde0 ffff88003b8f9538 0000000000000001 ffff88003b8f9d50 ffff88003d591e28 ffffffff810860f0 ffff88003d591e68 ffffffff8117bfea ffff88003d591e68 ffffffff00000000 ffff88003e0ffde1 ffff88003e0ffde0 Call Trace: [<ffffffff810860f0>] call_rcu_sched+0x10/0x12 [<ffffffff8117bfea>] user_update+0x8d/0xa2 [<ffffffff8117723a>] key_create_or_update+0x236/0x270 [<ffffffff811789b1>] sys_add_key+0x123/0x17e [<ffffffff813b84bb>] system_call_fastpath+0x16/0x1b Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Jeff Layton <jlayton@redhat.com> Acked-by: Neil Horman <nhorman@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: James Morris <jmorris@namei.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-11-15 22:32:38 -02:00
Andy Shevchenko	02473119bc	security: follow rename pack_hex_byte() to hex_byte_pack() There is no functional change. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Mimi Zohar <zohar@us.ibm.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-10-31 17:30:56 -07:00
Christopher Yeoh	fcf634098c	Cross Memory Attach The basic idea behind cross memory attach is to allow MPI programs doing intra-node communication to do a single copy of the message rather than a double copy of the message via shared memory. The following patch attempts to achieve this by allowing a destination process, given an address and size from a source process, to copy memory directly from the source process into its own address space via a system call. There is also a symmetrical ability to copy from the current process's address space into a destination process's address space. - Use of /proc/pid/mem has been considered, but there are issues with using it: - Does not allow for specifying iovecs for both src and dest, assuming preadv or pwritev was implemented either the area read from or written to would need to be contiguous. - Currently mem_read allows only processes who are currently ptrace'ing the target and are still able to ptrace the target to read from the target. This check could possibly be moved to the open call, but its not clear exactly what race this restriction is stopping (reason appears to have been lost) - Having to send the fd of /proc/self/mem via SCM_RIGHTS on unix domain socket is a bit ugly from a userspace point of view, especially when you may have hundreds if not (eventually) thousands of processes that all need to do this with each other - Doesn't allow for some future use of the interface we would like to consider adding in the future (see below) - Interestingly reading from /proc/pid/mem currently actually involves two copies! (But this could be fixed pretty easily) As mentioned previously use of vmsplice instead was considered, but has problems. Since you need the reader and writer working co-operatively if the pipe is not drained then you block. Which requires some wrapping to do non blocking on the send side or polling on the receive. In all to all communication it requires ordering otherwise you can deadlock. And in the example of many MPI tasks writing to one MPI task vmsplice serialises the copying. There are some cases of MPI collectives where even a single copy interface does not get us the performance gain we could. For example in an MPI_Reduce rather than copy the data from the source we would like to instead use it directly in a mathops (say the reduce is doing a sum) as this would save us doing a copy. We don't need to keep a copy of the data from the source. I haven't implemented this, but I think this interface could in the future do all this through the use of the flags - eg could specify the math operation and type and the kernel rather than just copying the data would apply the specified operation between the source and destination and store it in the destination. Although we don't have a "second user" of the interface (though I've had some nibbles from people who may be interested in using it for intra process messaging which is not MPI). This interface is something which hardware vendors are already doing for their custom drivers to implement fast local communication. And so in addition to this being useful for OpenMPI it would mean the driver maintainers don't have to fix things up when the mm changes. There was some discussion about how much faster a true zero copy would go. Here's a link back to the email with some testing I did on that: http://marc.info/?l=linux-mm&m=130105930902915&w=2 There is a basic man page for the proposed interface here: http://ozlabs.org/~cyeoh/cma/process_vm_readv.txt This has been implemented for x86 and powerpc, other architecture should mainly (I think) just need to add syscall numbers for the process_vm_readv and process_vm_writev. There are 32 bit compatibility versions for 64-bit kernels. For arch maintainers there are some simple tests to be able to quickly verify that the syscalls are working correctly here: http://ozlabs.org/~cyeoh/cma/cma-test-20110718.tgz Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: James Morris <jmorris@namei.org> Cc: <linux-man@vger.kernel.org> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-10-31 17:30:44 -07:00
Mimi Zohar	2b3ff6319e	encrypted-keys: check hex2bin result For each hex2bin call in encrypted keys, check that the ascii hex string is valid. On failure, return -EINVAL. Changelog v1: - hex2bin now returns an int Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>	2011-09-20 23:26:44 -04:00
Mimi Zohar	2684bf7f29	trusted-keys: check hex2bin result For each hex2bin call in trusted keys, check that the ascii hex string is valid. On failure, return -EINVAL. Changelog v1: - hex2bin now returns an int Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>	2011-09-20 23:26:05 -04:00
Stephen Rothwell	cc100551b4	encrypted-keys: IS_ERR need include/err.h Fixes this build error: security/keys/encrypted-keys/masterkey_trusted.c: In function 'request_trusted_key': security/keys/encrypted-keys/masterkey_trusted.c:35:2: error: implicit declaration of function 'IS_ERR' Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Mimi Zohar <zohar@us.ibm.com>	2011-09-15 17:37:24 -04:00
Mimi Zohar	982e617a31	encrypted-keys: remove trusted-keys dependency Encrypted keys are decrypted/encrypted using either a trusted-key or, for those systems without a TPM, a user-defined key. This patch removes the trusted-keys and TCG_TPM dependencies. Signed-off-by: Mimi Zohar <zohar@us.ibm.com>	2011-09-14 15:23:49 -04:00
Mimi Zohar	61cf45d019	encrypted-keys: create encrypted-keys directory Move all files associated with encrypted keys to keys/encrypted-keys. Signed-off-by: Mimi Zohar <zohar@us.ibm.com>	2011-09-14 15:22:26 -04:00
David Howells	0c061b5707	KEYS: Correctly destroy key payloads when their keytype is removed unregister_key_type() has code to mark a key as dead and make it unavailable in one loop and then destroy all those unavailable key payloads in the next loop. However, the loop to mark keys dead renders the key undetectable to the second loop by changing the key type pointer also. Fix this by the following means: (1) The key code has two garbage collectors: one deletes unreferenced keys and the other alters keyrings to delete links to old dead, revoked and expired keys. They can end up holding each other up as both want to scan the key serial tree under spinlock. Combine these into a single routine. (2) Move the dead key marking, dead link removal and dead key removal into the garbage collector as a three phase process running over the three cycles of the normal garbage collection procedure. This is tracked by the KEY_GC_REAPING_DEAD_1, _2 and _3 state flags. unregister_key_type() then just unlinks the key type from the list, wakes up the garbage collector and waits for the third phase to complete. (3) Downgrade the key types sem in unregister_key_type() once it has deleted the key type from the list so that it doesn't block the keyctl() syscall. (4) Dead keys that cannot be simply removed in the third phase have their payloads destroyed with the key's semaphore write-locked to prevent interference by the keyctl() syscall. There should be no in-kernel users of dead keys of that type by the point of unregistration, though keyctl() may be holding a reference. (5) Only perform timer recalculation in the GC if the timer actually expired. If it didn't, we'll get another cycle when it goes off - and if the key that actually triggered it has been removed, it's not a problem. (6) Only garbage collect link if the timer expired or if we're doing dead key clean up phase 2. (7) As only key_garbage_collector() is permitted to use rb_erase() on the key serial tree, it doesn't need to revalidate its cursor after dropping the spinlock as the node the cursor points to must still exist in the tree. (8) Drop the spinlock in the GC if there is contention on it or if we need to reschedule. After dealing with that, get the spinlock again and resume scanning. This has been tested in the following ways: (1) Run the keyutils testsuite against it. (2) Using the AF_RXRPC and RxKAD modules to test keytype removal: Load the rxrpc_s key type: # insmod /tmp/af-rxrpc.ko # insmod /tmp/rxkad.ko Create a key (http://people.redhat.com/~dhowells/rxrpc/listen.c): # /tmp/listen & [1] 8173 Find the key: # grep rxrpc_s /proc/keys 091086e1 I--Q-- 1 perm 39390000 0 0 rxrpc_s 52:2 Link it to a session keyring, preferably one with a higher serial number: # keyctl link 0x20e36251 @s Kill the process (the key should remain as it's linked to another place): # fg /tmp/listen ^C Remove the key type: rmmod rxkad rmmod af-rxrpc This can be made a more effective test by altering the following part of the patch: if (unlikely(gc_state & KEY_GC_REAPING_DEAD_2)) { /* Make sure everyone revalidates their keys if we marked a * bunch as being dead and make sure all keyring ex-payloads * are destroyed. */ kdebug("dead sync"); synchronize_rcu(); To call synchronize_rcu() in GC phase 1 instead. That causes that the keyring's old payload content to hang around longer until it's RCU destroyed - which usually happens after GC phase 3 is complete. This allows the destroy_dead_key branch to be tested. Reported-by: Benjamin Coddington <bcodding@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2011-08-23 09:57:37 +10:00
David Howells	d199798bdf	KEYS: The dead key link reaper should be non-reentrant The dead key link reaper should be non-reentrant as it relies on global state to keep track of where it's got to when it returns to the work queue manager to give it some air. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2011-08-23 09:57:36 +10:00
David Howells	b072e9bc2f	KEYS: Make the key reaper non-reentrant Make the key reaper non-reentrant by sticking it on the appropriate system work queue when we queue it. This will allow it to have global state and drop locks. It should probably be non-reentrant already as it may spend a long time holding the key serial spinlock, and so multiple entrants can spend long periods of time just sitting there spinning, waiting to get the lock. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2011-08-23 09:57:36 +10:00
David Howells	8bc16deabc	KEYS: Move the unreferenced key reaper to the keys garbage collector file Move the unreferenced key reaper function to the keys garbage collector file as that's a more appropriate place with the dead key link reaper. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2011-08-23 09:57:36 +10:00

1 2 3 4 5 ...

334 Commits