linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Sukadev Bhattiprolu	6eaba35b43	autofs4: allow autofs to work outside the initial PID namespace Enable autofs4 to work in a "container". oz_pgrp is converted from pid_t to struct pid and this is stored at mount time based on the "pgrp=" option or if the option is missing then the current pgrp. The "pgrp=" option is interpreted in the PID namespace of the current process. This option is flawed in that it doesn't carry the namespace information, so it should be deprecated. AFAICS the autofs daemon always sends the current pgrp, which is the default anyway. The oz_pgrp is also set from the AUTOFS_DEV_IOCTL_SETPIPEFD_CMD ioctl. This ioctl sets oz_pgrp to the current pgrp. It is not allowed to change the pid namespace. oz_pgrp is used mainly to determine whether the process traversing the autofs mount tree is the autofs daemon itself or not. This function now compares the pid pointers instead of the pid_t values. One other use of oz_pgrp is in autofs4_show_options. There is shows the virtual pid number (i.e. the one that is valid inside the PID namespace of the calling process) For debugging printk convert oz_pgrp to the value in the initial pid namespace. Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Cc: Eric Biederman <ebiederm@xmission.com> Acked-by: Ian Kent <raven@themaw.net> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:58 -08:00
Al Viro	baa40671d3	autofs4: make freeing sbi rcu-delayed makes ->d_managed() safety in RCU mode independent from vfsmount_lock Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-10-24 23:43:27 -04:00
Al Viro	72c2d53192	file->f_op is never NULL... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-10-24 23:34:54 -04:00
Al Viro	496ad9aa8e	new helper: file_inode(file) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-02-22 23:31:31 -05:00
Eric W. Biederman	45634cd8cb	userns: Support autofs4 interacing with multiple user namespaces Use kuid_t and kgid_t in struct autofs_info and struct autofs_wait_queue. When creating directories and symlinks default the uid and gid of the mount requester to the global root uid and gid. autofs4_wait will update these fields when a mount is requested. When generating autofsv5 packets report the uid and gid of the mount requestor in user namespace of the process that opened the pipe, reporting unmapped uids and gids as overflowuid and overflowgid. In autofs_dev_ioctl_requester return the uid and gid of the last mount requester converted into the calling processes user namespace. When the uid or gid don't map return overflowuid and overflowgid as appropriate, allowing failure to find a mount requester to be distinguished from failure to map a mount requester. The uid and gid mount options specifying the user and group of the root autofs inode are converted into kuid and kgid as they are parsed defaulting to the current uid and current gid of the process that mounts autofs. Mounting of autofs for the present remains confined to processes in the initial user namespace. Cc: Ian Kent <raven@themaw.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-11-14 22:05:32 -08:00
Linus Torvalds	64f371bc31	autofs: make the autofsv5 packet file descriptor use a packetized pipe The autofs packet size has had a very unfortunate size problem on x86: because the alignment of 'u64' differs in 32-bit and 64-bit modes, and because the packet data was not 8-byte aligned, the size of the autofsv5 packet structure differed between 32-bit and 64-bit modes despite looking otherwise identical (300 vs 304 bytes respectively). We first fixed that up by making the 64-bit compat mode know about this problem in commit `a32744d4ab` ("autofs: work around unhappy compat problem on x86-64"), and that made a 32-bit 'systemd' work happily on a 64-bit kernel because everything then worked the same way as on a 32-bit kernel. But it turned out that 'automount' had actually known and worked around this problem in user space, so fixing the kernel to do the proper 32-bit compatibility handling actually broke 32-bit automount on a 64-bit kernel, because it knew that the packet sizes were wrong and expected those incorrect sizes. As a result, we ended up reverting that compatibility mode fix, and thus breaking systemd again, in commit `fcbf94b9de`. With both automount and systemd doing a single read() system call, and verifying that they get exactly the size they expect but using different sizes, it seemed that fixing one of them inevitably seemed to break the other. At one point, a patch I seriously considered applying from Michael Tokarev did a "strcmp()" to see if it was automount that was doing the operation. Ugly, ugly. However, a prettier solution exists now thanks to the packetized pipe mode. By marking the communication pipe as being packetized (by simply setting the O_DIRECT flag), we can always just write the bigger packet size, and if user-space does a smaller read, it will just get that partial end result and the extra alignment padding will simply be thrown away. This makes both automount and systemd happy, since they now get the size they asked for, and the kernel side of autofs simply no longer needs to care - it could pad out the packet arbitrarily. Of course, if there is some other user of autofs (please, please, please tell me it ain't so - and we haven't heard of any) that tries to read the packets with multiple writes, that other user will now be broken - the whole point of the packetized mode is that one system call gets exactly one packet, and you cannot read a packet in pieces. Tested-by: Michael Tokarev <mjt@tls.msk.ru> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: David Miller <davem@davemloft.net> Cc: Ian Kent <raven@themaw.net> Cc: Thomas Meyer <thomas@m3y3r.de> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-04-29 13:30:08 -07:00
Linus Torvalds	fcbf94b9de	Revert "autofs: work around unhappy compat problem on x86-64" This reverts commit `a32744d4ab`. While that commit was technically the right thing to do, and made the x86-64 compat mode work identically to native 32-bit mode (and thus fixing the problem with a 32-bit systemd install on a 64-bit kernel), it turns out that the automount binaries had workarounds for this compat problem. Now, the workarounds are disgusting: doing an "uname()" to find out the architecture of the kernel, and then comparing it for the 64-bit cases and fixing up the size of the read() in automount for those. And they were confused: it's not actually a generic 64-bit issue at all, it's very much tied to just x86-64, which has different alignment for an 'u64' in 64-bit mode than in 32-bit mode. But the end result is that fixing the compat layer actually breaks the case of a 32-bit automount on a x86-64 kernel. There are various approaches to fix this (including just doing a "strcmp()" on current->comm and comparing it to "automount"), but I think that I will do the one that teaches pipes about a special "packet mode", which will allow user space to not have to care too deeply about the padding at the end of the autofs packet. That change will make the compat workaround unnecessary, so let's revert it first, and get automount working again in compat mode. The packetized pipes will then fix autofs for systemd. Reported-and-requested-by: Michael Tokarev <mjt@tls.msk.ru> Cc: Ian Kent <raven@themaw.net> Cc: stable@kernel.org # for 3.3 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-04-28 08:29:56 -07:00
Ian Kent	a32744d4ab	autofs: work around unhappy compat problem on x86-64 When the autofs protocol version 5 packet type was added in commit `5c0a32fc2c` ("autofs4: add new packet type for v5 communications"), it obvously tried quite hard to be word-size agnostic, and uses explicitly sized fields that are all correctly aligned. However, with the final "char name[NAME_MAX+1]" array at the end, the actual size of the structure ends up being not very well defined: because the struct isn't marked 'packed', doing a "sizeof()" on it will align the size of the struct up to the biggest alignment of the members it has. And despite all the members being the same, the alignment of them is different: a "__u64" has 4-byte alignment on x86-32, but native 8-byte alignment on x86-64. And while 'NAME_MAX+1' ends up being a nice round number (256), the name[] array starts out a 4-byte aligned. End result: the "packed" size of the structure is 300 bytes: 4-byte, but not 8-byte aligned. As a result, despite all the fields being in the same place on all architectures, sizeof() will round up that size to 304 bytes on architectures that have 8-byte alignment for u64. Note that this is not a problem for 32-bit compat mode on POWER, since there __u64 is 8-byte aligned even in 32-bit mode. But on x86, 32-bit and 64-bit alignment is different for 64-bit entities, and as a result the structure that has exactly the same layout has different sizes. So on x86-64, but no other architecture, we will just subtract 4 from the size of the structure when running in a compat task. That way we will write the properly sized packet that user mode expects. Not pretty. Sadly, this very subtle, and unnecessary, size difference has been encoded in user space that wants to read packets of exactly the right size, and will refuse to touch anything else. Reported-and-tested-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-02-25 12:10:27 -08:00
Al Viro	d668dc5663	autofs4: deal with autofs4_write/autofs4_write races Just serialize the actual writing of packets into pipe on a new mutex, independent from everything else in the locking hierarchy. As soon as something has started feeding a piece of packet into the pipe to daemon, we want everything else about to try the same to wait until we are done. Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-11 00:20:12 -05:00
Al Viro	030a8ba48f	autofs4: propagate umode_t Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:55:10 -05:00
Linus Torvalds	c3ad996246	autofs4: clean up uaotfs use of debug/info/warning printouts Use 'pr_debug()' for DPRINTK, which will do the proper type checking on the arguments (without generating code) even when DEBUG isn't #defined. Also, use the standard __VA_ARGS__ for the macros, and stop the pointless abuse of 'do { xyz } while (0)' when the macro is already a perfectly well-formed single statement. Reported-by: David Howells <dhowells@redhat.com> Suggested-by: Joe Perches <joe@perches.com> Cc: Ian Kent <raven@themaw.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-08-08 11:35:17 -07:00
Ian Kent	e7854723d0	autofs4 - remove autofs4_lock The autofs4_lock introduced by the rcu-walk changes has unnecessarily broad scope. The locking is better handled by the per-autofs super block lookup_lock. Signed-off-by: Ian Kent <raven@themaw.net> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-03-24 14:54:35 -04:00
Al Viro	b89b12b462	autofs4: clean ->d_release() and autofs4_free_ino() up The latter is called only when both ino and dentry are about to be freed, so cleaning ->d_fsdata and ->dentry is pointless. Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-01-18 01:21:29 -05:00
Al Viro	26e6c91067	autofs4: split autofs4_init_ino() split init_ino into new_ino and clean_ino; the former is what used to be init_ino(NULL, sbi), the latter is for cases where we passed non-NULL ino. Lose unused arguments. Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-01-18 01:21:28 -05:00
Al Viro	726a5e0688	autofs4: autofs4_get_inode() doesn't need autofs_info * argument anymore Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-01-18 01:21:28 -05:00
Al Viro	0bf71d4d00	autofs4: kill ->size in autofs_info It's used only to pass the length of symlink body to autofs4_get_inode() in autofs4_dir_symlink(). We can bloody well set inode->i_size in autofs4_dir_symlink() directly and be done with that. Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-01-18 01:21:28 -05:00
Al Viro	09f12c03fa	autofs4: pass mode to autofs4_get_inode() explicitly In all cases we'd set inf->mode to know value just before passing it to autofs4_get_inode(). That kills the need to store it in autofs_info and pass it to autofs_init_ino() Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-01-18 01:21:27 -05:00
Al Viro	292c5ee802	autofs4: keep symlink body in inode->i_private gets rid of all ->free()/->u.symlink machinery in autofs; we simply keep symlink bodies in inode->i_private and free them in ->evict_inode(). Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-01-18 01:21:27 -05:00
David Howells	b650c858c2	autofs4: Merge the remaining dentry ops tables Merge the remaining autofs4 dentry ops tables. It doesn't matter if d_automount and d_manage are present on something that's not mountable or holdable as these ops are only used if the appropriate flags are set in dentry->d_flags. [AV] switch to ->s_d_op, since now _everything_ on autofs4 is using the same dentry_operations. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-01-15 20:07:49 -05:00
Ian Kent	71e469db24	autofs4: Clean up dentry operations There are now two distinct dentry operations uses. One for dentrys that trigger mounts and one for dentrys that do not. Rationalize the use of these dentry operations and rename them to reflect their function. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-01-15 20:07:41 -05:00
Ian Kent	e61da20a50	autofs4: Clean up inode operations Since the use of ->follow_link() has been eliminated there is no need to separate the indirect and direct inode operations. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-01-15 20:07:40 -05:00
Ian Kent	8c13a676d5	autofs4: Remove unused code Remove code that is not used due to the use of ->d_automount() and ->d_manage(). Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-01-15 20:07:39 -05:00
Ian Kent	b5b801779d	autofs4: Add d_manage() dentry operation This patch required a previous patch to add the ->d_automount() dentry operation. Add a function to use the newly defined ->d_manage() dentry operation for blocking during mount and expire. Whether the VFS calls the dentry operations d_automount() and d_manage() is controled by the DMANAGED_AUTOMOUNT and DMANAGED_TRANSIT flags. autofs uses the d_automount() operation to callback to user space to request mount operations and the d_manage() operation to block walks into mounts that are under construction or destruction. In order to prevent these functions from being called unnecessarily the DMANAGED_* flags are cleared for cases which would cause this. In the common case the DMANAGED_AUTOMOUNT and DMANAGED_TRANSIT flags are both set for dentrys waiting to be mounted. The DMANAGED_TRANSIT flag is cleared upon successful mount request completion and set during expire runs, both during the dentry expire check, and if selected for expire, is left set until a subsequent successful mount request completes. The exception to this is the so-called rootless multi-mount which has no actual mount at its base. In this case the DMANAGED_AUTOMOUNT flag is cleared upon successful mount request completion as well and set again after a successful expire. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-01-15 20:07:38 -05:00
Ian Kent	10584211e4	autofs4: Add d_automount() dentry operation Add a function to use the newly defined ->d_automount() dentry operation for triggering mounts instead of doing the user space callback in ->lookup() and ->d_revalidate(). Note, to be useful the subsequent patch to add the ->d_manage() dentry operation is also needed so the discussion of functionality is deferred to that patch. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-01-15 20:07:37 -05:00
David Howells	cc53ce53c8	Add a dentry op to allow processes to be held during pathwalk transit Add a dentry op (d_manage) to permit a filesystem to hold a process and make it sleep when it tries to transit away from one of that filesystem's directories during a pathwalk. The operation is keyed off a new dentry flag (DCACHE_MANAGE_TRANSIT). The filesystem is allowed to be selective about which processes it holds and which it permits to continue on or prohibits from transiting from each flagged directory. This will allow autofs to hold up client processes whilst letting its userspace daemon through to maintain the directory or the stuff behind it or mounted upon it. The ->d_manage() dentry operation: int (d_manage)(struct path path, bool mounting_here); takes a pointer to the directory about to be transited away from and a flag indicating whether the transit is undertaken by do_add_mount() or do_move_mount() skipping through a pile of filesystems mounted on a mountpoint. It should return 0 if successful and to let the process continue on its way; -EISDIR to prohibit the caller from skipping to overmounted filesystems or automounting, and to use this directory; or some other error code to return to the user. ->d_manage() is called with namespace_sem writelocked if mounting_here is true and no other locks held, so it may sleep. However, if mounting_here is true, it may not initiate or wait for a mount or unmount upon the parameter directory, even if the act is actually performed by userspace. Within fs/namei.c, follow_managed() is extended to check with d_manage() first on each managed directory, before transiting away from it or attempting to automount upon it. follow_down() is renamed follow_down_one() and should only be used where the filesystem deliberately intends to avoid management steps (e.g. autofs). A new follow_down() is added that incorporates the loop done by all other callers of follow_down() (do_add/move_mount(), autofs and NFSD; whilst AFS, NFS and CIFS do use it, their use is removed by converting them to use d_automount()). The new follow_down() calls d_manage() as appropriate. It also takes an extra parameter to indicate if it is being called from mount code (with namespace_sem writelocked) which it passes to d_manage(). follow_down() ignores automount points so that it can be used to mount on them. __follow_mount_rcu() is made to abort rcu-walk mode if it hits a directory with DCACHE_MANAGE_TRANSIT set on the basis that we're probably going to have to sleep. It would be possible to enter d_manage() in rcu-walk mode too, and have that determine whether to abort or not itself. That would allow the autofs daemon to continue on in rcu-walk mode. Note that DCACHE_MANAGE_TRANSIT on a directory should be cleared when it isn't required as every tranist from that directory will cause d_manage() to be invoked. It can always be set again when necessary. ========================== WHAT THIS MEANS FOR AUTOFS ========================== Autofs currently uses the lookup() inode op and the d_revalidate() dentry op to trigger the automounting of indirect mounts, and both of these can be called with i_mutex held. autofs knows that the i_mutex will be held by the caller in lookup(), and so can drop it before invoking the daemon - but this isn't so for d_revalidate(), since the lock is only held on _some_ of the code paths that call it. This means that autofs can't risk dropping i_mutex from its d_revalidate() function before it calls the daemon. The bug could manifest itself as, for example, a process that's trying to validate an automount dentry that gets made to wait because that dentry is expired and needs cleaning up: mkdir S ffffffff8014e05a 0 32580 24956 Call Trace: [<ffffffff885371fd>] :autofs4:autofs4_wait+0x674/0x897 [<ffffffff80127f7d>] avc_has_perm+0x46/0x58 [<ffffffff8009fdcf>] autoremove_wake_function+0x0/0x2e [<ffffffff88537be6>] :autofs4:autofs4_expire_wait+0x41/0x6b [<ffffffff88535cfc>] :autofs4:autofs4_revalidate+0x91/0x149 [<ffffffff80036d96>] __lookup_hash+0xa0/0x12f [<ffffffff80057a2f>] lookup_create+0x46/0x80 [<ffffffff800e6e31>] sys_mkdirat+0x56/0xe4 versus the automount daemon which wants to remove that dentry, but can't because the normal process is holding the i_mutex lock: automount D ffffffff8014e05a 0 32581 1 32561 Call Trace: [<ffffffff80063c3f>] __mutex_lock_slowpath+0x60/0x9b [<ffffffff8000ccf1>] do_path_lookup+0x2ca/0x2f1 [<ffffffff80063c89>] .text.lock.mutex+0xf/0x14 [<ffffffff800e6d55>] do_rmdir+0x77/0xde [<ffffffff8005d229>] tracesys+0x71/0xe0 [<ffffffff8005d28d>] tracesys+0xd5/0xe0 which means that the system is deadlocked. This patch allows autofs to hold up normal processes whilst the daemon goes ahead and does things to the dentry tree behind the automouter point without risking a deadlock as almost no locks are held in d_manage() and none in d_automount(). Signed-off-by: David Howells <dhowells@redhat.com> Was-Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-01-15 20:07:31 -05:00
Nick Piggin	b5c84bf6f6	fs: dcache remove dcache_lock dcache_lock no longer protects anything. remove it. Signed-off-by: Nick Piggin <npiggin@kernel.dk>	2011-01-07 17:50:23 +11:00
Nick Piggin	2fd6b7f507	fs: dcache scale subdirs Protect d_subdirs and d_child with d_lock, except in filesystems that aren't using dcache_lock for these anyway (eg. using i_mutex). Note: if we change the locking rule in future so that ->d_child protection is provided only with ->d_parent->d_lock, it may allow us to reduce some locking. But it would be an exception to an otherwise regular locking scheme, so we'd have to see some good results. Probably not worthwhile. Signed-off-by: Nick Piggin <npiggin@kernel.dk>	2011-01-07 17:50:21 +11:00
Nick Piggin	da5029563a	fs: dcache scale d_unhashed Protect d_unhashed(dentry) condition with d_lock. This means keeping DCACHE_UNHASHED bit in synch with hash manipulations. Signed-off-by: Nick Piggin <npiggin@kernel.dk>	2011-01-07 17:50:21 +11:00
Al Viro	4b1ae27a96	Revert "autofs4: always use lookup for lookup" This reverts commit `213614d583`. Alas, ->d_revalidate() can't rely on ->lookup() finishing what it's started; if d_alloc() in do_lookup() fails, we are not going to call ->lookup() at all. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-03-03 12:58:31 -05:00
Ian Kent	213614d583	autofs4: always use lookup for lookup We need to be able to cope with the directory mutex being held during ->d_revalidate() in some cases, but not all cases, and not necessarily by us. Because we need to release the mutex when we call back to the daemon to do perform a mount we must be sure that it is us who holds the mutex so we must redirect mount requests to ->lookup() if the mutex is held. Signed-off-by: Ian Kent <raven@themaw.net> Cc: Sage Weil <sage@newdream.net> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Andreas Dilger <adilger@sun.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Yehuda Saheh <yehuda@newdream.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-16 07:19:58 -08:00
Ian Kent	aa952eb26d	autofs4: use autofs_info for pending flag Eliminate the use of the d_lock spin lock by using the autofs super block info spin lock. This reduces the number of spin locks we use by one and makes the code for the following patch (to redirect ->d_revalidate() to ->lookup()) a little simpler. Signed-off-by: Ian Kent <raven@themaw.net> Cc: Sage Weil <sage@newdream.net> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Andreas Dilger <adilger@sun.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Yehuda Saheh <yehuda@newdream.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-16 07:19:58 -08:00
Ian Kent	c4cd70b3e3	autofs4: use helper functions for expiring list Define some simple helper functions for adding and deleting entries on the expiring dentry list. Signed-off-by: Ian Kent <raven@themaw.net> Cc: Sage Weil <sage@newdream.net> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Andreas Dilger <adilger@sun.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Yehuda Saheh <yehuda@newdream.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-16 07:19:57 -08:00
Ian Kent	4f8427d190	autofs4: use helper functions for active list handling Define some simple helper functions for adding and deleting entries on the active (and unhashed) dentry list. Signed-off-by: Ian Kent <raven@themaw.net> Cc: Sage Weil <sage@newdream.net> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Andreas Dilger <adilger@sun.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Yehuda Saheh <yehuda@newdream.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-16 07:19:57 -08:00
Al Viro	9393bd07cf	switch follow_down() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:01 -04:00
Ian Kent	56fcef7511	autofs4: cleanup expire code duplication A significant portion of the autofs_dev_ioctl_expire() and autofs4_expire_multi() functions is duplicated code. This patch cleans that up. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-04-01 08:59:23 -07:00
Ian Kent	a92daf6ba1	autofs4: make autofs type usage explicit - the type assigned at mount when no type is given is changed from 0 to AUTOFS_TYPE_INDIRECT. This was done because 0 and AUTOFS_TYPE_INDIRECT were being treated implicitly as the same type. - previously, an offset mount had it's type set to AUTOFS_TYPE_DIRECT\|AUTOFS_TYPE_OFFSET but the mount control re-implementation needs to be able distinguish all three types. So this was changed to make the type setting explicit. - a type AUTOFS_TYPE_ANY was added for use by the re-implementation when checking if a given path is a mountpoint. It's not really a type as we use this to ask if a given path is a mountpoint in the autofs_dev_ioctl_ismountpoint() function. - functions to set and test the autofs mount types have been added to improve readability and make the type usage explicit. - the mount type is used from user space for the mount control re-implementtion so, for consistency, all the definitions have been moved to the user space include file include/linux/auto_fs4.h. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:23 -08:00
Ian Kent	8d7b48e0bc	autofs4: add miscellaneous device for ioctls Add a miscellaneous device to the autofs4 module for routing ioctls. This provides the ability to obtain an ioctl file handle for an autofs mount point that is possibly covered by another mount. The actual problem with autofs is that it can't reconnect to existing mounts. Immediately one things of just adding the ability to remount autofs file systems would solve it, but alas, that can't work. This is because autofs direct mounts and the implementation of "on demand mount and expire" of nested mount trees have the file system mounted on top of the mount trigger dentry. To resolve this a miscellaneous device node for routing ioctl commands to these mount points has been implemented in the autofs4 kernel module and a library added to autofs. This provides the ability to open a file descriptor for these over mounted autofs mount points. Please refer to Documentation/filesystems/autofs4-mount-control.txt for a discussion of the problem, implementation alternatives considered and a description of the interface. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: build fix] Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:39 -07:00
Ian Kent	c0f54d3e54	autofs4: track uid and gid of last mount requester Track the uid and gid of the last process to request a mount for on an autofs dentry. [akpm@linux-foundation.org: fix tpyo in comment] Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:39 -07:00
Ian Kent	bb979d7fc3	autofs4: cleanup autofs mount type usage Usage of the AUTOFS_TYPE_* defines is a little confusing and appears inconsistent. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:39 -07:00
Ian Kent	06a3598552	autofs4: reorganize expire pending wait function calls This patch re-orgnirzes the checking for and waiting on active expires and elininates redundant checks. Signed-off-by: Ian Kent <raven@themaw.net> Cc: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:32 -07:00
Ian Kent	6e60a9ab5f	autofs4: fix direct mount pending expire race For direct and offset type mounts that are covered by another mount we cannot check the AUTOFS_INF_EXPIRING flag during a path walk which leads to lookups walking into an expiring mount while it is being expired. For example, for the direct multi-mount map entry with a couple of offsets: /race/mm1 / <server1>:/<path1> /om1 <server2>:/<path2> /om2 <server1>:/<path3> an autofs trigger mount is mounted on /race/mm1 and when accessed it is over mounted and trigger mounts made for /race/mm1/om1 and /race/mm1/om2. So it isn't possible for path walks to see the expiring flag at all and they happily walk into the file system while it is expiring. When expiring these mounts follow_down() must stop at the autofs mount and all processes must block in the ->follow_link() method (except the daemon) until the expire is complete. This is done by decrementing the d_mounted field of the autofs trigger mount root dentry until the expire is completed. In ->follow_link() all processes wait on the expire and the mount following is completed for the daemon until the expire is complete. Signed-off-by: Ian Kent <raven@themaw.net> Cc: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:32 -07:00
Ian Kent	97e7449a7a	autofs4: fix indirect mount pending expire race The selection of a dentry for expiration and the setting of the AUTOFS_INF_EXPIRING flag isn't done atomically which can lead to lookups walking into an expiring mount. What happens is that an expire is initiated by the daemon and a dentry is selected for expire but, since there is no lock held between the selection and setting of the expiring flag, a process may find the flag clear and continue walking into the mount tree at the same time the daemon attempts the expire it. Signed-off-by: Ian Kent <raven@themaw.net> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:32 -07:00
Ian Kent	296f7bf78b	autofs4: fix waitq memory leak If an autofs mount becomes catatonic before autofs4_wait_release() is called the wait queue counter will not be decremented down to zero and the entry will never be freed. There are also races decrementing the wait counter in the wait release function. To deal with this the counter needs to be updated while holding the wait queue mutex and waiters need to be woken up unconditionally when the wait is removed from the queue to ensure we eventually free the wait. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:32 -07:00
Jeff Moyer	70b52a0a50	autofs4: use struct qstr in waitq.c The autofs_wait_queue already contains all of the fields of the struct qstr, so change it into a qstr. This patch, from Jeff Moyer, has been modified a liitle by myself. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:32 -07:00
Ian Kent	2576737873	autofs4: use look aside list for lookups A while ago a patch to resolve a deadlock during directory creation was merged. This delayed the hashing of lookup dentrys until the ->mkdir() (or ->symlink()) operation completed to ensure we always went through ->lookup() instead of also having processes go through ->revalidate() so our VFS locking remained consistent. Now we are seeing a couple of side affects of that change in situations with heavy mount activity. Two cases have been identified: 1) When a mount request is triggered, due to the delayed hashing, the directory created by user space for the mount point doesn't have the DCACHE_AUTOFS_PENDING flag set. In the case of an autofs multi-mount where a tree of mount point directories are created this can lead to the path walk continuing rather than the dentry being sent to the wait queue to wait for request completion. This is because, if the pending flag isn't set, the criteria for deciding this is a mount in progress fails to hold, namely that the dentry is not a mount point and has no subdirectories. 2) A mount request dentry is initially created negative and unhashed. It remains this way until the ->mkdir() callback completes. Since it is unhashed a fresh dentry is used when the user space mount request creates the mount point directory. This leaves the original dentry negative and unhashed. But revalidate has no way to tell the VFS that the dentry has changed, other than to force another ->lookup() by returning false, which is at best wastefull and at worst not possible. This results in an -ENOENT return from the original path walk when in fact the mount succeeded. To resolve this we need to ensure that the same dentry is used in all calls to ->lookup() during the course of a mount request. This patch achieves that by adding the initial dentry to a look aside list and removes it at ->mkdir() or ->symlink() completion (or when the dentry is released), since these are the only create operations autofs4 supports. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:31 -07:00
Ian Kent	5f6f4f28b6	autofs4: don't make expiring dentry negative Correct the error of making a positive dentry negative after it has been instantiated. The code that makes this error attempts to re-use the dentry from a concurrent expire and mount to resolve a race and the dentry used for the lookup must be negative for mounts to trigger in the required cases. The fact is that the dentry doesn't need to be re-used because all that is needed is to preserve the flag that indicates an expire is still incomplete at the time of the mount request. This change uses the the dentry to check the flag and wait for the expire to complete then discards it instead of attempting to re-use it. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:31 -07:00
Harvey Harrison	8e24eea728	fs: replace remaining __FUNCTION__ occurrences __FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:54 -07:00
Pavel Emelianov	a47afb0f9d	pid namespaces: round up the API The set of functions process_session, task_session, process_group and task_pgrp is confusing, as the names can be mixed with each other when looking at the code for a long time. The proposals are to * equip the functions that return the integer with _nr suffix to represent that fact, * and to make all functions work with task (not process) by making the common prefix of the same name. For monotony the routines signal_session() and set_signal_session() are replaced with task_session_nr() and set_task_session(), especially since they are only used with the explicit task->signal dereference. Signed-off-by: Pavel Emelianov <xemul@openvz.org> Acked-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:37 -07:00
Ian Kent	f50b6f8691	[PATCH] autofs4: fix another race between mount and expire Jeff Moyer has identified a race between mount and expire. What happens is that during an expire the situation can arise that a directory is removed and another lookup is done before the expire issues a completion status to the kernel module. In this case, since the the lookup gets a new dentry, it doesn't know that there is an expire in progress and when it posts its mount request, matches the existing expire request and waits for its completion. ENOENT is then returned to user space from lookup (as the dentry passed in is now unhashed) without having performed the mount request. The solution used here is to keep track of dentrys in this unhashed state and reuse them, if possible, in order to preserve the flags. Additionally, this infrastructure will provide the framework for the reintroduction of caching of mount fails removed earlier in development. Signed-off-by: Ian Kent <raven@themaw.net> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-20 17:10:15 -08:00
Arjan van de Ven	754661f143	[PATCH] mark struct inode_operations const 1 Many struct inode_operations in the kernel can be "const". Marking them const moves these to the .rodata section, which avoids false sharing with potential dirty data. In addition it'll catch accidental writes at compile time to these shared resources. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:46 -08:00

1 2

71 Commits