If the mount is aborted after userspace has been asked to mount,
userspace must be told to unmount.
Ordinarily orangefs_kill_sb does the unmount. However it cannot be
called if the superblock has not been set up. This is a very narrow
window.
The NULL fs_id is not unmounted.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Otherwise lockdep says:
[ 1337.483798] ================================================
[ 1337.483999] [ BUG: lock held when returning to user space! ]
[ 1337.484252] 4.11.0-rc6 #19 Not tainted
[ 1337.484423] ------------------------------------------------
[ 1337.484626] mount/14766 is leaving the kernel with locks still held!
[ 1337.484841] 1 lock held by mount/14766:
[ 1337.485017] #0: (&type->s_umount_key#33/1){+.+.+.}, at: [<ffffffff8124171f>] sget_userns+0x2af/0x520
Caught by xfstests generic/413 which tried to mount with the unsupported
mount option dax. Then xfstests generic/422 ran sync which deadlocks.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Acked-by: Mike Marshall <hubcap@omnibond.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Without this fix (and another to the userspace component itself
described later), the kernel will be unable to process any OrangeFS
requests after the userspace component is restarted (due to a crash or
at the administrator's behest).
The bug here is that inside orangefs_remount, the orangefs_request_mutex
is locked. When the userspace component restarts while the filesystem
is mounted, it sends a ORANGEFS_DEV_REMOUNT_ALL ioctl to the device,
which causes the kernel to send it a few requests aimed at synchronizing
the state between the two. While this is happening the
orangefs_request_mutex is locked to prevent any other requests going
through.
This is only half of the bugfix. The other half is in the userspace
component which outright ignores(!) requests made before it considers
the filesystem remounted, which is after the ioctl returns. Of course
the ioctl doesn't return until after the userspace component responds to
the request it ignores. The userspace component has been changed to
allow ORANGEFS_VFS_OP_FEATURES regardless of the mount status.
Mike Marshall says:
"I've tested this patch against the fixed userspace part. This patch is
real important, I hope it can make it into 4.11...
Here's what happens when the userspace daemon is restarted, without
the patch:
=============================================
[ INFO: possible recursive locking detected ]
[ 4.10.0-00007-ge98bdb3 #1 Not tainted ]
---------------------------------------------
pvfs2-client-co/29032 is trying to acquire lock:
(orangefs_request_mutex){+.+.+.}, at: service_operation+0x3c7/0x7b0 [orangefs]
but task is already holding lock:
(orangefs_request_mutex){+.+.+.}, at: dispatch_ioctl_command+0x1bf/0x330 [orangefs]
CPU: 0 PID: 29032 Comm: pvfs2-client-co Not tainted 4.10.0-00007-ge98bdb3 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014
Call Trace:
__lock_acquire+0x7eb/0x1290
lock_acquire+0xe8/0x1d0
mutex_lock_killable_nested+0x6f/0x6e0
service_operation+0x3c7/0x7b0 [orangefs]
orangefs_remount+0xea/0x150 [orangefs]
dispatch_ioctl_command+0x227/0x330 [orangefs]
orangefs_devreq_ioctl+0x29/0x70 [orangefs]
do_vfs_ioctl+0xa3/0x6e0
SyS_ioctl+0x79/0x90"
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Acked-by: Mike Marshall <hubcap@omnibond.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
freeing of inodes must be RCU-delayed on all filesystems
Cc: stable@vger.kernel.org
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pull in an OrangeFS branch containing miscellaneous improvements.
- clean up debugfs globals
- remove dead code in sysfs
- reorganize duplicated sysfs attribute structs
- consolidate sysfs show and store functions
- remove duplicated sysfs_ops structures
- describe organization of sysfs
- make devreq_mutex static
- g_orangefs_stats -> orangefs_stats for consistency
- rename most remaining global variables
OrangeFS 2.9.6 was released without support for the features op. Thus
OrangeFS 2.9.7 will be required to use it.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Only op_timeout_secs, slot_timeout_secs, and hash_table_size are left
because they are exposed as module parameters. All other global
variables have the orangefs_ prefix.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
This is a new userspace operation, which will be done if the client-core
version is greater than or equal to 2.9.6. This will provide a way to
implement optional features and to determine which features are
supported by the client-core. If the client-core version is older than
2.9.6, no optional features are supported and the op will not be done.
The intent is to allow protocol extensions without relying on the
client-core's current behavior of ignoring what it doesn't understand.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
* switch orangefs_remount() to taking ORANGEFS_SB(sb) instead of sb
* remove from the list _before_ orangefs_unmount() - request_mutex
in the latter will make sure that nothing observed in the loop in
ORANGEFS_DEV_REMOUNT_ALL handling will get freed until the end
of loop
* on removal, keep the forward pointer and zero the back one. That
way we can drop and regain the spinlock in the loop body (again,
ORANGEFS_DEV_REMOUNT_ALL one) and still be able to get to the
rest of the list.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
A couple of caches were no longer needed:
- iov_iter improvements to orangefs_devreq_write_iter eliminated
the need for the dev_req_cache.
- removal (months ago) of the old AIO code eliminated the need
for the kiocb_cache.
Also, deobfuscation of use of GFP_KERNEL when calling kmem_cache_(z)alloc
for remaining caches.
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
This export_operations structure is never modified, so declare it as const.
Most other structures of this type are already const.
Done with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
OrangeFS was formerly known as PVFS2 and retains the name in many places.
I leave the device /dev/pvfs2-req since this affects userspace.
I leave the filesystem type pvfs2 since this affects userspace. Further
the OrangeFS sysint library reads fstab for an entry of type pvfs2
independently of kernel mounts.
I leave extended attribute keys user.pvfs2 and system.pvfs2 as the
sysint library understands these.
I leave references to userspace binaries still named pvfs2.
I leave the filenames.
Signed-off-by: Yi Liu <yi9@clemson.edu>
[martin@omnibond.com: clairify above constraints and merge]
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
The only reason for that thing used to be the API of mount_nodev()
callback; since we are calling pvfs2_fill_sb() ourselves now,
we don't have to shove everything into a single structure.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>