linux-sg2042/fs
Jiufei Xue 814ce69432 ocfs2: fix a tiny race that leads file system read-only
when o2hb detect a node down, it first set the dead node to recovery map
and create ocfs2rec which will replay journal for dead node.  o2hb
thread then call dlm_do_local_recovery_cleanup() to delete the lock for
dead node.  After the lock of dead node is gone, locks for other nodes
can be granted and may modify the meta data without replaying journal of
the dead node.  The detail is described as follows.

     N1                         N2                   N3(master)
modify the extent tree of
inode, and commit
dirty metadata to journal,
then goes down.
                                                 o2hb thread detects
                                                 N1 goes down, set
                                                 recovery map and
                                                 delete the lock of N1.

                                                 dlm_thread flush ast
                                                 for the lock of N2.
                        do not detect the death
                        of N1, so recovery map is
                        empty.

                        read inode from disk
                        without replaying
                        the journal of N1 and
                        modify the extent tree
                        of the inode that N1
                        had modified.
                                                 ocfs2rec recover the
                                                 journal of N1.
                                                 The modification of N2
                                                 is lost.

The modification of N1 and N2 are not serial, and it will lead to
read-only file system.  We can set recovery_waiting flag to the lock
resource after delete the lock for dead node to prevent other node from
getting the lock before dlm recovery.  After dlm recovery, the recovery
map on N2 is not empty, ocfs2_inode_lock_full_nested() will wait for ocfs2
recovery.

Signed-off-by: Jiufei Xue <xuejiufei@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
..
9p wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
adfs fs/adfs/adfs.h: tidy up comments 2016-01-20 17:09:18 -08:00
affs affs_do_readpage_ofs(): just use kmap_atomic() around memcpy() 2016-02-20 00:15:51 -05:00
afs wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
autofs4 switch ->get_link() to delayed_call, kill ->put_link() 2015-12-30 13:01:03 -05:00
befs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
bfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
btrfs Merge branch 'for-linus-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2016-03-04 17:31:32 -08:00
cachefiles wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
ceph ceph: initial CEPH_FEATURE_FS_FILE_LAYOUT_V2 support 2016-03-04 21:00:37 +01:00
cifs CIFS: Fix duplicate line introduced by clone_file_range patch 2016-03-01 09:38:00 -06:00
coda Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-01-23 12:24:56 -08:00
configfs wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
cramfs don't put symlink bodies in pagecache into highmem 2015-12-08 22:41:36 -05:00
debugfs wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
devpts pty: make sure super_block is still valid in final /dev/tty close 2016-02-06 23:45:46 -08:00
dlm [regression] fix braino in fs/dlm/user.c 2016-01-21 17:45:15 -05:00
ecryptfs wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
efivarfs efi: Make efivarfs entries immutable by default 2016-02-10 16:25:52 +00:00
efs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
exofs wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
exportfs wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
ext2 Merge branch 'akpm' (patches from Andrew) 2016-02-27 12:46:16 -08:00
ext4 This fixes a regression which crept in v4.5-rc5. 2016-03-09 19:33:05 -08:00
f2fs wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
fat wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
freevxfs don't put symlink bodies in pagecache into highmem 2015-12-08 22:41:36 -05:00
fscache FS-Cache: Handle a write to the page immediately beyond the EOF marker 2015-11-11 02:11:02 -05:00
fuse wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
gfs2 wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
hfs wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
hfsplus wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
hostfs wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
hpfs hpfs: don't truncate the file when delete fails 2016-02-27 19:15:51 -05:00
hugetlbfs wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
isofs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
jbd2 fs: use block_device name vsprintf helper 2016-01-06 13:03:18 -05:00
jffs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-03-11 10:13:49 -08:00
jfs wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
kernfs wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
lockd lockd: constify nlmsvc_binding structure 2016-01-07 10:10:50 -05:00
logfs wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
minix kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
ncpfs ncpfs: fix a braino in OOM handling in ncp_fill_cache() 2016-03-07 22:25:16 -05:00
nfs NFSv4.x/pnfs: Fix a race between layoutget and bulk recalls 2016-02-22 17:46:34 -05:00
nfs_common
nfsd wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
nilfs2 wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
nls
notify fsnotify: turn fsnotify reaper thread into a workqueue job 2016-02-18 16:23:24 -08:00
ntfs wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
ocfs2 ocfs2: fix a tiny race that leads file system read-only 2016-03-15 16:55:16 -07:00
omfs
openpromfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
overlayfs ovl: copy new uid/gid into overlayfs runtime inode 2016-03-03 17:17:46 +01:00
proc proc: revert /proc/<pid>/maps [stack:TID] annotation 2016-02-03 08:28:43 -08:00
pstore wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
qnx4 kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
qnx6 kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
quota wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
ramfs don't put symlink bodies in pagecache into highmem 2015-12-08 22:41:36 -05:00
reiserfs wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
romfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
squashfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
sysfs platform/chrome: Branch for v4.4 2015-11-13 21:53:18 -08:00
sysv kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
tracefs wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
ubifs wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
udf Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-01-23 12:24:56 -08:00
ufs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
xfs xfs: fixes for 4.5-rc7 2016-03-11 10:21:32 -08:00
Kconfig dax: re-enable dax pmd mappings 2016-01-15 17:56:32 -08:00
Kconfig.binfmt
Makefile ext4: promote ext4 over ext2 in the default probe order 2015-10-15 10:33:21 -04:00
aio.c mm: move ->mremap() from file_operations to vm_operations_struct 2015-09-04 16:54:41 -07:00
anon_inodes.c
attr.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
bad_inode.c fs/bad_inode.c: is_bad_inode can be boolean 2015-12-06 21:17:14 -05:00
binfmt_aout.c
binfmt_elf.c mm: ASLR: use get_random_long() 2016-02-27 10:28:52 -08:00
binfmt_elf_fdpic.c libnvdimm for 4.4: 2015-11-10 12:07:22 -08:00
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
binfmt_script.c
block_dev.c dax: move writeback calls into the filesystems 2016-02-27 10:28:52 -08:00
buffer.c fs: use block_device name vsprintf helper 2016-01-06 13:03:18 -05:00
char_dev.c
compat.c saner calling conventions for copy_mount_options() 2016-01-04 10:28:32 -05:00
compat_binfmt_elf.c
compat_ioctl.c Bluetooth: Add missing COMPATIBLE_IOCTL for UART line discipline 2016-01-27 10:48:26 -05:00
coredump.c fs/coredump: prevent "" / "." / ".." core path components 2016-01-20 17:09:18 -08:00
dax.c dax: check return value of dax_radix_entry() 2016-03-09 15:43:42 -08:00
dcache.c use ->d_seq to get coherency between ->d_inode and ->d_flags 2016-02-29 12:16:43 -05:00
dcookies.c
direct-io.c block: fix use-after-free in dio_bio_complete 2016-01-30 22:02:10 -07:00
drop_caches.c
eventfd.c Documentation: filesystem: Fix typo in fs/eventfd.c 2015-12-08 14:52:03 +01:00
eventpoll.c epoll: restrict EPOLLEXCLUSIVE to POLLIN and POLLOUT 2016-02-05 18:10:40 -08:00
exec.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
fcntl.c fcntl: allow to set O_DIRECT flag on pipe 2016-01-09 02:55:37 -05:00
fhandle.c
file.c kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
file_table.c
filesystems.c find_filesystem(): simplify comparison 2016-01-19 12:02:23 -05:00
fs-writeback.c writeback: flush inode cgroup wb switches instead of pinning super_block 2016-03-03 14:42:50 -07:00
fs_pin.c
fs_struct.c
inode.c writeback: initialize inode members that track writeback history 2016-02-16 14:57:21 -07:00
internal.h Merge branch 'for-linus' into work.misc 2016-01-08 21:20:11 -05:00
ioctl.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
libfs.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
locks.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
mbcache.c
mount.h
mpage.c mm, fs: introduce mapping_gfp_constraint() 2015-11-06 17:50:42 -08:00
namei.c do_last(): ELOOP failure exit should be done after leaving RCU mode 2016-02-27 19:37:37 -05:00
namespace.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
no-block.c
nsfs.c fs/seq_file: convert int seq_vprint/seq_printf/etc... returns to void 2015-09-11 15:21:34 -07:00
open.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
pipe.c pipe: limit the per-user amount of pages allocated in pipes 2016-01-19 19:25:21 -05:00
pnode.c fs/pnode.c: treat zero mnt_group_id-s as unequal 2016-02-20 00:15:52 -05:00
pnode.h
posix_acl.c xattr handlers: Simplify list operation 2015-12-13 19:46:12 -05:00
proc_namespace.c vfs: show_vfsstat: remove redundant initialization and check of error code 2015-12-06 21:17:16 -05:00
read_write.c fs: return -EOPNOTSUPP if clone is not supported 2016-02-27 19:15:51 -05:00
readdir.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
select.c poll: plug an unused argument to do_poll 2016-01-06 08:26:52 -05:00
seq_file.c fs, seqfile: always allow oom killer 2015-11-06 17:50:42 -08:00
signalfd.c
splice.c fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE 2016-01-09 02:55:35 -05:00
stack.c
stat.c fs/stat.c: drop the last new_valid_dev check 2016-01-16 11:17:23 -08:00
statfs.c
super.c writeback: flush inode cgroup wb switches instead of pinning super_block 2016-03-03 14:42:50 -07:00
sync.c fs/sync.c: make sync_file_range(2) use WB_SYNC_NONE writeback 2015-11-06 17:50:42 -08:00
timerfd.c timerfd: Handle relative timers with CONFIG_TIME_LOW_RES proper 2016-01-17 11:13:55 +01:00
userfaultfd.c userfaultfd: don't block on the last VM updates at exit time 2016-03-02 09:03:18 -08:00
utimes.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
xattr.c xattr handlers: plug a lock leak in simple_xattr_list 2016-02-20 00:15:51 -05:00