linux-sg2042/fs/ntfs
Anton Altaparmakov ba6d2377c8 NTFS: Fix a nasty deadlock that appeared in recent kernels.
The situation: VFS inode X on a mounted ntfs volume is dirty.  For
      the same inode X, the ntfs_inode is dirty and thus so is the
      corresponding on-disk inode, i.e. mft record, which is in a dirty
      PAGE_CACHE_PAGE belonging to the table of inodes, i.e. $MFT, inode 0.
      What happens:
      Process 1: sys_sync()/umount()/whatever...  calls
      __sync_single_inode() for $MFT -> do_writepages() -> writepage for
      the dirty page containing the on-disk inode X, the page is now locked
      -> ntfs_write_mst_block() which clears PageUptodate() on the page to
      prevent anyone else getting hold of it whilst it does the write out.
      This is necessary as the on-disk inode needs "fixups" applied before
      the write to disk which are removed again after the write and
      PageUptodate is then set again.  It then analyses the page looking
      for dirty on-disk inodes and when it finds one it calls
      ntfs_may_write_mft_record() to see if it is safe to write this
      on-disk inode.  This then calls ilookup5() to check if the
      corresponding VFS inode is in the icache.  This in turn calls ifind()
      which waits on the inode lock via wait_on_inode() after it has looked
      the inode up under the global inode_lock.
      Process 2: pdflush results in a call to __sync_single_inode for the
      same VFS inode X on the ntfs volume.  This locks the inode (I_LOCK)
      then calls write_inode -> ntfs_write_inode() -> map_mft_record() ->
      read_cache_page() for the page (in page cache of table of inodes
      $MFT, inode 0) containing the on-disk inode.  This page has
      PageUptodate() clear because of Process 1 (see above) so
      read_cache_page() blocks when it tries to take the page lock for the
      page so it can call ntfs_readpage().
      Thus Process 1 is holding the page lock on the page containing the
      on-disk inode X and it is waiting on the inode X to be unlocked in
      ifind() so it can write the page out and then unlock the page.
      And Process 2 is holding the inode lock on inode X and is waiting for
      the page to be unlocked so it can call ntfs_readpage() or discover
      that Process 1 set PageUptodate() again and use the page.
      Thus we have a deadlock due to ifind() waiting on the inode lock.
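
The wait cycle described above can be checked with a small userspace
model.  This is only a sketch and not kernel code: the struct names,
has_wait_cycle() and ntfs_scenario_deadlocks() are hypothetical names
made up for illustration.  Each process waits for at most one resource,
each resource has at most one holder, and a deadlock is a chain of
"waits for" edges that returns to the process it started from.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the scenario; none of this is kernel API. */
enum { PROC_NONE = -1, PROC_1 = 0, PROC_2 = 1, NPROC = 2 };

struct resource {
	const char *name;
	int holder;			/* index of the holding process, or PROC_NONE */
};

struct proc {
	const char *name;
	const struct resource *waits_for;	/* NULL if the process is not blocked */
};

/*
 * Follow "process waits for resource, resource is held by process" edges
 * starting at @start.  Arriving back at @start means there is a wait cycle.
 */
static bool has_wait_cycle(const struct proc *procs, int start)
{
	int cur = start;

	assert(start >= 0 && start < NPROC);
	for (int steps = 0; steps < NPROC; steps++) {
		const struct resource *r = procs[cur].waits_for;

		if (r == NULL || r->holder == PROC_NONE)
			return false;	/* chain ends: someone can make progress */
		cur = r->holder;
		if (cur == start)
			return true;	/* chain closed: deadlock */
	}
	return false;
}

/*
 * Process 1 holds the page lock and, if the icache lookup waits, blocks on
 * the lock of inode X.  Process 2 holds the lock of inode X and blocks on
 * the page lock.
 */
bool ntfs_scenario_deadlocks(bool lookup_waits_on_inode)
{
	struct resource page_lock = { "page lock on the $MFT page", PROC_1 };
	struct resource inode_lock = { "I_LOCK on inode X", PROC_2 };
	struct proc procs[NPROC] = {
		{ "Process 1", lookup_waits_on_inode ? &inode_lock : NULL },
		{ "Process 2", &page_lock },
	};

	return has_wait_cycle(procs, PROC_1);
}
```

In this model, ntfs_scenario_deadlocks(true) reports the cycle, while
ntfs_scenario_deadlocks(false), where the lookup does not wait on the
inode, does not: Process 1 can finish the write and unlock the page.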
      The solution: Use the newly introduced
      ilookup5_nowait() which does not wait on the inode's lock and hence
      avoids the deadlock.  This is safe as we do not care about the VFS
      inode and only use the fact that it is in the VFS inode cache and the
      fact that the vfs and ntfs inodes are one struct in memory to find
      the ntfs inode in memory if present.  Also, the ntfs inode has its
      own locking so it does not matter if the vfs inode is locked.
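
Why a non-waiting lookup is enough can be sketched in userspace as
follows.  This is a simplified model under stated assumptions, not the
driver code: every identifier ending in _model is hypothetical, the
cache is a plain array rather than the kernel's inode hash, and the
"locked" flag merely stands in for I_LOCK.  The lookup never looks at
the lock state, and the containing structure is recovered from the
embedded VFS inode by pointer arithmetic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the VFS inode and the combined ntfs/VFS inode. */
struct vfs_inode_model {
	unsigned long ino;
	bool locked;			/* stands in for I_LOCK */
};

struct big_ntfs_inode_model {
	long ntfs_state;		/* stands in for the ntfs inode fields */
	struct vfs_inode_model vfs_inode;	/* embedded: one struct in memory */
};

#define NCACHE_MODEL 4
static struct big_ntfs_inode_model *icache_model[NCACHE_MODEL];

/* Put an inode into the first free cache slot; false if the cache is full. */
bool cache_insert_model(struct big_ntfs_inode_model *bni)
{
	for (size_t i = 0; i < NCACHE_MODEL; i++) {
		if (icache_model[i] == NULL) {
			icache_model[i] = bni;
			return true;
		}
	}
	return false;
}

/* Find an inode by number without ever waiting on its lock. */
struct vfs_inode_model *lookup_nowait_model(unsigned long ino)
{
	for (size_t i = 0; i < NCACHE_MODEL; i++) {
		if (icache_model[i] != NULL &&
		    icache_model[i]->vfs_inode.ino == ino)
			return &icache_model[i]->vfs_inode;
	}
	return NULL;
}

/* Step back from the embedded VFS inode to the structure that contains it. */
struct big_ntfs_inode_model *ntfs_i_model(struct vfs_inode_model *vi)
{
	assert(vi != NULL);
	return (struct big_ntfs_inode_model *)
		((char *)vi - offsetof(struct big_ntfs_inode_model, vfs_inode));
}
```

A caller would look up the inode number, and if the result is not NULL,
pass it to ntfs_i_model() and then rely on the ntfs inode's own locking,
regardless of whether the "locked" flag of the VFS inode is set.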

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-06-26 22:12:02 +01:00
ChangeLog NTFS: Fix a nasty deadlock that appeared in recent kernels. 2005-06-26 22:12:02 +01:00
Makefile NTFS: Prepare for 2.1.23 release: Update documentation and bump version. 2005-06-25 21:07:27 +01:00
aops.c NTFS: Fix a bug in address space operations error recovery code paths where 2005-06-25 16:15:36 +01:00
aops.h Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
attrib.c NTFS: Prepare for 2.1.23 release: Update documentation and bump version. 2005-06-25 21:07:27 +01:00
attrib.h NTFS: Fix compilation when configured read-only. 2005-05-05 11:39:30 +01:00
bitmap.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
bitmap.h Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
collate.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
collate.h Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
compress.c NTFS: - In fs/ntfs/compress.c, use i_size_read() at the start and then use the 2005-05-05 10:30:29 +01:00
debug.c NTFS: Fix printk format warnings on ia64. (Randy Dunlap) 2005-05-05 11:11:47 +01:00
debug.h Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
dir.c NTFS: Fix several occurences of a bug where we would perform 'var & ~const' 2005-06-25 16:51:58 +01:00
dir.h Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
endian.h Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
file.c NTFS: Use i_size_read() in fs/ntfs/file.c::ntfs_file_open(). 2005-05-04 17:02:25 +01:00
index.c NTFS: Use C99 style structure initialization after memory allocation where 2005-05-27 16:42:56 +01:00
index.h Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
inode.c NTFS: Remove spurious void pointer casts from fs/ntfs/. 2005-05-27 16:00:53 +01:00
inode.h NTFS: Fix several occurences of a bug where we would perform 'var & ~const' 2005-06-25 16:51:58 +01:00
layout.h NTFS: Fix a bug in address space operations error recovery code paths where 2005-06-25 16:15:36 +01:00
lcnalloc.c NTFS: Fix several occurences of a bug where we would perform 'var & ~const' 2005-06-25 16:51:58 +01:00
lcnalloc.h Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
logfile.c NTFS: Fix several occurences of a bug where we would perform 'var & ~const' 2005-06-25 16:51:58 +01:00
logfile.h Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
malloc.h Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
mft.c NTFS: Fix a nasty deadlock that appeared in recent kernels. 2005-06-26 22:12:02 +01:00
mft.h Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
mst.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
namei.c NTFS: Remove checks for NULL before calling kfree() since kfree() does the 2005-05-05 11:42:27 +01:00
ntfs.h NTFS: Minor cleanup: Define and use NTFS_MAX_CLUSTER_SIZE constant instead 2005-05-05 11:48:00 +01:00
quota.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
quota.h Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
runlist.c NTFS: Add an extra parameter @last_vcn to ntfs_get_size_for_mapping_pairs() 2005-06-25 17:15:36 +01:00
runlist.h NTFS: Add an extra parameter @last_vcn to ntfs_get_size_for_mapping_pairs() 2005-06-25 17:15:36 +01:00
super.c NTFS: Detect the case when Windows has been suspended to disk on the volume 2005-06-25 16:31:27 +01:00
sysctl.c NTFS: - Add disable_sparse mount option together with a per volume sparse 2005-05-05 10:53:01 +01:00
sysctl.h Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
time.h NTFS: Change time to u64 in time.h::ntfs2utc() as it otherwise generates a 2005-05-05 11:01:13 +01:00
types.h NTFS: Stamp the transaction log ($UsnJrnl), aka user space journal, if it 2005-06-25 15:28:56 +01:00
unistr.c NTFS: Remove spurious void pointer casts from fs/ntfs/. 2005-05-27 16:00:53 +01:00
upcase.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
usnjrnl.c NTFS: Stamp the transaction log ($UsnJrnl), aka user space journal, if it 2005-06-25 15:28:56 +01:00
usnjrnl.h NTFS: Stamp the transaction log ($UsnJrnl), aka user space journal, if it 2005-06-25 15:28:56 +01:00
volume.h NTFS: Stamp the transaction log ($UsnJrnl), aka user space journal, if it 2005-06-25 15:28:56 +01:00