linux-sg2042/fs/xfs
Dave Chinner 239567033c xfs: inode log reservations are too small
We've been seeing occasional problems with log space leaks and
transaction underruns such as this for some time:

 XFS (dm-0): xlog_write: reservation summary:
   trans type  = FSYNC_TS (36)
   unit res    = 2740 bytes
   current res = -4 bytes
   total reg   = 0 bytes (o/flow = 0 bytes)
   ophdrs      = 0 (ophdr space = 0 bytes)
   ophdr + reg = 0 bytes
   num regions = 0

Turns out that xfstests generic/311 is reliably reproducing this
problem with the test it runs at sequence 16 of it execution. It is
a 100% reliable reproducer with the mkfs configuration of "-b
size=1024 -m crc=1" on a 10GB scratch device.

The problem? Inode forks in btree format are logged in memory
format, not disk format (i.e. bmbt format, not bmdr format). That
means there is a btree block header being logged, when such a
structure is never written to the inode fork in bmdr format. The
bmdr header in the inode is only 4 bytes, while the bmbt header is
24 bytes for v4 filesystems and 72 bytes for v5 filesystems.

We currently reserve the inode size plus the rounded up overhead of
a logging a buffer, which is 128 bytes. That means the reservation
for a 512 byte inode is 640 bytes. What we can actually log is:

	inode core, data and attr fork = 512 bytes
	inode log format + log op header = 56 + 12 = 68 bytes
	data fork bmbt hdr = 24/72 bytes
	attr fork bmbt hdr = 24/72 bytes

So, for a v2 inodes we can log at least 628 bytes, but if we split that
inode over the end of the log across log buffers, we need to also
another log op header, which takes us to 640 bytes. If there's
another reservation taken out of this that I haven't taken into
account (perhaps multiple iclog splits?) or I haven't corectly
calculated the bmbt format space used (entirely possible), then
we will overun it.

For v3 inodes the maximum is actually 724 bytes, and even a
single maximally sized btree format fork can blow it (652 bytes).
And that's exactly what is happening with the FSYNC_TS transaction
in the above output - it's consumed 644 bytes of space after the CIL
context took the space reserved for it (2100 bytes).

This problem has always been present in the XFS code - the btree
format inode forks have always been logged in this manner. Hence
there has always been the possibility of an overrun with such a
transaction. The CRC code has just exposed it frequently enough to
be able to debug and understand the root cause....

So, let's fix all the inode log space reservations.

[ I'm so glad we spent the effort to clean up the transaction
  reservation code. This is an easy fix now. ]

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-08-30 13:59:30 -05:00
..
Kconfig xfs: introduce CONFIG_XFS_WARN 2013-05-07 18:45:36 -05:00
Makefile xfs: Add xfs_log_rlimit.c 2013-08-12 17:49:38 -05:00
kmem.c xfs: switch to proper __bitwise type for KM_... flags 2012-05-29 23:28:32 -04:00
kmem.h xfs: switch to proper __bitwise type for KM_... flags 2012-05-29 23:28:32 -04:00
mrlock.h xfs: introduce CONFIG_XFS_WARN 2013-05-07 18:45:36 -05:00
time.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
uuid.c xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
uuid.h xfs: add CRC infrastructure 2012-11-19 20:11:24 -06:00
xfs.h xfs: introduce CONFIG_XFS_WARN 2013-05-07 18:45:36 -05:00
xfs_acl.c xfs: convert kuid_t to/from uid_t in ACLs 2013-08-15 14:18:31 -05:00
xfs_acl.h xfs: increase number of ACL entries for V5 superblocks 2013-06-06 10:52:15 -05:00
xfs_ag.h xfs: make struct xfs_perag kernel only 2013-08-12 17:44:36 -05:00
xfs_alloc.c xfs: kill __KERNEL__ check for debug code in allocation code 2013-08-12 16:57:51 -05:00
xfs_alloc.h xfs: convert buffer verifiers to an ops structure. 2012-11-15 21:35:12 -06:00
xfs_alloc_btree.c xfs: introduce CONFIG_XFS_WARN 2013-05-07 18:45:36 -05:00
xfs_alloc_btree.h xfs: add support for large btree blocks 2013-04-21 14:53:46 -05:00
xfs_aops.c xfs: rename bio_add_buffer() to xfs_bio_add_buffer() 2013-08-20 15:35:00 -05:00
xfs_aops.h Prefix IO_XX flags with XFS_IO_XX to avoid namespace colision. 2012-07-22 11:00:55 -05:00
xfs_attr.c xfs: avoid double-free in xfs_attr_node_addname 2013-08-13 15:48:01 -05:00
xfs_attr.h xfs: kill xfs_vnodeops.[ch] 2013-08-12 16:53:39 -05:00
xfs_attr_inactive.c xfs: refactor xfs_trans_reserve() interface 2013-08-12 17:47:34 -05:00
xfs_attr_leaf.c xfs: minor cleanups 2013-08-12 16:46:08 -05:00
xfs_attr_leaf.h xfs: sync minor header differences needed by userspace. 2013-08-12 16:35:41 -05:00
xfs_attr_list.c xfs: split out attribute listing code into separate file 2013-08-12 16:41:29 -05:00
xfs_attr_remote.c xfs: fix issues that cause userspace warnings 2013-08-12 16:52:54 -05:00
xfs_attr_remote.h xfs: rework remote attr CRCs 2013-05-30 17:26:31 -05:00
xfs_attr_sf.h
xfs_bit.c
xfs_bit.h
xfs_bmap.c xfs: fix the comment of xfs_bmap_last_before() 2013-08-20 15:36:28 -05:00
xfs_bmap.h xfs: remove __KERNEL__ from debug code 2013-08-12 16:58:37 -05:00
xfs_bmap_btree.c xfs: minor cleanups 2013-08-12 16:46:08 -05:00
xfs_bmap_btree.h xfs: check on-disk (not incore) btree root size in dfrag.c 2013-06-20 13:26:09 -05:00
xfs_bmap_util.c xfs: fix the comment of xfs_bmap_punch_delalloc_range() 2013-08-20 15:40:16 -05:00
xfs_bmap_util.h xfs: consolidate extent swap code 2013-08-12 16:56:06 -05:00
xfs_btree.c xfs: btree block LSN escaping to disk uninitialised 2013-08-30 13:43:34 -05:00
xfs_btree.h xfs: sync minor header differences needed by userspace. 2013-08-12 16:35:41 -05:00
xfs_buf.c xfs: fix the comment of xfs_setsize_buftarg_early() 2013-08-20 15:40:39 -05:00
xfs_buf.h xfs: use b_maps[] for discontiguous buffers 2013-01-16 16:07:11 -06:00
xfs_buf_item.c xfs: use reference counts to free clean buffer items 2013-08-15 16:42:29 -05:00
xfs_buf_item.h xfs: split out buf log item format definitions 2013-08-12 16:06:37 -05:00
xfs_cksum.h xfs: add CRC infrastructure 2012-11-19 20:11:24 -06:00
xfs_da_btree.c xfs: fix issues that cause userspace warnings 2013-08-12 16:52:54 -05:00
xfs_da_btree.h XFS: Assertion failed: first <= last && last < BBTOB(bp->b_length), file: fs/xfs/xfs_trans_buf.c, line: 568 2013-08-30 09:48:59 -05:00
xfs_dinode.h xfs: di_flushiter considered harmful 2013-07-24 12:15:23 -05:00
xfs_dir2.c XFS: Assertion failed: first <= last && last < BBTOB(bp->b_length), file: fs/xfs/xfs_trans_buf.c, line: 568 2013-08-30 09:48:59 -05:00
xfs_dir2.h xfs: Add read-only support for dirent filetype field 2013-08-22 08:40:24 -05:00
xfs_dir2_block.c xfs: Add write support for dirent filetype field 2013-08-22 08:44:49 -05:00
xfs_dir2_data.c xfs: Add write support for dirent filetype field 2013-08-22 08:44:49 -05:00
xfs_dir2_format.h xfs: Add read-only support for dirent filetype field 2013-08-22 08:40:24 -05:00
xfs_dir2_leaf.c xfs: Add write support for dirent filetype field 2013-08-22 08:44:49 -05:00
xfs_dir2_node.c xfs: Add write support for dirent filetype field 2013-08-22 08:44:49 -05:00
xfs_dir2_priv.h xfs: Add read-only support for dirent filetype field 2013-08-22 08:40:24 -05:00
xfs_dir2_readdir.c xfs: Add read-only support for dirent filetype field 2013-08-22 08:40:24 -05:00
xfs_dir2_sf.c xfs: xfs_dir3_sfe_put_ino can be static 2013-08-26 11:27:40 -05:00
xfs_discard.c xfs: split out transaction reservation code 2013-08-12 16:36:16 -05:00
xfs_discard.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_dquot.c xfs: refactor xfs_trans_reserve() interface 2013-08-12 17:47:34 -05:00
xfs_dquot.h xfs: Add pquota fields where gquota is used. 2013-07-11 10:35:32 -05:00
xfs_dquot_item.c xfs: return log item size in IOP_SIZE 2013-08-13 16:10:21 -05:00
xfs_dquot_item.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_error.c xfs: consolidate xfs_utils.c 2013-08-12 16:55:17 -05:00
xfs_error.h xfs: kill support/debug.[ch] 2011-03-07 10:09:35 +11:00
xfs_export.c xfs: kill xfs_vnodeops.[ch] 2013-08-12 16:53:39 -05:00
xfs_export.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_extent_busy.c xfs: fix the comment of xfs_extent_busy_update_extent() 2013-08-20 15:41:08 -05:00
xfs_extent_busy.h xfs: make xfs_extent_busy_trim not static 2012-05-14 16:21:04 -05:00
xfs_extfree_item.c xfs: return log item size in IOP_SIZE 2013-08-13 16:10:21 -05:00
xfs_extfree_item.h xfs: split out EFI/EFD log item format definition 2013-08-12 16:07:13 -05:00
xfs_file.c xfs: kill xfs_vnodeops.[ch] 2013-08-12 16:53:39 -05:00
xfs_filestream.c xfs: consolidate xfs_utils.c 2013-08-12 16:55:17 -05:00
xfs_filestream.h xfs: xfs_filestreams.h doesn't need __KERNEL__ 2013-08-12 17:00:11 -05:00
xfs_format.h xfs: split out the remote symlink handling 2013-08-12 16:43:38 -05:00
xfs_fs.h xfs: create internal eofblocks structure with kuid_t types 2013-08-15 14:24:10 -05:00
xfs_fsops.c xfs: refactor xfs_trans_reserve() interface 2013-08-12 17:47:34 -05:00
xfs_fsops.h xfs: ensure log covering transactions are synchronous 2011-01-11 20:28:17 -06:00
xfs_globals.c xfs: add background scanning to clear eofblocks inodes 2012-11-08 15:34:59 -06:00
xfs_ialloc.c xfs: check correct status variable for xfs_inobt_get_rec() call 2013-08-30 13:48:35 -05:00
xfs_ialloc.h xfs: Inode create item recovery 2013-06-27 14:26:21 -05:00
xfs_ialloc_btree.c xfs: introduce CONFIG_XFS_WARN 2013-05-07 18:45:36 -05:00
xfs_ialloc_btree.h xfs: add support for large btree blocks 2013-04-21 14:53:46 -05:00
xfs_icache.c xfs: create internal eofblocks structure with kuid_t types 2013-08-15 14:24:10 -05:00
xfs_icache.h xfs: create internal eofblocks structure with kuid_t types 2013-08-15 14:24:10 -05:00
xfs_icreate_item.c xfs: return log item size in IOP_SIZE 2013-08-13 16:10:21 -05:00
xfs_icreate_item.h xfs: separate icreate log format definitions from xfs_icreate_item.h 2013-08-12 16:10:35 -05:00
xfs_inode.c xfs: fix the comment of xfs_ifree_cluster() 2013-08-20 15:44:36 -05:00
xfs_inode.h xfs: consolidate xfs_utils.c 2013-08-12 16:55:17 -05:00
xfs_inode_buf.c xfs: inode buffers may not be valid during recovery readahead 2013-08-30 13:45:49 -05:00
xfs_inode_buf.h xfs: inode buffers may not be valid during recovery readahead 2013-08-30 13:45:49 -05:00
xfs_inode_fork.c xfs: check for underflow in xfs_iformat_fork() 2013-08-26 11:28:08 -05:00
xfs_inode_fork.h xfs: move inode fork definitions to a new header file 2013-08-12 16:37:32 -05:00
xfs_inode_item.c xfs: return log item size in IOP_SIZE 2013-08-13 16:10:21 -05:00
xfs_inode_item.h xfs: split out inode log item format definition 2013-08-12 16:05:19 -05:00
xfs_inum.h xfs: move xfsagino_t to xfs_types.h 2012-05-14 16:20:54 -05:00
xfs_ioctl.c xfs: add capability check to free eofblocks ioctl 2013-08-15 14:25:01 -05:00
xfs_ioctl.h xfs: consolidate extent swap code 2013-08-12 16:56:06 -05:00
xfs_ioctl32.c xfs: consolidate extent swap code 2013-08-12 16:56:06 -05:00
xfs_ioctl32.h xfs: remove subdirectories 2011-08-12 16:21:35 -05:00
xfs_iomap.c xfs: refactor xfs_trans_reserve() interface 2013-08-12 17:47:34 -05:00
xfs_iomap.h xfs: kill xfs_iomap 2010-12-16 16:05:51 -06:00
xfs_iops.c xfs: Add read-only support for dirent filetype field 2013-08-22 08:40:24 -05:00
xfs_iops.h xfs: kill xfs_vnodeops.[ch] 2013-08-12 16:53:39 -05:00
xfs_itable.c xfs: clean up unused codes at xfs_bulkstat() 2013-07-09 15:36:21 -05:00
xfs_itable.h
xfs_linux.h xfs: remove two unused macro definitions in xfs_linux.h 2013-08-20 15:30:23 -05:00
xfs_log.c xfs: fix the comment of xfs_log_unmount_write() 2013-08-20 15:47:18 -05:00
xfs_log.h xfs: Reduce allocations during CIL insertion 2013-08-13 16:12:30 -05:00
xfs_log_cil.c xfs: split the CIL lock 2013-08-13 16:21:21 -05:00
xfs_log_format.h xfs: Add xfs_log_rlimit.c 2013-08-12 17:49:38 -05:00
xfs_log_priv.h xfs: split the CIL lock 2013-08-13 16:21:21 -05:00
xfs_log_recover.c xfs: inode buffers may not be valid during recovery readahead 2013-08-30 13:45:49 -05:00
xfs_log_recover.h
xfs_log_rlimit.c xfs: call roundup_64() to calculate the min_logblks 2013-08-13 14:19:11 -05:00
xfs_message.c xfs: introduce CONFIG_XFS_WARN 2013-05-07 18:45:36 -05:00
xfs_message.h xfs: introduce CONFIG_XFS_WARN 2013-05-07 18:45:36 -05:00
xfs_mount.c xfs: Register hotcpu notifier after initialization 2013-08-22 14:05:27 -05:00
xfs_mount.h xfs: Introduce a new structure to hold transaction reservation items 2013-08-12 17:45:49 -05:00
xfs_mru_cache.c xfs: convert to alloc_workqueue() 2011-02-01 11:42:43 +01:00
xfs_mru_cache.h
xfs_qm.c xfs: convert kuid_t to/from uid_t for internal structures 2013-08-15 14:22:40 -05:00
xfs_qm.h xfs: Add support for the Q_XGETQSTATV 2013-08-20 17:00:38 -05:00
xfs_qm_bhv.c xfs: separate dquot on disk format definitions out of xfs_quota.h 2013-08-12 16:09:52 -05:00
xfs_qm_syscalls.c xfs: Add support for the Q_XGETQSTATV 2013-08-20 17:00:38 -05:00
xfs_quota.h xfs: convert kuid_t to/from uid_t for internal structures 2013-08-15 14:22:40 -05:00
xfs_quota_defs.h xfs: introduce xfs_quota_defs.h 2013-08-12 16:20:18 -05:00
xfs_quota_priv.h xfs: use per-filesystem radix trees for dquot lookup 2012-03-14 11:09:06 -05:00
xfs_quotaops.c xfs: Add support for the Q_XGETQSTATV 2013-08-20 17:00:38 -05:00
xfs_rtalloc.c xfs: refactor xfs_trans_reserve() interface 2013-08-12 17:47:34 -05:00
xfs_rtalloc.h xfs: introduce xfs_rtalloc_defs.h 2013-08-12 16:13:10 -05:00
xfs_sb.c xfs: fix the comment of xfs_sb_quiet_read_verify() 2013-08-20 15:51:49 -05:00
xfs_sb.h xfs: add xfs sb v4 support for dirent filetype field 2013-08-22 08:49:59 -05:00
xfs_stats.c xfs: use common code for quota statistics 2012-03-14 11:09:06 -05:00
xfs_stats.h xfs: use common code for quota statistics 2012-03-14 11:09:06 -05:00
xfs_super.c xfs: consolidate xfs_utils.c 2013-08-12 16:55:17 -05:00
xfs_super.h xfs: xfs_sync_data is redundant. 2012-10-17 12:01:25 -05:00
xfs_symlink.c xfs: convert kuid_t to/from uid_t for internal structures 2013-08-15 14:22:40 -05:00
xfs_symlink.h xfs: split out the remote symlink handling 2013-08-12 16:43:38 -05:00
xfs_symlink_remote.c xfs: make struct xfs_perag kernel only 2013-08-12 17:44:36 -05:00
xfs_sysctl.c xfs: Convert use of typedef ctl_table to struct ctl_table 2013-06-17 17:42:25 -05:00
xfs_sysctl.h xfs: add background scanning to clear eofblocks inodes 2012-11-08 15:34:59 -06:00
xfs_trace.c xfs: separate dquot on disk format definitions out of xfs_quota.h 2013-08-12 16:09:52 -05:00
xfs_trace.h xfs: update for 3.11-rc1 2013-07-09 12:29:12 -07:00
xfs_trans.c xfs: refactor xfs_trans_reserve() interface 2013-08-12 17:47:34 -05:00
xfs_trans.h xfs: avoid CIL allocation during insert 2013-08-13 16:19:03 -05:00
xfs_trans_ail.c xfs: Simplify xfs_ail_min() with list_first_entry_or_null() 2013-08-23 12:57:43 -05:00
xfs_trans_buf.c xfs: Introduce an ordered buffer item 2013-06-27 13:33:11 -05:00
xfs_trans_dquot.c xfs: separate dquot on disk format definitions out of xfs_quota.h 2013-08-12 16:09:52 -05:00
xfs_trans_extfree.c xfs: move xfsagino_t to xfs_types.h 2012-05-14 16:20:54 -05:00
xfs_trans_inode.c xfs: implement inode change count 2013-06-28 13:00:05 -05:00
xfs_trans_priv.h xfs: Simplify xfs_ail_min() with list_first_entry_or_null() 2013-08-23 12:57:43 -05:00
xfs_trans_resv.c xfs: inode log reservations are too small 2013-08-30 13:59:30 -05:00
xfs_trans_resv.h xfs: Get rid of all XFS_XXX_LOG_RES() macro 2013-08-12 17:48:08 -05:00
xfs_trans_space.h
xfs_types.h xfs: Add read-only support for dirent filetype field 2013-08-22 08:40:24 -05:00
xfs_vnode.h xfs: remove remaining scraps of struct xfs_iomap 2012-03-15 13:40:16 -05:00
xfs_xattr.c xfs: kill xfs_vnodeops.[ch] 2013-08-12 16:53:39 -05:00