From 228d62dd3f74734b9801c789b5addc57fdfc208f Mon Sep 17 00:00:00 2001 From: Dave Chinner Date: Fri, 6 May 2011 02:54:04 +0000 Subject: [PATCH 1/5] xfs: ensure reclaim cursor is reset correctly at end of AG On a 32 bit highmem PowerPC machine, the XFS inode cache was growing without bound and exhausting low memory causing the OOM killer to be triggered. After some effort, the problem was reproduced on a 32 bit x86 highmem machine. The problem is that the per-ag inode reclaim index cursor was not getting reset to the start of the AG if the radix tree tag lookup found no more reclaimable inodes. Hence every further reclaim attempt started at the same index beyond where any reclaimable inodes lay, and no further background reclaim ever occurred from the AG. Without background inode reclaim the VM driven cache shrinker simply cannot keep up with cache growth, and OOM is the result. While the change that exposed the problem was the conversion of the inode reclaim to use work queues for background reclaim, it was not the cause of the bug. The bug was introduced when the cursor code was added, just waiting for some weird configuration to strike.... Signed-off-by: Dave Chinner Tested-By: Christian Kujau Reviewed-by: Christoph Hellwig Reviewed-by: Alex Elder (cherry picked from commit b223221956675ce8a7b436d198ced974bb388571) --- fs/xfs/linux-2.6/xfs_sync.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/xfs/linux-2.6/xfs_sync.c b/fs/xfs/linux-2.6/xfs_sync.c index e4f9c1b0836c..3e898a48122d 100644 --- a/fs/xfs/linux-2.6/xfs_sync.c +++ b/fs/xfs/linux-2.6/xfs_sync.c @@ -926,6 +926,7 @@ restart: XFS_LOOKUP_BATCH, XFS_ICI_RECLAIM_TAG); if (!nr_found) { + done = 1; rcu_read_unlock(); break; } From 9e7004e741de0b2daabbbadafbaf11ff1a94e00c Mon Sep 17 00:00:00 2001 From: Dave Chinner Date: Fri, 6 May 2011 02:54:05 +0000 Subject: [PATCH 2/5] xfs: exit AIL push work correctly when AIL is empty The recent conversion of the xfsaild functionality to a work queue introduced a hard-to-hit log space grant hang. The main cause is a regression where a work exit path fails to clear the PUSHING state and recheck the target correctly. Make both exit paths do the same PUSHING bit clearing and target checking when the "no more work to be done" condition is hit. Signed-off-by: Dave Chinner Reviewed-by: Christoph Hellwig Reviewed-by: Alex Elder (cherry picked from commit ea35a20021f8497390d05b93271b4d675516c654) --- fs/xfs/xfs_trans_ail.c | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c index acdb92f14d51..226c58bd62e0 100644 --- a/fs/xfs/xfs_trans_ail.c +++ b/fs/xfs/xfs_trans_ail.c @@ -346,18 +346,20 @@ xfs_ail_delete( */ STATIC void xfs_ail_worker( - struct work_struct *work) + struct work_struct *work) { - struct xfs_ail *ailp = container_of(to_delayed_work(work), + struct xfs_ail *ailp = container_of(to_delayed_work(work), struct xfs_ail, xa_work); - long tout; - xfs_lsn_t target = ailp->xa_target; - xfs_lsn_t lsn; - xfs_log_item_t *lip; - int flush_log, count, stuck; - xfs_mount_t *mp = ailp->xa_mount; + xfs_mount_t *mp = ailp->xa_mount; struct xfs_ail_cursor *cur = &ailp->xa_cursors; - int push_xfsbufd = 0; + xfs_log_item_t *lip; + xfs_lsn_t lsn; + xfs_lsn_t target = ailp->xa_target; + long tout = 10; + int flush_log = 0; + int stuck = 0; + int count = 0; + int push_xfsbufd = 0; spin_lock(&ailp->xa_lock); xfs_trans_ail_cursor_init(ailp, cur); @@ -368,8 +370,7 @@ xfs_ail_worker( */ xfs_trans_ail_cursor_done(ailp, cur); spin_unlock(&ailp->xa_lock); - ailp->xa_last_pushed_lsn = 0; - return; + goto out_done; } XFS_STATS_INC(xs_push_ail); @@ -386,7 +387,6 @@ xfs_ail_worker( * lots of contention on the AIL lists. */ lsn = lip->li_lsn; - flush_log = stuck = count = 0; while ((XFS_LSN_CMP(lip->li_lsn, target) < 0)) { int lock_result; /* @@ -480,7 +480,7 @@ xfs_ail_worker( } /* assume we have more work to do in a short while */ - tout = 10; +out_done: if (!count) { /* We're past our target or empty, so idle */ ailp->xa_last_pushed_lsn = 0; From 50e86686dfb287d720af8b0f977202d205c04215 Mon Sep 17 00:00:00 2001 From: Dave Chinner Date: Fri, 6 May 2011 02:54:06 +0000 Subject: [PATCH 3/5] xfs: always push the AIL to the target The recent conversion of the xfsaild functionality to a work queue introduced a hard-to-hit log space grant hang. One of the problems discovered is a target mismatch between the item pushing loop and the target itself. The push trigger checks for the target increasing (i.e. new target > current) while the push loop only pushes items that have a LSN < current. As a result, we can get the situation where the push target is X, the items at the tail of the AIL have LSN X and they don't get pushed. The push work then completes thinking it is done, and cannot be restarted until the push target increases to >= X + 1. If the push target then never increases (because the tail is not moving), then we never run the push work again and we stall. Fix it by making sure log items with a LSN that matches the target exactly are pushed during the loop. Signed-off-by: Dave Chinner Reviewed-by: Christoph Hellwig Reviewed-by: Alex Elder (cherry picked from commit cb64026b6e8af50db598ec7c3f59d504259b00bb) --- fs/xfs/xfs_trans_ail.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c index 226c58bd62e0..9f427c2597bb 100644 --- a/fs/xfs/xfs_trans_ail.c +++ b/fs/xfs/xfs_trans_ail.c @@ -387,7 +387,7 @@ xfs_ail_worker( * lots of contention on the AIL lists. */ lsn = lip->li_lsn; - while ((XFS_LSN_CMP(lip->li_lsn, target) < 0)) { + while ((XFS_LSN_CMP(lip->li_lsn, target) <= 0)) { int lock_result; /* * If we can lock the item without sleeping, unlock the AIL From fe0da767311933d1c1907cb8d326beea7a3cbd9c Mon Sep 17 00:00:00 2001 From: Dave Chinner Date: Fri, 6 May 2011 02:54:07 +0000 Subject: [PATCH 4/5] xfs: make AIL target updates and compares 32bit safe. The recent conversion of the xfsaild functionality to a work queue introduced a hard-to-hit log space grant hang. One of the problems noticed was that updates of the push target are not 32 bit safe as the target is a 64 bit value. We cannot copy a 64 bit LSN without the possibility of corrupting the result when racing with another updating thread. We have function to do this update safely without needing to care about 32/64 bit issues - xfs_trans_ail_copy_lsn() - so use that when updating the AIL push target. Also move the reading of the target in the push work inside the AIL lock, and use XFS_LSN_CMP() for the unlocked comparison during work termination to close read holes as well. Signed-off-by: Dave Chinner Reviewed-by: Christoph Hellwig Reviewed-by: Alex Elder (cherry picked from commit fd5670f22fce247754243cf2ed41941e5762d990) --- fs/xfs/xfs_trans_ail.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c index 9f427c2597bb..d7eebbf71362 100644 --- a/fs/xfs/xfs_trans_ail.c +++ b/fs/xfs/xfs_trans_ail.c @@ -354,7 +354,7 @@ xfs_ail_worker( struct xfs_ail_cursor *cur = &ailp->xa_cursors; xfs_log_item_t *lip; xfs_lsn_t lsn; - xfs_lsn_t target = ailp->xa_target; + xfs_lsn_t target; long tout = 10; int flush_log = 0; int stuck = 0; @@ -362,6 +362,7 @@ xfs_ail_worker( int push_xfsbufd = 0; spin_lock(&ailp->xa_lock); + target = ailp->xa_target; xfs_trans_ail_cursor_init(ailp, cur); lip = xfs_trans_ail_cursor_first(ailp, cur, ailp->xa_last_pushed_lsn); if (!lip || XFS_FORCED_SHUTDOWN(mp)) { @@ -491,7 +492,7 @@ out_done: * work to do. Wait a bit longer before starting that work. */ smp_rmb(); - if (ailp->xa_target == target) { + if (XFS_LSN_CMP(ailp->xa_target, target) == 0) { clear_bit(XFS_AIL_PUSHING_BIT, &ailp->xa_flags); return; } @@ -553,7 +554,7 @@ xfs_ail_push( * the XFS_AIL_PUSHING_BIT. */ smp_wmb(); - ailp->xa_target = threshold_lsn; + xfs_trans_ail_copy_lsn(ailp, &ailp->xa_target, &threshold_lsn); if (!test_and_set_bit(XFS_AIL_PUSHING_BIT, &ailp->xa_flags)) queue_delayed_work(xfs_syncd_wq, &ailp->xa_work, 0); } From 7ac956576d0ce8f97450a39c2f304db8eea01647 Mon Sep 17 00:00:00 2001 From: Dave Chinner Date: Fri, 6 May 2011 02:54:08 +0000 Subject: [PATCH 5/5] xfs: fix race condition in AIL push trigger The recent conversion of the xfsaild functionality to a work queue introduced a hard-to-hit log space grant hang. One is caused by a race condition in determining whether there is a psh in progress or not. The XFS_AIL_PUSHING_BIT is used to determine whether a push is currently in progress. When the AIL push work completes, it checked whether the target changed and cleared the PUSHING bit to allow a new push to be requeued. The race condition is as follows: Thread 1 push work smp_wmb() smp_rmb() check ailp->xa_target unchanged update ailp->xa_target test/set PUSHING bit does not queue clear PUSHING bit does not requeue Now that the push target is updated, new attempts to push the AIL will not trigger as the push target will be the same, and hence despite trying to push the AIL we won't ever wake it again. The fix is to ensure that the AIL push work clears the PUSHING bit before it checks if the target is unchanged. As a result, both push triggers operate on the same test/set bit criteria, so even if we race in the push work and miss the target update, the thread requesting the push will still set the PUSHING bit and queue the push work to occur. For safety sake, the same queue check is done if the push work detects the target change, though only one of the two will will queue new work due to the use of test_and_set_bit() checks. Signed-off-by: Dave Chinner Reviewed-by: Christoph Hellwig Reviewed-by: Alex Elder (cherry picked from commit e4d3c4a43b595d5124ae824d300626e6489ae857) --- fs/xfs/xfs_trans_ail.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c index d7eebbf71362..5fc2380092c8 100644 --- a/fs/xfs/xfs_trans_ail.c +++ b/fs/xfs/xfs_trans_ail.c @@ -487,15 +487,19 @@ out_done: ailp->xa_last_pushed_lsn = 0; /* - * Check for an updated push target before clearing the - * XFS_AIL_PUSHING_BIT. If the target changed, we've got more - * work to do. Wait a bit longer before starting that work. + * We clear the XFS_AIL_PUSHING_BIT first before checking + * whether the target has changed. If the target has changed, + * this pushes the requeue race directly onto the result of the + * atomic test/set bit, so we are guaranteed that either the + * the pusher that changed the target or ourselves will requeue + * the work (but not both). */ + clear_bit(XFS_AIL_PUSHING_BIT, &ailp->xa_flags); smp_rmb(); - if (XFS_LSN_CMP(ailp->xa_target, target) == 0) { - clear_bit(XFS_AIL_PUSHING_BIT, &ailp->xa_flags); + if (XFS_LSN_CMP(ailp->xa_target, target) == 0 || + test_and_set_bit(XFS_AIL_PUSHING_BIT, &ailp->xa_flags)) return; - } + tout = 50; } else if (XFS_LSN_CMP(lsn, target) >= 0) { /*