OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Andreas Gruenbacher	2129b42888	gfs2: Per-revoke accounting in transactions In the log, revokes are stored as a revoke descriptor (struct gfs2_log_descriptor), followed by zero or more additional revoke blocks (struct gfs2_meta_header). On filesystems with a blocksize of 4k, the revoke descriptor contains up to 503 revokes, and the metadata blocks contain up to 509 revokes each. We've so far been reserving space for revokes in transactions in block granularity, so a lot more space than necessary was being allocated and then released again. This patch switches to assigning revokes to transactions individually instead. Initially, space for the revoke descriptor is reserved and handed out to transactions. When more revokes than that are reserved, additional revoke blocks are added. When the log is flushed, the space for the additional revoke blocks is released, but we keep the space for the revoke descriptor block allocated. Transactions may still reserve more revokes than they will actually need in the end, but now we won't overshoot the target as much, and by only returning the space for excess revokes at log flush time, we further reduce the amount of contention between processes. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2021-02-22 21:16:23 +01:00
Andreas Gruenbacher	c968f5788b	gfs2: Clean up on-stack transactions Replace the TR_ALLOCED flag by its inverse, TR_ONSTACK: that way, the flag only needs to be set in the exceptional case of on-stack transactions. Split off __gfs2_trans_begin from gfs2_trans_begin and use it to replace the open-coded version in gfs2_ail_empty_gl. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2021-02-03 18:37:10 +01:00
Andreas Gruenbacher	15e20a301a	gfs2: Use sb_start_intwrite in gfs2_ail_empty_gl Commit `2e60d7683c` ("GFS2: update freeze code to use freeze/thaw_super on all nodes") optimized away the sb_start_intwrite ... sb_end_intwrite protection for the on-stack transactions in gfs2_ail_empty_gl with no explanation. I can't think of a valid reason for doing that, so revert that change. This simplifies the next commit. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2021-02-03 16:33:16 +01:00
Bob Peterson	f39e7d3aae	gfs2: Don't freeze the file system during unmount GFS2's freeze/thaw mechanism uses a special freeze glock to control its operation. It does this with a sync glock operation (glops.c) called freeze_go_sync. When the freeze glock is demoted (glock's do_xmote) the glops function causes the file system to be frozen. This is intended. However, GFS2's mount and unmount processes also hold the freeze glock to prevent other processes, perhaps on different cluster nodes, from mounting the frozen file system in read-write mode. Before this patch, there was no check in freeze_go_sync for whether a freeze in intended or whether the glock demote was caused by a normal unmount. So it was trying to freeze the file system it's trying to unmount, which ends up in a deadlock. This patch adds an additional check to freeze_go_sync so that demotes of the freeze glock are ignored if they come from the unmount process. Fixes: `20b3291290` ("gfs2: Fix regression in freeze_go_sync") Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-11-25 18:12:08 +01:00
Alexander Aring	515b269d5b	gfs2: set lockdep subclass for iopen glocks This patch introduce a new globs attribute to define the subclass of the glock lockref spinlock. This avoid the following lockdep warning, which occurs when we lock an inode lock while an iopen lock is held: ============================================ WARNING: possible recursive locking detected 5.10.0-rc3+ #4990 Not tainted -------------------------------------------- kworker/0:1/12 is trying to acquire lock: ffff9067d45672d8 (&gl->gl_lockref.lock){+.+.}-{3:3}, at: lockref_get+0x9/0x20 but task is already holding lock: ffff9067da308588 (&gl->gl_lockref.lock){+.+.}-{3:3}, at: delete_work_func+0x164/0x260 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&gl->gl_lockref.lock); lock(&gl->gl_lockref.lock); * DEADLOCK * May be due to missing lock nesting notation 3 locks held by kworker/0:1/12: #0: ffff9067c1bfdd38 ((wq_completion)delete_workqueue){+.+.}-{0:0}, at: process_one_work+0x1b7/0x540 #1: ffffac594006be70 ((work_completion)(&(&gl->gl_delete)->work)){+.+.}-{0:0}, at: process_one_work+0x1b7/0x540 #2: ffff9067da308588 (&gl->gl_lockref.lock){+.+.}-{3:3}, at: delete_work_func+0x164/0x260 stack backtrace: CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.10.0-rc3+ #4990 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 Workqueue: delete_workqueue delete_work_func Call Trace: dump_stack+0x8b/0xb0 __lock_acquire.cold+0x19e/0x2e3 lock_acquire+0x150/0x410 ? lockref_get+0x9/0x20 _raw_spin_lock+0x27/0x40 ? lockref_get+0x9/0x20 lockref_get+0x9/0x20 delete_work_func+0x188/0x260 process_one_work+0x237/0x540 worker_thread+0x4d/0x3b0 ? process_one_work+0x540/0x540 kthread+0x127/0x140 ? __kthread_bind_mask+0x60/0x60 ret_from_fork+0x22/0x30 Suggested-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-11-24 23:45:58 +01:00
Alexander Aring	16e6281b6b	gfs2: Fix deadlock dumping resource group glocks Commit `0e539ca1bb` ("gfs2: Fix NULL pointer dereference in gfs2_rgrp_dump") introduced additional locking in gfs2_rgrp_go_dump, which is also used for dumping resource group glocks via debugfs. However, on that code path, the glock spin lock is already taken in dump_glock, and taking it again in gfs2_glock2rgrp leads to deadlock. This can be reproduced with: $ mkfs.gfs2 -O -p lock_nolock /dev/FOO $ mount /dev/FOO /mnt/foo $ touch /mnt/foo/bar $ cat /sys/kernel/debug/gfs2/FOO/glocks Fix that by not taking the glock spin lock inside the go_dump callback. Fixes: `0e539ca1bb` ("gfs2: Fix NULL pointer dereference in gfs2_rgrp_dump") Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-11-24 23:45:58 +01:00
Bob Peterson	20b3291290	gfs2: Fix regression in freeze_go_sync Patch `541656d3a5` ("gfs2: freeze should work on read-only mounts") changed the check for glock state in function freeze_go_sync() from "gl->gl_state == LM_ST_SHARED" to "gl->gl_req == LM_ST_EXCLUSIVE". That's wrong and it regressed gfs2's freeze/thaw mechanism because it caused only the freezing node (which requests the glock in EX) to queue freeze work. All nodes go through this go_sync code path during the freeze to drop their SHared hold on the freeze glock, allowing the freezing node to acquire it in EXclusive mode. But all the nodes must freeze access to the file system locally, so they ALL must queue freeze work. The freeze_work calls freeze_func, which makes a request to reacquire the freeze glock in SH, effectively blocking until the thaw from the EX holder. Once thawed, the freezing node drops its EX hold on the freeze glock, then the (blocked) freeze_func reacquires the freeze glock in SH again (on all nodes, including the freezer) so all nodes go back to a thawed state. This patch changes the check back to gl_state == LM_ST_SHARED like it was prior to `541656d3a5`. Fixes: `541656d3a5` ("gfs2: freeze should work on read-only mounts") Cc: stable@vger.kernel.org # v5.8+ Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-11-18 16:28:11 +01:00
Bob Peterson	4a55752ae2	gfs2: Split up gfs2_meta_sync into inode and rgrp versions Before this patch, function gfs2_meta_sync called filemap_fdatawrite to write the address space for the metadata being synced. That's great for inodes, but resource groups all point to the same superblock-address space, sdp->sd_aspace. Each rgrp has its own range of blocks on which it should operate. That meant every time an rgrp's metadata was synced, it would write all of them instead of just the range. This patch eliminates function gfs2_meta_sync and tailors specific metasync functions for inodes and rgrps. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-10-29 22:16:46 +01:00
Andreas Gruenbacher	ed3adb375b	gfs2: Ignore subsequent errors after withdraw in rgrp_go_sync Once a withdraw has occurred, ignore errors that are the consequence of the withdraw. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-10-20 23:16:22 +02:00
Bob Peterson	23cfb0c3d8	gfs2: Eliminate gl_vm The gfs2_glock structure has a gl_vm member, introduced in commit `7005c3e4ae` ("GFS2: Use range based functions for rgrp sync/invalidation"), which stores the location of resource groups within their address space. This structure is in a union with iopen glock specific fields. It was introduced because at unmount time, the resource group objects were destroyed before flushing out any pending resource group glock work, and flushing out such work could require flushing / truncating the address space. Since commit `b3422cacdd` ("gfs2: Rework how rgrp buffer_heads are managed"), any pending resource group glock work is flushed out before destroying the resource group objects. So the resource group objects will now always exist in rgrp_go_sync and rgrp_go_inval, and we now simply compute the gl_vm values where needed instead of caching them. This also eliminates the union. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-10-20 23:16:22 +02:00
Andrew Price	0e539ca1bb	gfs2: Fix NULL pointer dereference in gfs2_rgrp_dump When an rindex entry is found to be corrupt, compute_bitstructs() calls gfs2_consist_rgrpd() which calls gfs2_rgrp_dump() like this: gfs2_rgrp_dump(NULL, rgd->rd_gl, fs_id_buf); gfs2_rgrp_dump then dereferences the gl without checking it and we get BUG: KASAN: null-ptr-deref in gfs2_rgrp_dump+0x28/0x280 because there's no rgrp glock involved while reading the rindex on mount. Fix this by changing gfs2_rgrp_dump to take an rgrp argument. Reported-by: syzbot+43fa87986bdd31df9de6@syzkaller.appspotmail.com Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-10-14 23:54:43 +02:00
Bob Peterson	541656d3a5	gfs2: freeze should work on read-only mounts Before this patch, function freeze_go_sync, called when promoting the freeze glock, was testing for the SDF_JOURNAL_LIVE superblock flag. That's only set for read-write mounts. Read-only mounts don't use a journal, so the bit is never set, so the freeze never happened. This patch removes the check for SDF_JOURNAL_LIVE for freeze requests but still checks it when deciding whether to flush a journal. Signed-off-by: Bob Peterson <rpeterso@redhat.com>	2020-07-03 12:05:35 +02:00
Andreas Gruenbacher	300e549b6e	Merge branch 'gfs2-iopen' into for-next	2020-06-05 21:25:36 +02:00
Bob Peterson	cbcc89b630	gfs2: initialize transaction tr_ailX_lists earlier Since transactions may be freed shortly after they're created, before a log_flush occurs, we need to initialize their ail1 and ail2 lists earlier. Before this patch, the ail1 list was initialized in gfs2_log_flush(). This moves the initialization to the point when the transaction is first created. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-06-05 21:24:25 +02:00
Andreas Gruenbacher	a0e3cc65fa	gfs2: Turn gl_delete into a delayed work This requires flushing delayed work items in gfs2_make_fs_ro (which is called before unmounting a filesystem). When inodes are deleted and then recreated, pending gl_delete work items would have no effect because the inode generations will have changed, so we can cancel any pending gl_delete works before reusing iopen glocks. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-06-05 20:19:21 +02:00
Andreas Gruenbacher	f286d627ef	gfs2: Keep track of deleted inode generations in LVBs When deleting an inode, keep track of the generation of the deleted inode in the inode glock Lock Value Block (LVB). When trying to delete an inode remotely, check the last-known inode generation against the deleted inode generation to skip duplicate remote deletes. This avoids taking the resource group glock in order to verify the block type. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-06-05 20:19:20 +02:00
Bob Peterson	bbae10fac2	gfs2: Don't ignore inode write errors during inode_go_sync Before for this patch, function inode_go_sync ignored io errors during inode_go_sync, overwriting them with metadata write errors: error = filemap_fdatawait(mapping); mapping_set_error(mapping, error); } error = filemap_fdatawait(metamapping); ... return error; So any errors returned by the inode write would be forgotten if the metadata write succeeded. This patch still does both writes, but only sets error if it's still zero. That way, any errors will be reported by to the caller, do_xmote, which will take appropriate action and report the error. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-06-02 19:45:05 +02:00
Bob Peterson	1c634f94c3	gfs2: Do proper error checking for go_sync family of glops functions Before this patch, function do_xmote would try to sync out the glock dirty data by calling the appropriate glops function XXX_go_sync() but it did not check for a good return code. If the sync was not possible due to an io error or whatever, do_xmote would continue on and call go_inval and release the glock to other cluster nodes. When those nodes go to replay the journal, they may already be holding glocks for the journal records that should have been synced, but were not due to the ignored error. This patch introduces proper error code checking to the go_sync family of glops functions. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-02-27 07:53:18 -06:00
Bob Peterson	9ff7828935	gfs2: Do log_flush in gfs2_ail_empty_gl even if ail list is empty Before this patch, if gfs2_ail_empty_gl saw there was nothing on the ail list, it would return and not flush the log. The problem is that there could still be a revoke for the rgrp sitting on the sd_log_le_revoke list that's been recently taken off the ail list. But that revoke still needs to be written, and the rgrp_go_inval still needs to call log_flush_wait to ensure the revokes are all properly written to the journal before we relinquish control of the glock to another node. If we give the glock to another node before we have this knowledge, the node might crash and its journal replayed, in which case the missing revoke would allow the journal replay to replay the rgrp over top of the rgrp we already gave to another node, thus overwriting its changes and corrupting the file system. This patch makes gfs2_ail_empty_gl still call gfs2_log_flush rather than returning. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-02-27 07:53:18 -06:00
Bob Peterson	33dbd1e41a	gfs2: fix infinite loop when checking ail item count before go_inval Before this patch, the rgrp_go_inval and inode_go_inval functions each checked if there were any items left on the ail count (by way of a count), and if so, did a withdraw. But the withdraw code now uses glocks when changing the file system to read-only status. So we can not have glock functions withdrawing or a hang will likely result: The glocks can't be serviced by the work_func if the work_func is busy doing its own withdraw. This patch removes the checks from the go_inval functions and adds a centralized check in do_xmote to warn about the problem and not withdraw, but flag the error so it's eventually caught when the logd daemon eventually runs. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-02-27 07:53:17 -06:00
Bob Peterson	601ef0d52e	gfs2: Force withdraw to replay journals and wait for it to finish When a node withdraws from a file system, it often leaves its journal in an incomplete state. This is especially true when the withdraw is caused by io errors writing to the journal. Before this patch, a withdraw would try to write a "shutdown" record to the journal, tell dlm it's done with the file system, and none of the other nodes know about the problem. Later, when the problem is fixed and the withdrawn node is rebooted, it would then discover that its own journal was incomplete, and replay it. However, replaying it at this point is almost guaranteed to introduce corruption because the other nodes are likely to have used affected resource groups that appeared in the journal since the time of the withdraw. Replaying the journal later will overwrite any changes made, and not through any fault of dlm, which was instructed during the withdraw to release those resources. This patch makes file system withdraws seen by the entire cluster. Withdrawing nodes dequeue their journal glock to allow recovery. The remaining nodes check all the journals to see if they are clean or in need of replay. They try to replay dirty journals, but only the journals of withdrawn nodes will be "not busy" and therefore available for replay. Until the journal replay is complete, no i/o related glocks may be given out, to ensure that the replay does not cause the aforementioned corruption: We cannot allow any journal replay to overwrite blocks associated with a glock once it is held. The "live" glock which is now used to signal when a withdraw occurs. When a withdraw occurs, the node signals its withdraw by dequeueing the "live" glock and trying to enqueue it in EX mode, thus forcing the other nodes to all see a demote request, by way of a "1CB" (one callback) try lock. The "live" glock is not granted in EX; the callback is only just used to indicate a withdraw has occurred. Note that all nodes in the cluster must wait for the recovering node to finish replaying the withdrawing node's journal before continuing. To this end, it checks that the journals are clean multiple times in a retry loop. Also note that the withdraw function may be called from a wide variety of situations, and therefore, we need to take extra precautions to make sure pointers are valid before using them in many circumstances. We also need to take care when glocks decide to withdraw, since the withdraw code now uses glocks. Also, before this patch, if a process encountered an error and decided to withdraw, if another process was already withdrawing, the second withdraw would be silently ignored, which set it free to unlock its glocks. That's correct behavior if the original withdrawer encounters further errors down the road. But if secondary waiters don't wait for the journal replay, unlocking glocks will allow other nodes to use them, despite the fact that the journal containing those blocks is being replayed. The replay needs to finish before our glocks are released to other nodes. IOW, secondary withdraws need to wait for the first withdraw to finish. For example, if an rgrp glock is unlocked by a process that didn't wait for the first withdraw, a journal replay could introduce file system corruption by replaying a rgrp block that has already been granted to a different cluster node. Signed-off-by: Bob Peterson <rpeterso@redhat.com>	2020-02-27 07:53:12 -06:00
Bob Peterson	a72d2401f5	gfs2: Allow some glocks to be used during withdraw We need to allow some glocks to be enqueued, dequeued, promoted, and demoted when we're withdrawn. For example, to maintain metadata integrity, we should disallow the use of inode and rgrp glocks when withdrawn. Other glocks, like iopen or the transaction glocks may be safely used because none of their metadata goes through the journal. So in general, we should disallow all glocks with an address space, and allow all the others. One exception is: we need to allow our active journal to be demoted so others may recover it. Allowing glocks after withdraw gives us the ability to take appropriate action (in a following patch) to have our journal properly replayed by another node rather than just abandoning the current transactions and pretending nothing bad happened, leaving the other nodes free to modify the blocks we had in our journal, which may result in file system corruption. Signed-off-by: Bob Peterson <rpeterso@redhat.com>	2020-02-20 11:01:36 -06:00
Bob Peterson	b3422cacdd	gfs2: Rework how rgrp buffer_heads are managed Before this patch, the rgrp code had a serious problem related to how it managed buffer_heads for resource groups. The problem caused file system corruption, especially in cases of journal replay. When an rgrp glock was demoted to transfer ownership to a different cluster node, do_xmote() first calls rgrp_go_sync and then rgrp_go_inval, as expected. When it calls rgrp_go_sync, that called gfs2_rgrp_brelse() that dropped the buffer_head reference count. In most cases, the reference count went to zero, which is right. However, there were other places where the buffers are handled differently. After rgrp_go_sync, do_xmote called rgrp_go_inval which called gfs2_rgrp_brelse a second time, then rgrp_go_inval's call to truncate_inode_pages_range would get rid of the pages in memory, but only if the reference count drops to 0. Unfortunately, gfs2_rgrp_brelse was setting bi->bi_bh = NULL. So when rgrp_go_sync called gfs2_rgrp_brelse, it lost the pointer to the buffer_heads in cases where the reference count was still 1. Therefore, when rgrp_go_inval called gfs2_rgrp_brelse a second time, it failed the check for "if (bi->bi_bh)" and thus failed to call brelse a second time. Because of that, the reference count on those buffers sometimes failed to drop from 1 to 0. And that caused function truncate_inode_pages_range to keep the pages in page cache rather than freeing them. The next time the rgrp glock was acquired, the metadata read of the rgrp buffers re-used the pages in memory, which were now wrong because they were likely modified by the other node who acquired the glock in EX (which is why we demoted the glock). This re-use of the page cache caused corruption because changes made by the other nodes were never seen, so the bitmaps were inaccurate. For some reason, the problem became most apparent when journal replay forced the replay of rgrps in memory, which caused newer rgrp data to be overwritten by the older in-core pages. A big part of the problem was that the rgrp buffer were released in multiple places: The go_unlock function would release them when the glock was released rather than when the glock is demoted, which is clearly wrong because our intent was to cache them until the glock is demoted from SH or EX. This patch attempts to clean up the mess and make one consistent and centralized mechanism for managing the rgrp buffer_heads by implementing several changes: 1. It eliminates the call to gfs2_rgrp_brelse() from rgrp_go_sync. We don't want to release the buffers or zero the pointers when syncing for the reasons stated above. It only makes sense to release them when the glock is actually invalidated (go_inval). And when we do, then we set the bh pointers to NULL. 2. The go_unlock function (which was only used for rgrps) is eliminated, as we've talked about doing many times before. The go_unlock function was called too early in the glock dq process, and should not happen until the glock is invalidated. 3. It also eliminates the call to rgrp_brelse in gfs2_clear_rgrpd. That will now happen automatically when the rgrp glocks are demoted, and shouldn't happen any sooner or later than that. Instead, function gfs2_clear_rgrpd has been modified to demote the rgrp glocks, and therefore, free those pages, before the remaining glocks are culled by gfs2_gl_hash_clear. This prevents the gl_object from hanging around when the glocks are culled. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-02-10 07:39:48 -06:00
Andreas Gruenbacher	badb55ec20	gfs2: Split gfs2_lm_withdraw into two functions Split gfs2_lm_withdraw into a function that prints an error message and a function that withdraws the filesystem. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>	2020-02-10 07:39:44 -06:00
Bob Peterson	2e9eeaa117	gfs2: eliminate ssize parameter from gfs2_struct2blk Every caller of function gfs2_struct2blk specified sizeof(u64). This patch eliminates the unnecessary parameter and replaces the size calculation with a new superblock variable that is computed to be the maximum number of block pointers we can fit inside a log descriptor, as is done for pointers per dinode and indirect block. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andrew Price <anprice@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-01-07 18:46:06 +01:00
Bob Peterson	eb43e660c0	gfs2: Introduce function gfs2_withdrawn Add function gfs2_withdrawn and replace all checks for the SDF_WITHDRAWN bit to call it. This does not change the logic or function of gfs2, and it facilitates later improvements to the withdraw sequence. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2019-11-14 19:46:18 +01:00
Aliasgar Surti	098b9c1453	gfs2: removed unnecessary semicolon There is use of unnecessary semicolon after switch case. Removed the semicolon. Signed-off-by: Aliasgar Surti <aliasgar.surti500@gmail.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2019-10-30 12:17:04 +01:00
Bob Peterson	f29e62eed2	gfs2: replace more printk with calls to fs_info and friends This patch replaces a few leftover printk errors with calls to fs_info and similar, so that the file system having the error is properly logged. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2019-06-27 21:30:27 +02:00
Bob Peterson	3792ce973f	gfs2: dump fsid when dumping glock problems Before this patch, if a glock error was encountered, the glock with the problem was dumped. But sometimes you may have lots of file systems mounted, and that doesn't tell you which file system it was for. This patch adds a new boolean parameter fsid to the dump_glock family of functions. For non-error cases, such as dumping the glocks debugfs file, the fsid is not dumped in order to keep lock dumps and glocktop as clean as possible. For all error cases, such as GLOCK_BUG_ON, the file system id is now printed. This will make it easier to debug. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2019-06-27 21:27:43 +02:00
Bob Peterson	04aea0ca14	gfs2: Rename SDF_SHUTDOWN to SDF_WITHDRAWN Before this patch, the superblock flag indicating when a file system is withdrawn was called SDF_SHUTDOWN. This patch simply renames it to the more obvious SDF_WITHDRAWN. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2019-06-27 21:26:35 +02:00
Thomas Gleixner	7336d0e654	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 398 Based on 1 normalized pattern(s): this copyrighted material is made available to anyone wishing to use modify copy or redistribute it subject to the terms and conditions of the gnu general public license version 2 extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 44 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190531081038.653000175@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-06-05 17:37:12 +02:00
Abhi Das	f4686c26ec	gfs2: read journal in large chunks Use bios to read in the journal into the address space of the journal inode (jd_inode), sequentially and in large chunks. This is faster for locating the journal head that the previous binary search approach. When performing recovery, we keep the journal in the address space until recovery is done, which further speeds up things. Signed-off-by: Abhi Das <adas@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2019-05-07 23:39:15 +02:00
Bob Peterson	23e93c9b2c	Revert "gfs2: read journal in large chunks to locate the head" This reverts commit `2a5f14f279`. This patch causes xfstests generic/311 to fail. Reverting this for now until we have a proper fix. Signed-off-by: Abhi Das <adas@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2019-02-14 09:52:51 -08:00
Bob Peterson	27a2660f1e	gfs2: Dump nrpages for inodes and their glocks This patch is based on an idea from Steve Whitehouse. The idea is to dump the number of pages for inodes in the glock dumps. The additional locking required me to drop const from quite a few places. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2018-12-12 12:33:23 +01:00
Abhi Das	2a5f14f279	gfs2: read journal in large chunks to locate the head Use bio(s) to read in the journal sequentially in large chunks and locate the head of the journal. This version addresses the issues Christoph pointed out w.r.t error handling and using deprecated API. Signed-off-by: Abhi Das <adas@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Cc: Christoph Hellwig <hch@infradead.org>	2018-12-11 17:50:36 +01:00
Deepa Dinamani	95582b0083	vfs: change inode times to use struct timespec64 struct timespec is not y2038 safe. Transition vfs to use y2038 safe struct timespec64 instead. The change was made with the help of the following cocinelle script. This catches about 80% of the changes. All the header file and logic changes are included in the first 5 rules. The rest are trivial substitutions. I avoid changing any of the function signatures or any other filesystem specific data structures to keep the patch simple for review. The script can be a little shorter by combining different cases. But, this version was sufficient for my usecase. virtual patch @ depends on patch @ identifier now; @@ - struct timespec + struct timespec64 current_time ( ... ) { - struct timespec now = current_kernel_time(); + struct timespec64 now = current_kernel_time64(); ... - return timespec_trunc( + return timespec64_trunc( ... ); } @ depends on patch @ identifier xtime; @@ struct $ iattr \\| inode \\| kstat $ { ... - struct timespec xtime; + struct timespec64 xtime; ... } @ depends on patch @ identifier t; @@ struct inode_operations { ... int (update_time) (..., - struct timespec t, + struct timespec64 t, ...); ... } @ depends on patch @ identifier t; identifier fn_update_time =~ "update_time$"; @@ fn_update_time (..., - struct timespec t, + struct timespec64 t, ...) { ... } @ depends on patch @ identifier t; @@ lease_get_mtime( ... , - struct timespec t + struct timespec64 t ) { ... } @te depends on patch forall@ identifier ts; local idexpression struct inode inode_node; identifier i_xtime =~ "^i_[acm]time$"; identifier ia_xtime =~ "^ia_[acm]time$"; identifier fn_update_time =~ "update_time$"; identifier fn; expression e, E3; local idexpression struct inode node1; local idexpression struct inode node2; local idexpression struct iattr attr1; local idexpression struct iattr attr2; local idexpression struct iattr attr; identifier i_xtime1 =~ "^i_[acm]time$"; identifier i_xtime2 =~ "^i_[acm]time$"; identifier ia_xtime1 =~ "^ia_[acm]time$"; identifier ia_xtime2 =~ "^ia_[acm]time$"; @@ ( ( - struct timespec ts; + struct timespec64 ts; \| - struct timespec ts = current_time(inode_node); + struct timespec64 ts = current_time(inode_node); ) <+... when != ts ( - timespec_equal(&inode_node->i_xtime, &ts) + timespec64_equal(&inode_node->i_xtime, &ts) \| - timespec_equal(&ts, &inode_node->i_xtime) + timespec64_equal(&ts, &inode_node->i_xtime) \| - timespec_compare(&inode_node->i_xtime, &ts) + timespec64_compare(&inode_node->i_xtime, &ts) \| - timespec_compare(&ts, &inode_node->i_xtime) + timespec64_compare(&ts, &inode_node->i_xtime) \| ts = current_time(e) \| fn_update_time(..., &ts,...) \| inode_node->i_xtime = ts \| node1->i_xtime = ts \| ts = inode_node->i_xtime \| <+... attr1->ia_xtime ...+> = ts \| ts = attr1->ia_xtime \| ts.tv_sec \| ts.tv_nsec \| btrfs_set_stack_timespec_sec(..., ts.tv_sec) \| btrfs_set_stack_timespec_nsec(..., ts.tv_nsec) \| - ts = timespec64_to_timespec( + ts = ... -) \| - ts = ktime_to_timespec( + ts = ktime_to_timespec64( ...) \| - ts = E3 + ts = timespec_to_timespec64(E3) \| - ktime_get_real_ts(&ts) + ktime_get_real_ts64(&ts) \| fn(..., - ts + timespec64_to_timespec(ts) ,...) ) ...+> ( <... when != ts - return ts; + return timespec64_to_timespec(ts); ...> ) \| - timespec_equal(&node1->i_xtime1, &node2->i_xtime2) + timespec64_equal(&node1->i_xtime2, &node2->i_xtime2) \| - timespec_equal(&node1->i_xtime1, &attr2->ia_xtime2) + timespec64_equal(&node1->i_xtime2, &attr2->ia_xtime2) \| - timespec_compare(&node1->i_xtime1, &node2->i_xtime2) + timespec64_compare(&node1->i_xtime1, &node2->i_xtime2) \| node1->i_xtime1 = - timespec_trunc(attr1->ia_xtime1, + timespec64_trunc(attr1->ia_xtime1, ...) \| - attr1->ia_xtime1 = timespec_trunc(attr2->ia_xtime2, + attr1->ia_xtime1 = timespec64_trunc(attr2->ia_xtime2, ...) \| - ktime_get_real_ts(&attr1->ia_xtime1) + ktime_get_real_ts64(&attr1->ia_xtime1) \| - ktime_get_real_ts(&attr.ia_xtime1) + ktime_get_real_ts64(&attr.ia_xtime1) ) @ depends on patch @ struct inode node; struct iattr attr; identifier fn; identifier i_xtime =~ "^i_[acm]time$"; identifier ia_xtime =~ "^ia_[acm]time$"; expression e; @@ ( - fn(node->i_xtime); + fn(timespec64_to_timespec(node->i_xtime)); \| fn(..., - node->i_xtime); + timespec64_to_timespec(node->i_xtime)); \| - e = fn(attr->ia_xtime); + e = fn(timespec64_to_timespec(attr->ia_xtime)); ) @ depends on patch forall @ struct inode node; struct iattr attr; identifier i_xtime =~ "^i_[acm]time$"; identifier ia_xtime =~ "^ia_[acm]time$"; identifier fn; @@ { + struct timespec ts; <+... ( + ts = timespec64_to_timespec(node->i_xtime); fn (..., - &node->i_xtime, + &ts, ...); \| + ts = timespec64_to_timespec(attr->ia_xtime); fn (..., - &attr->ia_xtime, + &ts, ...); ) ...+> } @ depends on patch forall @ struct inode node; struct iattr attr; struct kstat stat; identifier ia_xtime =~ "^ia_[acm]time$"; identifier i_xtime =~ "^i_[acm]time$"; identifier xtime =~ "^[acm]time$"; identifier fn, ret; @@ { + struct timespec ts; <+... ( + ts = timespec64_to_timespec(node->i_xtime); ret = fn (..., - &node->i_xtime, + &ts, ...); \| + ts = timespec64_to_timespec(node->i_xtime); ret = fn (..., - &node->i_xtime); + &ts); \| + ts = timespec64_to_timespec(attr->ia_xtime); ret = fn (..., - &attr->ia_xtime, + &ts, ...); \| + ts = timespec64_to_timespec(attr->ia_xtime); ret = fn (..., - &attr->ia_xtime); + &ts); \| + ts = timespec64_to_timespec(stat->xtime); ret = fn (..., - &stat->xtime); + &ts); ) ...+> } @ depends on patch @ struct inode node; struct inode node2; identifier i_xtime1 =~ "^i_[acm]time$"; identifier i_xtime2 =~ "^i_[acm]time$"; identifier i_xtime3 =~ "^i_[acm]time$"; struct iattr attrp; struct iattr attrp2; struct iattr attr ; identifier ia_xtime1 =~ "^ia_[acm]time$"; identifier ia_xtime2 =~ "^ia_[acm]time$"; struct kstat stat; struct kstat stat1; struct timespec64 ts; identifier xtime =~ "^[acmb]time$"; expression e; @@ ( ( node->i_xtime2 \\| attrp->ia_xtime2 \\| attr.ia_xtime2 \) = node->i_xtime1 ; \| node->i_xtime2 = $ node2->i_xtime1 \\| timespec64_trunc(...) $; \| node->i_xtime2 = node->i_xtime1 = node->i_xtime3 = $ts \\| current_time(...) $; \| node->i_xtime1 = node->i_xtime3 = $ts \\| current_time(...) $; \| stat->xtime = node2->i_xtime1; \| stat1.xtime = node2->i_xtime1; \| ( node->i_xtime2 \\| attrp->ia_xtime2 \) = attrp->ia_xtime1 ; \| ( attrp->ia_xtime1 \\| attr.ia_xtime1 \) = attrp2->ia_xtime2; \| - e = node->i_xtime1; + e = timespec64_to_timespec( node->i_xtime1 ); \| - e = attrp->ia_xtime1; + e = timespec64_to_timespec( attrp->ia_xtime1 ); \| node->i_xtime1 = current_time(...); \| node->i_xtime2 = node->i_xtime1 = node->i_xtime3 = - e; + timespec_to_timespec64(e); \| node->i_xtime1 = node->i_xtime3 = - e; + timespec_to_timespec64(e); \| - node->i_xtime1 = e; + node->i_xtime1 = timespec_to_timespec64(e); ) Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Cc: <anton@tuxera.com> Cc: <balbi@kernel.org> Cc: <bfields@fieldses.org> Cc: <darrick.wong@oracle.com> Cc: <dhowells@redhat.com> Cc: <dsterba@suse.com> Cc: <dwmw2@infradead.org> Cc: <hch@lst.de> Cc: <hirofumi@mail.parknet.co.jp> Cc: <hubcap@omnibond.com> Cc: <jack@suse.com> Cc: <jaegeuk@kernel.org> Cc: <jaharkes@cs.cmu.edu> Cc: <jslaby@suse.com> Cc: <keescook@chromium.org> Cc: <mark@fasheh.com> Cc: <miklos@szeredi.hu> Cc: <nico@linaro.org> Cc: <reiserfs-devel@vger.kernel.org> Cc: <richard@nod.at> Cc: <sage@redhat.com> Cc: <sfrench@samba.org> Cc: <swhiteho@redhat.com> Cc: <tj@kernel.org> Cc: <trond.myklebust@primarydata.com> Cc: <tytso@mit.edu> Cc: <viro@zeniv.linux.org.uk>	2018-06-05 16:57:31 -07:00
Bob Peterson	805c090750	GFS2: Log the reason for log flushes in every log header This patch just adds the capability for GFS2 to track which function called gfs2_log_flush. This should make it easier to diagnose problems based on the sequence of events found in the journals. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>	2018-01-23 07:39:20 -07:00
Bob Peterson	c1696fb85d	GFS2: Introduce new gfs2_log_header_v2 This patch adds a new structure called gfs2_log_header_v2 which is used to store expanded fields into previously unused areas of the log headers (i.e., this change is backwards compatible). Some of these are used for debug purposes so we can backtrack when problems occur. Others are reserved for future expansion. This patch is based on a prototype from Steve Whitehouse. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2018-01-23 07:38:53 -07:00
Linus Torvalds	0f0d12728e	Merge branch 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull mount flag updates from Al Viro: "Another chunk of fmount preparations from dhowells; only trivial conflicts for that part. It separates MS_... bits (very grotty mount(2) ABI) from the struct super_block ->s_flags (kernel-internal, only a small subset of MS_... stuff). This does not convert the filesystems to new constants; only the infrastructure is done here. The next step in that series is where the conflicts would be; that's the conversion of filesystems. It's purely mechanical and it's better done after the merge, so if you could run something like list=$(for i in MS_RDONLY MS_NOSUID MS_NODEV MS_NOEXEC MS_SYNCHRONOUS MS_MANDLOCK MS_DIRSYNC MS_NOATIME MS_NODIRATIME MS_SILENT MS_POSIXACL MS_KERNMOUNT MS_I_VERSION MS_LAZYTIME; do git grep -l $i fs drivers/staging/lustre drivers/mtd ipc mm include/linux; done\|sort\|uniq\|grep -v '^fs/namespace.c$') sed -i -e 's/\<MS_RDONLY\>/SB_RDONLY/g' \ -e 's/\<MS_NOSUID\>/SB_NOSUID/g' \ -e 's/\<MS_NODEV\>/SB_NODEV/g' \ -e 's/\<MS_NOEXEC\>/SB_NOEXEC/g' \ -e 's/\<MS_SYNCHRONOUS\>/SB_SYNCHRONOUS/g' \ -e 's/\<MS_MANDLOCK\>/SB_MANDLOCK/g' \ -e 's/\<MS_DIRSYNC\>/SB_DIRSYNC/g' \ -e 's/\<MS_NOATIME\>/SB_NOATIME/g' \ -e 's/\<MS_NODIRATIME\>/SB_NODIRATIME/g' \ -e 's/\<MS_SILENT\>/SB_SILENT/g' \ -e 's/\<MS_POSIXACL\>/SB_POSIXACL/g' \ -e 's/\<MS_KERNMOUNT\>/SB_KERNMOUNT/g' \ -e 's/\<MS_I_VERSION\>/SB_I_VERSION/g' \ -e 's/\<MS_LAZYTIME\>/SB_LAZYTIME/g' \ $list and commit it with something along the lines of 'convert filesystems away from use of MS_... constants' as commit message, it would save a quite a bit of headache next cycle" * 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: VFS: Differentiate mount flags (MS_*) from internal superblock flags VFS: Convert sb->s_flags & MS_RDONLY to sb_rdonly(sb) vfs: Add sb_rdonly(sb) to query the MS_RDONLY flag on s_flags	2017-09-14 18:54:01 -07:00
Andreas Gruenbacher	eebd2e813f	gfs2: Get rid of gfs2_set_nlink Remove gfs2_set_nlink which prevents the link count of an inode from becoming non-zero once it has reached zero. The next commit reduces the amount of waiting on glocks when an inode is evicted from memory. With that, an inode can become reallocated before all the remote-unlink callbacks from a previous delete are processed, which causes the link count to change from zero to non-zero. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>	2017-08-10 10:42:11 -05:00
Wang Xibo	e7cb550d79	GFS2: fix code parameter error in inode_go_lock In inode_go_lock() function, the parameter order of list_add() is error. According to the define of list_add(), the first parameter is new entry and the second is the list head, so ip->i_trunc_list should be the first parameter and the sdp->sd_trunc_list should be second. Signed-off-by: Wang Xibo<wang.xibo@zte.com.cn> Signed-off-by: Xiao Likun<xiao.likun@zte.com.cn> Signed-off-by: Bob Peterson <rpeterso@redhat.com>	2017-07-21 07:40:59 -05:00
David Howells	bc98a42c1f	VFS: Convert sb->s_flags & MS_RDONLY to sb_rdonly(sb) Firstly by applying the following with coccinelle's spatch: @@ expression SB; @@ -SB->s_flags & MS_RDONLY +sb_rdonly(SB) to effect the conversion to sb_rdonly(sb), then by applying: @@ expression A, SB; @@ ( -(!sb_rdonly(SB)) && A +!sb_rdonly(SB) && A \| -A != (sb_rdonly(SB)) +A != sb_rdonly(SB) \| -A == (sb_rdonly(SB)) +A == sb_rdonly(SB) \| -!(sb_rdonly(SB)) +!sb_rdonly(SB) \| -A && (sb_rdonly(SB)) +A && sb_rdonly(SB) \| -A \|\| (sb_rdonly(SB)) +A \|\| sb_rdonly(SB) \| -(sb_rdonly(SB)) != A +sb_rdonly(SB) != A \| -(sb_rdonly(SB)) == A +sb_rdonly(SB) == A \| -(sb_rdonly(SB)) && A +sb_rdonly(SB) && A \| -(sb_rdonly(SB)) \|\| A +sb_rdonly(SB) \|\| A ) @@ expression A, B, SB; @@ ( -(sb_rdonly(SB)) ? 1 : 0 +sb_rdonly(SB) \| -(sb_rdonly(SB)) ? A : B +sb_rdonly(SB) ? A : B ) to remove left over excess bracketage and finally by applying: @@ expression A, SB; @@ ( -(A & MS_RDONLY) != sb_rdonly(SB) +(bool)(A & MS_RDONLY) != sb_rdonly(SB) \| -(A & MS_RDONLY) == sb_rdonly(SB) +(bool)(A & MS_RDONLY) == sb_rdonly(SB) ) to make comparisons against the result of sb_rdonly() (which is a bool) work correctly. Signed-off-by: David Howells <dhowells@redhat.com>	2017-07-17 08:45:34 +01:00
Andreas Gruenbacher	6f6597baae	gfs2: Protect gl->gl_object by spin lock Put all remaining accesses to gl->gl_object under the gl->gl_lockref.lock spinlock to prevent races. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>	2017-07-05 07:20:52 -05:00
Andreas Gruenbacher	4fd1a57952	gfs2: Get rid of flush_delayed_work in gfs2_evict_inode So far, gfs2_evict_inode clears gl->gl_object and then flushes the glock work queue to make sure that inode glops which dereference gl->gl_object have finished running before the inode is destroyed. However, flushing the work queue may do more work than needed, and in particular, it may call into DLM, which we want to avoid here. Use a bit lock (GIF_GLOP_PENDING) to synchronize between the inode glops and gfs2_evict_inode instead to get rid of the flushing. In addition, flush the work queues of existing glocks before reusing them for new inodes to get those glocks into a known state: the glock state engine currently doesn't handle glock re-appropriation correctly. (We may be able to fix the glock state engine instead later.) Based on a patch by Steven Whitehouse <swhiteho@redhat.com>. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>	2017-07-05 07:20:24 -05:00
Bob Peterson	73cc86252b	GFS2: Get rid of dead code in inode_go_demote_ok Function inode_go_demote_ok had some code that was only executed if gl_holders was not empty. However, if gl_holders was not empty, the only caller, demote_ok(), returns before inode_go_demote_ok would ever be called. Therefore, it's dead code, so I removed it. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>	2016-04-05 11:59:18 -04:00
Andreas Gruenbacher	f39814f60a	gfs2: Invalid security labels of inodes when they go invalid When gfs2 releases the glock of an inode, it must invalidate all information cached for that inode, including the page cache and acls. Use the new security_inode_invalidate_secctx hook to also invalidate security labels in that case. These items will be reread from disk when needed after reacquiring the glock. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Acked-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Cc: cluster-devel@redhat.com [PM: fixed spelling errors and description line lengths] Signed-off-by: Paul Moore <pmoore@redhat.com>	2015-12-24 11:09:40 -05:00
Andreas Gruenbacher	f3dd164912	gfs2: Remove gl_spin define Commit `e66cf161` replaced the gl_spin spinlock in struct gfs2_glock with a gl_lockref lockref and defined gl_spin as gl_lockref.lock (the spinlock in gl_lockref). Remove that define to make the references to gl_lockref.lock more obvious. Signed-off-by: Andreas Gruenbacher <andreas.gruenbacher@gmail.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>	2015-10-29 12:57:48 -05:00
Bob Peterson	15562c439d	GFS2: Move glock superblock pointer to field gl_name What uniquely identifies a glock in the glock hash table is not gl_name, but gl_name and its superblock pointer. This patch makes the gl_name field correspond to a unique glock identifier. That will allow us to simplify hashing with a future patch, since the hash algorithm can then take the gl_name and hash its components in one operation. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>	2015-09-03 13:33:09 -05:00
Bob Peterson	39b0f1e929	GFS2: Don't brelse rgrp buffer_heads every allocation This patch allows the block allocation code to retain the buffers for the resource groups so they don't need to be re-read from buffer cache with every request. This is a performance improvement that's especially noticeable when resource groups are very large. For example, with 2GB resource groups and 4K blocks, there can be 33 blocks for every resource group. This patch allows those 33 buffers to be kept around and not read in and thrown away with every operation. The buffers are released when the resource group is either synced or invalidated. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Steven Whitehouse <swhiteho@redhat.com> Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>	2015-06-19 07:40:22 -05:00
Bob Peterson	e7ccaf5fe1	GFS2: Don't add all glocks to the lru The glocks used for resource groups often come and go hundreds of thousands of times per second. Adding them to the lru list just adds unnecessary contention for the lru_lock spin_lock, especially considering we're almost certainly going to re-use the glock and take it back off the lru microseconds later. We never want the glock shrinker to cull them anyway. This patch adds a new bit in the glops that determines which glock types get put onto the lru list and which ones don't. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>	2015-06-18 12:17:59 -05:00

1 2 3

148 Commits