OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Matthew Wilcox	4d934a5e6c	ext4: Convert ext4_write_begin() to use a folio Remove a lot of calls to compound_head(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20230324180129.1220691-17-willy@infradead.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 13:39:51 -04:00
Matthew Wilcox	6b90d4130a	ext4: Convert ext4_write_inline_data_end() to use a folio Convert the incoming page to a folio so that we call compound_head() only once instead of seven times. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20230324180129.1220691-16-willy@infradead.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 13:39:51 -04:00
Matthew Wilcox	6b87fbe415	ext4: Convert ext4_read_inline_page() to ext4_read_inline_folio() All callers now have a folio, so pass it and use it. The folio may be large, although I doubt we'll want to use a large folio for an inline file. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20230324180129.1220691-15-willy@infradead.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 13:39:51 -04:00
Matthew Wilcox	9a9d01f081	ext4: Convert ext4_da_write_inline_data_begin() to use a folio Saves a number of calls to compound_head(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20230324180129.1220691-14-willy@infradead.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 13:39:51 -04:00
Matthew Wilcox	4ed9b598ac	ext4: Convert ext4_da_convert_inline_data_to_extent() to use a folio Saves a number of calls to compound_head(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20230324180129.1220691-13-willy@infradead.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 13:39:51 -04:00
Matthew Wilcox	f8f8c89f59	ext4: Convert ext4_try_to_write_inline_data() to use a folio Saves a number of calls to compound_head(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20230324180129.1220691-12-willy@infradead.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 13:39:51 -04:00
Matthew Wilcox	83eba701cf	ext4: Convert ext4_convert_inline_data_to_extent() to use a folio Saves a number of calls to compound_head(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20230324180129.1220691-11-willy@infradead.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 13:39:51 -04:00
Matthew Wilcox	3edde93e07	ext4: Convert ext4_readpage_inline() to take a folio Use the folio API in this function, saves a few calls to compound_head(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20230324180129.1220691-10-willy@infradead.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 13:39:51 -04:00
Matthew Wilcox	e8d6062c50	ext4: Convert ext4_bio_write_page() to ext4_bio_write_folio() The only caller now has a folio so pass it in directly and avoid the call to page_folio() at the beginning. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20230324180129.1220691-9-willy@infradead.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 13:39:51 -04:00
Matthew Wilcox	33483b3b6e	ext4: Convert mpage_page_done() to mpage_folio_done() All callers now have a folio so we can pass one in and use the folio APIs to support large folios as well as save instructions by eliminating a call to compound_head(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20230324180129.1220691-8-willy@infradead.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 13:39:51 -04:00
Matthew Wilcox	81a0d3e126	ext4: Convert mpage_submit_page() to mpage_submit_folio() All callers now have a folio so we can pass one in and use the folio APIs to support large folios as well as save instructions by eliminating calls to compound_head(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20230324180129.1220691-7-willy@infradead.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 13:39:50 -04:00
Matthew Wilcox	4da2f6e3c4	ext4: Turn mpage_process_page() into mpage_process_folio() The page/folio is only used to extract the buffers, so this is a simple change. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20230324180129.1220691-6-willy@infradead.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 13:39:50 -04:00
Matthew Wilcox	bb64c08bff	ext4: Convert ext4_finish_bio() to use folios Prepare ext4 to support large folios in the page writeback path. Also set the actual error in the mapping, not just -EIO. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20230324180129.1220691-5-willy@infradead.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 13:39:50 -04:00
Matthew Wilcox	cd57b77197	ext4: Convert ext4_bio_write_page() to use a folio Remove several calls to compound_head() and the last caller of set_page_writeback_keepwrite(), so remove the wrapper too. Also export bio_add_folio() as this is the first caller from a module. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20230324180129.1220691-4-willy@infradead.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 13:39:50 -04:00
Matthew Wilcox	e999a5c5a1	fs: Add FGP_WRITEBEGIN This particular combination of flags is used by most filesystems in their ->write_begin method, although it does find use in a few other places. Before folios, it warranted its own function (grab_cache_page_write_begin()), but I think that just having specialised flags is enough. It certainly helps the few places that have been converted from grab_cache_page_write_begin() to __filemap_get_folio(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20230324180129.1220691-2-willy@infradead.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 13:39:50 -04:00
Ojaswin Mujoo	361eb69fc9	ext4: Remove the logic to trim inode PAs Earlier, inode PAs were stored in a linked list. This caused a need to periodically trim the list down inorder to avoid growing it to a very large size, as this would severly affect performance during list iteration. Recent patches changed this list to an rbtree, and since the tree scales up much better, we no longer need to have the trim functionality, hence remove it. Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/c409addceaa3ade4b40328e28e3b54b2f259689e.1679731817.git.ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:13 -04:00
Ojaswin Mujoo	3872778664	ext4: Use rbtrees to manage PAs instead of inode i_prealloc_list Currently, the kernel uses i_prealloc_list to hold all the inode preallocations. This is known to cause degradation in performance in workloads which perform large number of sparse writes on a single file. This is mainly because functions like ext4_mb_normalize_request() and ext4_mb_use_preallocated() iterate over this complete list, resulting in slowdowns when large number of PAs are present. Patch `27bc446e2` partially fixed this by enforcing a limit of 512 for the inode preallocation list and adding logic to continually trim the list if it grows above the threshold, however our testing revealed that a hardcoded value is not suitable for all kinds of workloads. To optimize this, add an rbtree to the inode and hold the inode preallocations in this rbtree. This will make iterating over inode PAs faster and scale much better than a linked list. Additionally, we also had to remove the LRU logic that was added during trimming of the list (in ext4_mb_release_context()) as it will add extra overhead in rbtree. The discards now happen in the lowest-logical-offset-first order. Locking notes With the introduction of rbtree to maintain inode PAs, we can't use RCU to walk the tree for searching since it can result in partial traversals which might miss some nodes(or entire subtrees) while discards happen in parallel (which happens under a lock). Hence this patch converts the ei->i_prealloc_lock spin_lock to rw_lock. Almost all the codepaths that read/modify the PA rbtrees are protected by the higher level inode->i_data_sem (except ext4_mb_discard_group_preallocations() and ext4_clear_inode()) IIUC, the only place we need lock protection is when one thread is reading "searching" the PA rbtree (earlier protected under rcu_read_lock()) and another is "deleting" the PAs in ext4_mb_discard_group_preallocations() function (which iterates all the PAs using the grp->bb_prealloc_list and deletes PAs from the tree without taking any inode lock (i_data_sem)). So, this patch converts all rcu_read_lock/unlock() paths for inode list PA to use read_lock() and all places where we were using ei->i_prealloc_lock spinlock will now be using write_lock(). Note that this makes the fast path (searching of the right PA e.g. ext4_mb_use_preallocated() or ext4_mb_normalize_request()), now use read_lock() instead of rcu_read_lock/unlock(). Ths also will now block due to slow discard path (ext4_mb_discard_group_preallocations()) which uses write_lock(). But this is not as bad as it looks. This is because - 1. The slow path only occurs when the normal allocation failed and we can say that we are low on disk space. One can argue this scenario won't be much frequent. 2. ext4_mb_discard_group_preallocations(), locks and unlocks the rwlock for deleting every individual PA. This gives enough opportunity for the fast path to acquire the read_lock for searching the PA inode list. Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/4137bce8f6948fedd8bae134dabae24acfe699c6.1679731817.git.ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:13 -04:00
Ojaswin Mujoo	a8e38fd37c	ext4: Convert pa->pa_inode_list and pa->pa_obj_lock into a union Splitting pa->pa_inode_list Currently, we use the same pa->pa_inode_list to add a pa to either the inode preallocation list or the locality group preallocation list. For better clarity, split this list into a union of 2 list_heads and use either of the them based on the type of pa. Splitting pa->pa_obj_lock Currently, pa->pa_obj_lock is either assigned &ei->i_prealloc_lock for inode PAs or lg_prealloc_lock for lg PAs, and is then used to lock the lists containing these PAs. Make the distinction between the 2 PA types clear by changing this lock to a union of 2 locks. Explicitly use the pa_lock_node.inode_lock for inode PAs and pa_lock_node.lg_lock for lg PAs. This patch is required so that the locality group preallocation code remains the same as in upcoming patches we are going to make changes to inode preallocation code to move from list to rbtree based implementation. This patch also makes it easier to review the upcoming patches. There are no functional changes in this patch. Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/1d7ac0557e998c3fc7eef422b52e4bc67bdef2b0.1679731817.git.ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:13 -04:00
Ojaswin Mujoo	93cdf49f6e	ext4: Fix best extent lstart adjustment logic in ext4_mb_new_inode_pa() When the length of best extent found is less than the length of goal extent we need to make sure that the best extent atleast covers the start of the original request. This is done by adjusting the ac_b_ex.fe_logical (logical start) of the extent. While doing so, the current logic sometimes results in the best extent's logical range overflowing the goal extent. Since this best extent is later added to the inode preallocation list, we have a possibility of introducing overlapping preallocations. This is discussed in detail here [1]. As per Jan's suggestion, to fix this, replace the existing logic with the below logic for adjusting best extent as it keeps fragmentation in check while ensuring logical range of best extent doesn't overflow out of goal extent: 1. Check if best extent can be kept at end of goal range and still cover original start. 2. Else, check if best extent can be kept at start of goal range and still cover original start. 3. Else, keep the best extent at start of original request. Also, add a few extra BUG_ONs that might help catch errors faster. [1] https://lore.kernel.org/r/Y+OGkVvzPN0RMv0O@li-bb2b2a4c-3307-11b2-a85c-8fa5c3a69313.ibm.com Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/f96aca6d415b36d1f90db86c1a8cd7e2e9d7ab0e.1679731817.git.ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:13 -04:00
Ojaswin Mujoo	0830344c95	ext4: Abstract out overlap fix/check logic in ext4_mb_normalize_request() Abstract out the logic of fixing PA overlaps in ext4_mb_normalize_request to improve readability of code. This also makes it easier to make changes to the overlap logic in future. There are no functional changes in this patch Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/9b35f3955a1d7b66bbd713eca1e63026e01f78c1.1679731817.git.ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:12 -04:00
Ojaswin Mujoo	7692094ac5	ext4: Move overlap assert logic into a separate function Abstract out the logic to double check for overlaps in normalize_pa to a separate function. Since there has been no reports in past where we have seen any overlaps which hits this bug_on(), in future we can consider calling this function under "#ifdef AGGRESSIVE_CHECK" only. There are no functional changes in this patch Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/35dd5d94fa0b2d1cd2d2947adf8967279c72967d.1679731817.git.ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:12 -04:00
Ojaswin Mujoo	bcf4349921	ext4: Refactor code in ext4_mb_normalize_request() and ext4_mb_use_preallocated() Change some variable names to be more consistent and refactor some of the code to make it easier to read. There are no functional changes in this patch Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/8edcab489c06cf861b19d87207d9b0ff7ac7f3c1.1679731817.git.ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:12 -04:00
Ojaswin Mujoo	820897258a	ext4: Refactor code related to freeing PAs This patch makes the following changes: * Rename ext4_mb_pa_free to ext4_mb_pa_put_free to better reflect its purpose * Add new ext4_mb_pa_free() which only handles freeing * Refactor ext4_mb_pa_callback() to use ext4_mb_pa_free() There are no functional changes in this patch Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/b273bc9cbf5bd278f641fa5bc6c0cc9e6cb3330c.1679731817.git.ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:12 -04:00
Ojaswin Mujoo	e86a718228	ext4: Stop searching if PA doesn't satisfy non-extent file If we come across a PA that matches the logical offset but is unable to satisfy a non-extent file due to its physical start being higher than that supported by non extent files, then simply stop searching for another PA and break out of loop. This is because, since PAs don't overlap, we won't be able to find another inode PA which can satisfy the original request. Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/42404ca29bd304ae2c962184c3c32a02e8eefcd0.1679731817.git.ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:12 -04:00
Theodore Ts'o	19b8b035a7	ext4: convert some BUG_ON's in mballoc to use WARN_RATELIMITED instead In cases where we have an obvious way of continuing, let's use WARN_RATELIMITED() instead of BUG_ON(). Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:12 -04:00
Kemeng Shi	91a48aaf59	ext4: avoid unnecessary pointer dereference in ext4_mb_normalize_request Result of EXT4_SB(ac->ac_sb) is already stored in sbi at beginning of ext4_mb_normalize_request. Use sbi instead of EXT4_SB(ac->ac_sb) to remove unnecessary pointer dereference. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://lore.kernel.org/r/20230311170949.1047958-3-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:12 -04:00
Kemeng Shi	1221b23501	ext4: fix typos in mballoc pa_plen -> pa_len pa_start -> pa_pstart Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://lore.kernel.org/r/20230311170949.1047958-2-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:12 -04:00
Kemeng Shi	253cacb0de	ext4: simplify calculation of blkoff in ext4_mb_new_blocks_simple We try to allocate a block from goal in ext4_mb_new_blocks_simple. We only need get blkoff in first group with goal and set blkoff to 0 for the rest groups. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Link: https://lore.kernel.org/r/20230303172120.3800725-21-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:12 -04:00
Kemeng Shi	46825e9490	ext4: remove comment code ext4_discard_preallocations Just remove comment code in ext4_discard_preallocations. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://lore.kernel.org/r/20230303172120.3800725-20-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:12 -04:00
Kemeng Shi	3a037b1b88	ext4: remove repeat assignment to ac_f_ex Call trace to assign ac_f_ex: ext4_mb_use_best_found ac->ac_f_ex = ac->ac_b_ex; ext4_mb_new_preallocation ext4_mb_new_group_pa ac->ac_f_ex = ac->ac_b_ex; ext4_mb_new_inode_pa ac->ac_f_ex = ac->ac_b_ex; Actually allocated blocks is already stored in ac_f_ex in ext4_mb_use_best_found, so there is no need to assign ac_f_ex in ext4_mb_new_group_pa and ext4_mb_new_inode_pa. Just remove repeat assignment to ac_f_ex in ext4_mb_new_group_pa and ext4_mb_new_inode_pa. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://lore.kernel.org/r/20230303172120.3800725-19-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:12 -04:00
Kemeng Shi	fb28f9ceec	ext4: remove unnecessary goto in ext4_mb_mark_diskspace_used When ext4_read_block_bitmap fails, we can return PTR_ERR(bitmap_bh) to remove unnecessary NULL check of bitmap_bh. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://lore.kernel.org/r/20230303172120.3800725-18-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:12 -04:00
Kemeng Shi	c7f2bafa3c	ext4: remove unnecessary count2 in ext4_free_data_in_buddy count2 is always 1 in mb_debug, just remove unnecessary count2. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://lore.kernel.org/r/20230303172120.3800725-17-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:12 -04:00
Kemeng Shi	df11909514	ext4: remove unnecessary exit_meta_group_info tag We goto exit_meta_group_info only to return -ENOMEM. Return -ENOMEM directly instead of goto to remove this unnecessary tag. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://lore.kernel.org/r/20230303172120.3800725-16-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:11 -04:00
Kemeng Shi	78dc9f844f	ext4: use best found when complex scan of group finishs If any bex which meets bex->fe_len >= gex->fe_len is found, then it will always be used when complex scan of group that bex belongs to finishs. So there will not be any lock-unlock period. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://lore.kernel.org/r/20230303172120.3800725-15-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:11 -04:00
Kemeng Shi	32c0869370	ext4: remove ac->ac_found > sbi->s_mb_min_to_scan dead check in ext4_mb_check_limits Only call trace of ext4_mb_check_limits is as following: ext4_mb_complex_scan_group ext4_mb_measure_extent ext4_mb_check_limits(ac, e4b, 0); ext4_mb_check_limits(ac, e4b, 1); If the first ac->ac_found > sbi->s_mb_max_to_scan check in ext4_mb_check_limits is met, we will set ac_status to AC_STATUS_BREAK and call ext4_mb_try_best_found to try to use ac->ac_b_ex. If ext4_mb_try_best_found successes, then block allocation finishs, the removed ac->ac_found > sbi->s_mb_min_to_scan check is not reachable. If ext4_mb_try_best_found fails, then we set EXT4_MB_HINT_FIRST and reset ac->ac_b_ex to retry block allocation. We will use any found free extent in ext4_mb_measure_extent before reach the removed ac->ac_found > sbi->s_mb_min_to_scan check. In summary, the removed ac->ac_found > sbi->s_mb_min_to_scan check is not reachable and we can remove that dead check. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://lore.kernel.org/r/20230303172120.3800725-14-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:11 -04:00
Kemeng Shi	976620bd26	ext4: remove dead check in mb_buddy_mark_free We always adjust first to even number and adjust last to odd number, so first == last will never happen. Remove this dead check. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Link: https://lore.kernel.org/r/20230303172120.3800725-13-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:11 -04:00
Kemeng Shi	aaae558dae	ext4: remove unnecessary check in ext4_mb_new_blocks 1. remove unnecessary ac check: We always go to out tag before ac is successfully allocated, then we can move out tag after free of ac and remove NULL check of ac. 2. remove unnecessary errp check: We always go to errout tag if errp is non-zero, then we can move errout tag into error handle if *errp is non-zero. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://lore.kernel.org/r/20230303172120.3800725-12-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:11 -04:00
Kemeng Shi	285164b801	ext4: remove unnecessary e4b->bd_buddy_page check in ext4_mb_load_buddy_gfp e4b->bd_buddy_page is only set if we initialize ext4_buddy successfully. So e4b->bd_buddy_page is always NULL in error handle branch. Just remove the dead check. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://lore.kernel.org/r/20230303172120.3800725-11-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:11 -04:00
Kemeng Shi	139f46d3b5	ext4: Remove unnecessary release when memory allocation failed in ext4_mb_init_cache If we alloc array of buffer_head failed, there is no resource need to be freed and we can simpily return error. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://lore.kernel.org/r/20230303172120.3800725-10-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:11 -04:00
Kemeng Shi	85b67ffb7d	ext4: remove unused return value of ext4_mb_try_best_found and ext4_mb_free_metadata Return value static function ext4_mb_try_best_found and ext4_mb_free_metadata is not used. Just remove unused return value. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://lore.kernel.org/r/20230303172120.3800725-9-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:11 -04:00
Kemeng Shi	1b5c9d3494	ext4: add missed brelse in ext4_free_blocks_simple Release bitmap buffer_head we got if error occurs. Besides, this patch remove unused assignment to err. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://lore.kernel.org/r/20230303172120.3800725-8-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:11 -04:00
Kemeng Shi	36cb0f52ae	ext4: protect pa->pa_free in ext4_discard_allocated_blocks If ext4_mb_mark_diskspace_used fails in ext4_mb_new_blocks, we may discard pa already in list. Protect pa with pa_lock to avoid race. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://lore.kernel.org/r/20230303172120.3800725-7-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:11 -04:00
Kemeng Shi	1afdc58894	ext4: correct start of used group pa for debug in ext4_mb_use_group_pa As we don't correct pa_lstart here, so there is no need to subtract pa_lstart with consumed len. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://lore.kernel.org/r/20230303172120.3800725-6-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:11 -04:00
Kemeng Shi	abc075d4a5	ext4: correct calculation of s_mb_preallocated We will add pa_free to s_mb_preallocated when new ext4_prealloc_space is created. In ext4_mb_new_inode_pa, we will call ext4_mb_use_inode_pa before adding pa_free to s_mb_preallocated. However, ext4_mb_use_inode_pa will consume pa_free for block allocation which triggerred the creation of ext4_prealloc_space. Add pa_free to s_mb_preallocated before ext4_mb_use_inode_pa to correct calculation of s_mb_preallocated. There is no such problem in ext4_mb_new_group_pa as pa_free of group pa is consumed in ext4_mb_release_context instead of ext4_mb_use_group_pa. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://lore.kernel.org/r/20230303172120.3800725-5-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:11 -04:00
Kemeng Shi	22fab98402	ext4: get correct ext4_group_info in ext4_mb_prefetch_fini We always get ext4_group_desc with group + 1 and ext4_group_info with group to check if we need do initialize ext4_group_info for the group. Just get ext4_group_desc with group for ext4_group_info initialization check. Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Link: https://lore.kernel.org/r/20230303172120.3800725-4-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:11 -04:00
Kemeng Shi	01e4ca2945	ext4: allow to find by goal if EXT4_MB_HINT_GOAL_ONLY is set If EXT4_MB_HINT_GOAL_ONLY is set, ext4_mb_regular_allocator will only allocate blocks from ext4_mb_find_by_goal. Allow to find by goal in ext4_mb_find_by_goal if EXT4_MB_HINT_GOAL_ONLY is set or allocation with EXT4_MB_HINT_GOAL_ONLY set will always fail. EXT4_MB_HINT_GOAL_ONLY is not used at all, so the problem is not found for now. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://lore.kernel.org/r/20230303172120.3800725-3-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:10 -04:00
Kemeng Shi	b07ffe6927	ext4: set goal start correctly in ext4_mb_normalize_request We need to set ac_g_ex to notify the goal start used in ext4_mb_find_by_goal. Set ac_g_ex instead of ac_f_ex in ext4_mb_normalize_request. Besides we should assure goal start is in range [first_data_block, blocks_count) as ext4_mb_initialize_context does. [ Added a check to make sure size is less than ar->pright; otherwise we could end up passing an underflowed value of ar->pright - size to ext4_get_group_no_and_offset(), which will trigger a BUG_ON later on. - TYT ] Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://lore.kernel.org/r/20230303172120.3800725-2-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-04-06 01:13:03 -04:00
Kemeng Shi	1df9bde48f	ext4: remove unused group parameter in ext4_block_bitmap_csum_set Remove unused group parameter in ext4_block_bitmap_csum_set. After this, group parameter in ext4_set_bitmap_checksums is also not used, just remove it too. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230221203027.2359920-5-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-03-23 23:00:08 -04:00
Kemeng Shi	82483dfe17	ext4: remove unused group parameter in ext4_block_bitmap_csum_verify Remove unused group parameter in ext4_block_bitmap_csum_verify. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230221203027.2359920-4-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-03-23 23:00:08 -04:00
Kemeng Shi	4fd873c817	ext4: remove unused group parameter in ext4_inode_bitmap_csum_set Remove unused group parameter in ext4_inode_bitmap_csum_set. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230221203027.2359920-3-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2023-03-23 23:00:08 -04:00

1 2 3 4 5 ...

80894 Commits