OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Yonggil Song	6797ebc4ac	f2fs: Fix discard bug on zoned block devices with 2MiB zone size When using f2fs on a zoned block device with 2MiB zone size, IO errors occurs because f2fs tries to write data to a zone that has not been reset. The cause is that f2fs tries to discard multiple zones at once. This is caused by a condition in f2fs_clear_prefree_segments that does not check for zoned block devices when setting the discard range. This leads to invalid reset commands and write pointer mismatches. This patch fixes the zoned block device with 2MiB zone size to reset one zone at a time. Signed-off-by: Yonggil Song <yonggil.song@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-03-29 15:17:38 -07:00
Jaegeuk Kim	bf21acf995	f2fs: remove entire rb_entry sharing This is a last part to remove the memory sharing for rb_tree in extent_cache. This should also fix arm32 memory alignment issue. [struct extent_node] [struct rb_entry] [0] struct rb_node rb_node; [0] struct rb_node rb_node; union { union { struct { struct { [16] unsigned int fofs; [12] unsigned int ofs; unsigned int len; unsigned int len; }; unsigned long long key; } __packed; Cc: <stable@vger.kernel.org> Fixes: `13054c548a` ("f2fs: introduce infra macro and data structure of rb-tree extent cache") Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-03-29 15:17:38 -07:00
Jaegeuk Kim	f69475dd48	f2fs: factor out discard_cmd usage from general rb_tree use This is a second part to remove the mixed use of rb_tree in discard_cmd from extent_cache. This should also fix arm32 memory alignment issue caused by shared rb_entry. [struct discard_cmd] [struct rb_entry] [0] struct rb_node rb_node; [0] struct rb_node rb_node; union { union { struct { struct { [16] block_t lstart; [12] unsigned int ofs; block_t len; unsigned int len; }; unsigned long long key; } __packed; Cc: <stable@vger.kernel.org> Fixes: `004b686218` ("f2fs: use rb-tree to track pending discard commands") Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-03-29 15:17:38 -07:00
Jaegeuk Kim	043d2d00b4	f2fs: factor out victim_entry usage from general rb_tree use Let's reduce the complexity of mixed use of rb_tree in victim_entry from extent_cache and discard_cmd. This should fix arm32 memory alignment issue caused by shared rb_entry. [struct victim_entry] [struct rb_entry] [0] struct rb_node rb_node; [0] struct rb_node rb_node; union { struct { unsigned int ofs; unsigned int len; }; [16] unsigned long long mtime; [12] unsigned long long key; } __packed; Cc: <stable@vger.kernel.org> Fixes: `093749e296` ("f2fs: support age threshold based garbage collection") Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-03-29 15:17:38 -07:00
Yonggil Song	c17caf0ba3	f2fs: fix uninitialized skipped_gc_rwsem When f2fs skipped a gc round during victim migration, there was a bug which would skip all upcoming gc rounds unconditionally because skipped_gc_rwsem was not initialized. It fixes the bug by correctly initializing the skipped_gc_rwsem inside the gc loop. Fixes: `6f8d445506` ("f2fs: avoid fi->i_gc_rwsem[WRITE] lock in f2fs_gc") Signed-off-by: Yonggil Song <yonggil.song@samsung.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-03-29 15:17:38 -07:00
Yangtao Li	8051692f5f	f2fs: handle dqget error in f2fs_transfer_project_quota() We should set the error code when dqget() failed. Fixes: `2c1d030569` ("f2fs: support F2FS_IOC_FS{GET,SET}XATTR") Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-03-29 15:17:37 -07:00
Yangtao Li	447286ebad	f2fs: convert to use bitmap API Let's use BIT() and GENMASK() instead of open it. Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-03-29 15:17:37 -07:00
Yangtao Li	960fa2c828	f2fs: export compress_percent and compress_watermark entries This patch export below sysfs entries for better control cached compress page count. /sys/fs/f2fs/<disk>/compress_watermark /sys/fs/f2fs/<disk>/compress_percent Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-03-29 15:17:37 -07:00
Li Zetao	6063037506	f2fs: make f2fs_sync_inode_meta() static After commit `26b5a07919` ("f2fs: cleanup dirty pages if recover failed"), f2fs_sync_inode_meta() is only used in checkpoint.c, so f2fs_sync_inode_meta() should only be visible inside. Delete the declaration in the header file and change f2fs_sync_inode_meta() to static. Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-03-29 15:17:37 -07:00
Christian Brauner	d549b74174	fs: rename generic posix acl handlers Reflect in their naming and document that they are kept around for legacy reasons and shouldn't be used anymore by new code. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-03-06 09:57:13 +01:00
Christian Brauner	a5488f2983	fs: simplify ->listxattr() implementation The ext{2,4}, erofs, f2fs, and jffs2 filesystems use the same logic to check whether a given xattr can be listed. Simplify them and avoid open-coding the same check by calling the helper we introduced earlier. Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: linux-f2fs-devel@lists.sourceforge.net Cc: linux-erofs@lists.ozlabs.org Cc: linux-ext4@vger.kernel.org Cc: linux-mtd@lists.infradead.org Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-03-06 09:57:12 +01:00
Christian Brauner	0c95c025a0	fs: drop unused posix acl handlers Remove struct posix_acl_{access,default}_handler for all filesystems that don't depend on the xattr handler in their inode->i_op->listxattr() method in any way. There's nothing more to do than to simply remove the handler. It's been effectively unused ever since we introduced the new posix acl api. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-03-06 09:57:12 +01:00
Linus Torvalds	103830683c	f2fs-for-6.3-rc1 In this round, we've got a huge number of patches that improve code readability along with minor bug fixes, while we've mainly fixed some critical issues in recently-added per-block age-based extent_cache, atomic write support, and some folio cases. Enhancement: - add sysfs nodes to set last_age_weight and manage discard_io_aware_gran - show ipu policy in debugfs - reduce stack memory cost by using bitfield in struct f2fs_io_info - introduce trace_f2fs_replace_atomic_write_block - enhance iostat support and adds flush commands Bug fix: - revert "f2fs: truncate blocks in batch in __complete_revoke_list()" - fix kernel crash on the atomic write abort flow - call clear_page_private_reference in .{release,invalid}_folio - support .migrate_folio for compressed inode - fix cgroup writeback accounting with fs-layer encryption - retry to update the inode page given data corruption - fix kernel crash due to null io->bio - fix some bugs in per-block age-based extent_cache: a. wrong calculation of block age b. update age extent in f2fs_do_zero_range() c. update age extent correctly during truncation -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAmP9M/cACgkQQBSofoJI UNIX1Q//Yp+nDeY91H3IO6aMSHPqRoBBnVTr8ERtLUF0fuQbBkzcQE+t8cMSoYDM 88sxoC+F7UnovNr84VeuKlHN4hYyAXuxj8OZetXI3XX+yiO+auEPtljGA0BaGwqL 93lIg8nIQ2ing/oZ/4+h4dvpYCPrhKOQS6h1sHhIWlql6Wxwxq01uA47i0Ni6y/o D23JPFaDQfumN8qy1bm5xfLRhTQmaE35n5NhcBJpUD/rGK92NPXv7RLKPgZc3tKN tJmL+NXa3NNEx5e5TSP7JX+rhD7KlL5XlB/m8LbpPIx338I0pt0uzAc7nTIOlzM3 DI0q3HXe9U3+JBHi+rKxkIniiRDmvhPx3NzdgcsYg75EnwdrNazsRHulBUEAXB4v ghHbx53OxA2uSnUVbkY4HXNYf7cYjrx5vbX0oqEx48btBCC8KGFIcHtI72tIBee3 xdCxoM3e2AWijkFBOCkThXNuNNbdifQzn2e7xR7W+o0L9hwdR5t7tHhHT+cqG9Ox 6UKWoZZUjYUAV/YQT5Qh6570GsGncM8gHAUMz7DVTIOB9wYkHtb0Q9tUmoQccwf5 46r84c4/jxUQHt8SkXBBl1aiqHmR7EF17YvM5AXIdG3DqqOy70NWkM48hMOyIy2G eY/wTVJwpd6oDveQPtxaHn7Wo3jSPNUlidOfAYQ7Itm34VXY/zc= =yXLl -----END PGP SIGNATURE----- Merge tag 'f2fs-for-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this round, we've got a huge number of patches that improve code readability along with minor bug fixes, while we've mainly fixed some critical issues in recently-added per-block age-based extent_cache, atomic write support, and some folio cases. Enhancements: - add sysfs nodes to set last_age_weight and manage discard_io_aware_gran - show ipu policy in debugfs - reduce stack memory cost by using bitfield in struct f2fs_io_info - introduce trace_f2fs_replace_atomic_write_block - enhance iostat support and adds flush commands Bug fixes: - revert "f2fs: truncate blocks in batch in __complete_revoke_list()" - fix kernel crash on the atomic write abort flow - call clear_page_private_reference in .{release,invalid}_folio - support .migrate_folio for compressed inode - fix cgroup writeback accounting with fs-layer encryption - retry to update the inode page given data corruption - fix kernel crash due to NULL io->bio - fix some bugs in per-block age-based extent_cache: - wrong calculation of block age - update age extent in f2fs_do_zero_range() - update age extent correctly during truncation" * tag 'f2fs-for-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (81 commits) f2fs: drop unnecessary arg for f2fs_ioc_*() f2fs: Revert "f2fs: truncate blocks in batch in __complete_revoke_list()" f2fs: synchronize atomic write aborts f2fs: fix wrong segment count f2fs: replace si->sbi w/ sbi in stat_show() f2fs: export ipu policy in debugfs f2fs: make kobj_type structures constant f2fs: fix to do sanity check on extent cache correctly f2fs: add missing description for ipu_policy node f2fs: fix to set ipu policy f2fs: fix typos in comments f2fs: fix kernel crash due to null io->bio f2fs: use iostat_lat_type directly as a parameter in the iostat_update_and_unbind_ctx() f2fs: add sysfs nodes to set last_age_weight f2fs: fix f2fs_show_options to show nogc_merge mount option f2fs: fix cgroup writeback accounting with fs-layer encryption f2fs: fix wrong calculation of block age f2fs: fix to update age extent in f2fs_do_zero_range() f2fs: fix to update age extent correctly during truncation f2fs: fix to avoid potential memory corruption in __update_iostat_latency() ...	2023-02-27 16:18:51 -08:00
Linus Torvalds	3822a7c409	- Daniel Verkamp has contributed a memfd series ("mm/memfd: add F_SEAL_EXEC") which permits the setting of the memfd execute bit at memfd creation time, with the option of sealing the state of the X bit. - Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset() thread-safe for pmd unshare") which addresses a rare race condition related to PMD unsharing. - Several folioification patch serieses from Matthew Wilcox, Vishal Moola, Sidhartha Kumar and Lorenzo Stoakes - Johannes Weiner has a series ("mm: push down lock_page_memcg()") which does perform some memcg maintenance and cleanup work. - SeongJae Park has added DAMOS filtering to DAMON, with the series "mm/damon/core: implement damos filter". These filters provide users with finer-grained control over DAMOS's actions. SeongJae has also done some DAMON cleanup work. - Kairui Song adds a series ("Clean up and fixes for swap"). - Vernon Yang contributed the series "Clean up and refinement for maple tree". - Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It adds to MGLRU an LRU of memcgs, to improve the scalability of global reclaim. - David Hildenbrand has added some userfaultfd cleanup work in the series "mm: uffd-wp + change_protection() cleanups". - Christoph Hellwig has removed the generic_writepages() library function in the series "remove generic_writepages". - Baolin Wang has performed some maintenance on the compaction code in his series "Some small improvements for compaction". - Sidhartha Kumar is doing some maintenance work on struct page in his series "Get rid of tail page fields". - David Hildenbrand contributed some cleanup, bugfixing and generalization of pte management and of pte debugging in his series "mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap PTEs". - Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation flag in the series "Discard __GFP_ATOMIC". - Sergey Senozhatsky has improved zsmalloc's memory utilization with his series "zsmalloc: make zspage chain size configurable". - Joey Gouly has added prctl() support for prohibiting the creation of writeable+executable mappings. The previous BPF-based approach had shortcomings. See "mm: In-kernel support for memory-deny-write-execute (MDWE)". - Waiman Long did some kmemleak cleanup and bugfixing in the series "mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF". - T.J. Alumbaugh has contributed some MGLRU cleanup work in his series "mm: multi-gen LRU: improve". - Jiaqi Yan has provided some enhancements to our memory error statistics reporting, mainly by presenting the statistics on a per-node basis. See the series "Introduce per NUMA node memory error statistics". - Mel Gorman has a second and hopefully final shot at fixing a CPU-hog regression in compaction via his series "Fix excessive CPU usage during compaction". - Christoph Hellwig does some vmalloc maintenance work in the series "cleanup vfree and vunmap". - Christoph Hellwig has removed block_device_operations.rw_page() in ths series "remove ->rw_page". - We get some maple_tree improvements and cleanups in Liam Howlett's series "VMA tree type safety and remove __vma_adjust()". - Suren Baghdasaryan has done some work on the maintainability of our vm_flags handling in the series "introduce vm_flags modifier functions". - Some pagemap cleanup and generalization work in Mike Rapoport's series "mm, arch: add generic implementation of pfn_valid() for FLATMEM" and "fixups for generic implementation of pfn_valid()" - Baoquan He has done some work to make /proc/vmallocinfo and /proc/kcore better represent the real state of things in his series "mm/vmalloc.c: allow vread() to read out vm_map_ram areas". - Jason Gunthorpe rationalized the GUP system's interface to the rest of the kernel in the series "Simplify the external interface for GUP". - SeongJae Park wishes to migrate people from DAMON's debugfs interface over to its sysfs interface. To support this, we'll temporarily be printing warnings when people use the debugfs interface. See the series "mm/damon: deprecate DAMON debugfs interface". - Andrey Konovalov provided the accurately named "lib/stackdepot: fixes and clean-ups" series. - Huang Ying has provided a dramatic reduction in migration's TLB flush IPI rates with the series "migrate_pages(): batch TLB flushing". - Arnd Bergmann has some objtool fixups in "objtool warning fixes". -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY/PoPQAKCRDdBJ7gKXxA jlvpAPsFECUBBl20qSue2zCYWnHC7Yk4q9ytTkPB/MMDrFEN9wD/SNKEm2UoK6/K DmxHkn0LAitGgJRS/W9w81yrgig9tAQ= =MlGs -----END PGP SIGNATURE----- Merge tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Daniel Verkamp has contributed a memfd series ("mm/memfd: add F_SEAL_EXEC") which permits the setting of the memfd execute bit at memfd creation time, with the option of sealing the state of the X bit. - Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset() thread-safe for pmd unshare") which addresses a rare race condition related to PMD unsharing. - Several folioification patch serieses from Matthew Wilcox, Vishal Moola, Sidhartha Kumar and Lorenzo Stoakes - Johannes Weiner has a series ("mm: push down lock_page_memcg()") which does perform some memcg maintenance and cleanup work. - SeongJae Park has added DAMOS filtering to DAMON, with the series "mm/damon/core: implement damos filter". These filters provide users with finer-grained control over DAMOS's actions. SeongJae has also done some DAMON cleanup work. - Kairui Song adds a series ("Clean up and fixes for swap"). - Vernon Yang contributed the series "Clean up and refinement for maple tree". - Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It adds to MGLRU an LRU of memcgs, to improve the scalability of global reclaim. - David Hildenbrand has added some userfaultfd cleanup work in the series "mm: uffd-wp + change_protection() cleanups". - Christoph Hellwig has removed the generic_writepages() library function in the series "remove generic_writepages". - Baolin Wang has performed some maintenance on the compaction code in his series "Some small improvements for compaction". - Sidhartha Kumar is doing some maintenance work on struct page in his series "Get rid of tail page fields". - David Hildenbrand contributed some cleanup, bugfixing and generalization of pte management and of pte debugging in his series "mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap PTEs". - Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation flag in the series "Discard __GFP_ATOMIC". - Sergey Senozhatsky has improved zsmalloc's memory utilization with his series "zsmalloc: make zspage chain size configurable". - Joey Gouly has added prctl() support for prohibiting the creation of writeable+executable mappings. The previous BPF-based approach had shortcomings. See "mm: In-kernel support for memory-deny-write-execute (MDWE)". - Waiman Long did some kmemleak cleanup and bugfixing in the series "mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF". - T.J. Alumbaugh has contributed some MGLRU cleanup work in his series "mm: multi-gen LRU: improve". - Jiaqi Yan has provided some enhancements to our memory error statistics reporting, mainly by presenting the statistics on a per-node basis. See the series "Introduce per NUMA node memory error statistics". - Mel Gorman has a second and hopefully final shot at fixing a CPU-hog regression in compaction via his series "Fix excessive CPU usage during compaction". - Christoph Hellwig does some vmalloc maintenance work in the series "cleanup vfree and vunmap". - Christoph Hellwig has removed block_device_operations.rw_page() in ths series "remove ->rw_page". - We get some maple_tree improvements and cleanups in Liam Howlett's series "VMA tree type safety and remove __vma_adjust()". - Suren Baghdasaryan has done some work on the maintainability of our vm_flags handling in the series "introduce vm_flags modifier functions". - Some pagemap cleanup and generalization work in Mike Rapoport's series "mm, arch: add generic implementation of pfn_valid() for FLATMEM" and "fixups for generic implementation of pfn_valid()" - Baoquan He has done some work to make /proc/vmallocinfo and /proc/kcore better represent the real state of things in his series "mm/vmalloc.c: allow vread() to read out vm_map_ram areas". - Jason Gunthorpe rationalized the GUP system's interface to the rest of the kernel in the series "Simplify the external interface for GUP". - SeongJae Park wishes to migrate people from DAMON's debugfs interface over to its sysfs interface. To support this, we'll temporarily be printing warnings when people use the debugfs interface. See the series "mm/damon: deprecate DAMON debugfs interface". - Andrey Konovalov provided the accurately named "lib/stackdepot: fixes and clean-ups" series. - Huang Ying has provided a dramatic reduction in migration's TLB flush IPI rates with the series "migrate_pages(): batch TLB flushing". - Arnd Bergmann has some objtool fixups in "objtool warning fixes". * tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (505 commits) include/linux/migrate.h: remove unneeded externs mm/memory_hotplug: cleanup return value handing in do_migrate_range() mm/uffd: fix comment in handling pte markers mm: change to return bool for isolate_movable_page() mm: hugetlb: change to return bool for isolate_hugetlb() mm: change to return bool for isolate_lru_page() mm: change to return bool for folio_isolate_lru() objtool: add UACCESS exceptions for __tsan_volatile_read/write kmsan: disable ftrace in kmsan core code kasan: mark addr_has_metadata __always_inline mm: memcontrol: rename memcg_kmem_enabled() sh: initialize max_mapnr m68k/nommu: add missing definition of ARCH_PFN_OFFSET mm: percpu: fix incorrect size in pcpu_obj_full_size() maple_tree: reduce stack usage with gcc-9 and earlier mm: page_alloc: call panic() when memoryless node allocation fails mm: multi-gen LRU: avoid futile retries migrate_pages: move THP/hugetlb migration support check to simplify code migrate_pages: batch flushing TLB migrate_pages: share more code between _unmap and _move ...	2023-02-23 17:09:35 -08:00
Linus Torvalds	6639c3ce7f	fsverity updates for 6.3 Fix the longstanding implementation limitation that fsverity was only supported when the Merkle tree block size, filesystem block size, and PAGE_SIZE were all equal. Specifically, add support for Merkle tree block sizes less than PAGE_SIZE, and make ext4 support fsverity on filesystems where the filesystem block size is less than PAGE_SIZE. Effectively, this means that fsverity can now be used on systems with non-4K pages, at least on ext4. These changes have been tested using the verity group of xfstests, newly updated to cover the new code paths. Also update fs/verity/ to support verifying data from large folios. There's also a similar patch for fs/crypto/, to support decrypting data from large folios, which I'm including in this pull request to avoid a merge conflict between the fscrypt and fsverity branches. There will be a merge conflict in fs/buffer.c with some of the foliation work in the mm tree. Please use the merge resolution from linux-next. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCY/KJtRQcZWJpZ2dlcnNA Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK/A/AP0RUlCClBRuHwXPRG0we8R1L153ga4s Vl+xRpCr+SswXwEAiOEpYN5cXoVKzNgxbEXo2pQzxi5lrpjZgUI6CL3DuQs= =ZRFX -----END PGP SIGNATURE----- Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux Pull fsverity updates from Eric Biggers: "Fix the longstanding implementation limitation that fsverity was only supported when the Merkle tree block size, filesystem block size, and PAGE_SIZE were all equal. Specifically, add support for Merkle tree block sizes less than PAGE_SIZE, and make ext4 support fsverity on filesystems where the filesystem block size is less than PAGE_SIZE. Effectively, this means that fsverity can now be used on systems with non-4K pages, at least on ext4. These changes have been tested using the verity group of xfstests, newly updated to cover the new code paths. Also update fs/verity/ to support verifying data from large folios. There's also a similar patch for fs/crypto/, to support decrypting data from large folios, which I'm including in here to avoid a merge conflict between the fscrypt and fsverity branches" * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux: fscrypt: support decrypting data from large folios fsverity: support verifying data from large folios fsverity.rst: update git repo URL for fsverity-utils ext4: allow verity with fs block size < PAGE_SIZE fs/buffer.c: support fsverity in block_read_full_folio() f2fs: simplify f2fs_readpage_limit() ext4: simplify ext4_readpage_limit() fsverity: support enabling with tree block size < PAGE_SIZE fsverity: support verification with tree block size < PAGE_SIZE fsverity: replace fsverity_hash_page() with fsverity_hash_block() fsverity: use EFBIG for file too large to enable verity fsverity: store log2(digest_size) precomputed fsverity: simplify Merkle tree readahead size calculation fsverity: use unsigned long for level_start fsverity: remove debug messages and CONFIG_FS_VERITY_DEBUG fsverity: pass pos and size to ->write_merkle_tree_block fsverity: optimize fsverity_cleanup_inode() on non-verity files fsverity: optimize fsverity_prepare_setattr() on non-verity files fsverity: optimize fsverity_file_open() on non-verity files	2023-02-20 12:33:41 -08:00
Linus Torvalds	f18f9845f2	fscrypt updates for 6.3 Simplify the implementation of the test_dummy_encryption mount option by adding the "test dummy key" on-demand. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCY/J7NRQcZWJpZ2dlcnNA Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK3UVAP9Likiuy47D/RM4mOsPMwLAlQRx5uW6 iGxT6DutekA7DwEA4hNjEQQ/EKO+UxFb+fBCX+xpTDbS3LB7CxGsqHzZJQM= =SiNJ -----END PGP SIGNATURE----- Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux Pull fscrypt updates from Eric Biggers: "Simplify the implementation of the test_dummy_encryption mount option by adding the 'test dummy key' on-demand" * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux: fscrypt: clean up fscrypt_add_test_dummy_key() fs/super.c: stop calling fscrypt_destroy_keyring() from __put_super() f2fs: stop calling fscrypt_add_test_dummy_key() ext4: stop calling fscrypt_add_test_dummy_key() fscrypt: add the test dummy encryption key on-demand	2023-02-20 12:29:27 -08:00
Linus Torvalds	05e6295f7b	fs.idmapped.v6.3 -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCY+5NlQAKCRCRxhvAZXjc orOaAP9i2h3OJy95nO2Fpde0Bt2UT+oulKCCcGlvXJ8/+TQpyQD/ZQq47gFQ0EAz Br5NxeyGeecAb0lHpFz+CpLGsxMrMwQ= =+BG5 -----END PGP SIGNATURE----- Merge tag 'fs.idmapped.v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping Pull vfs idmapping updates from Christian Brauner: - Last cycle we introduced the dedicated struct mnt_idmap type for mount idmapping and the required infrastucture in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). As promised in last cycle's pull request message this converts everything to rely on struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevant on the mount level. Especially for non-vfs developers without detailed knowledge in this area this was a potential source for bugs. This finishes the conversion. Instead of passing the plain namespace around this updates all places that currently take a pointer to a mnt_userns with a pointer to struct mnt_idmap. Now that the conversion is done all helpers down to the really low-level helpers only accept a struct mnt_idmap argument instead of two namespace arguments. Conflating mount and other idmappings will now cause the compiler to complain loudly thus eliminating the possibility of any bugs. This makes it impossible for filesystem developers to mix up mount and filesystem idmappings as they are two distinct types and require distinct helpers that cannot be used interchangeably. Everything associated with struct mnt_idmap is moved into a single separate file. With that change no code can poke around in struct mnt_idmap. It can only be interacted with through dedicated helpers. That means all filesystems are and all of the vfs is completely oblivious to the actual implementation of idmappings. We are now also able to extend struct mnt_idmap as we see fit. For example, we can decouple it completely from namespaces for users that don't require or don't want to use them at all. We can also extend the concept of idmappings so we can cover filesystem specific requirements. In combination with the vfs{g,u}id_t work we finished in v6.2 this makes this feature substantially more robust and thus difficult to implement wrong by a given filesystem and also protects the vfs. - Enable idmapped mounts for tmpfs and fulfill a longstanding request. A long-standing request from users had been to make it possible to create idmapped mounts for tmpfs. For example, to share the host's tmpfs mount between multiple sandboxes. This is a prerequisite for some advanced Kubernetes cases. Systemd also has a range of use-cases to increase service isolation. And there are more users of this. However, with all of the other work going on this was way down on the priority list but luckily someone other than ourselves picked this up. As usual the patch is tiny as all the infrastructure work had been done multiple kernel releases ago. In addition to all the tests that we already have I requested that Rodrigo add a dedicated tmpfs testsuite for idmapped mounts to xfstests. It is to be included into xfstests during the v6.3 development cycle. This should add a slew of additional tests. * tag 'fs.idmapped.v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: (26 commits) shmem: support idmapped mounts for tmpfs fs: move mnt_idmap fs: port vfs{g,u}id helpers to mnt_idmap fs: port fs{g,u}id helpers to mnt_idmap fs: port i_{g,u}id_into_vfs{g,u}id() to mnt_idmap fs: port i_{g,u}id_{needs_}update() to mnt_idmap quota: port to mnt_idmap fs: port privilege checking helpers to mnt_idmap fs: port inode_owner_or_capable() to mnt_idmap fs: port inode_init_owner() to mnt_idmap fs: port acl to mnt_idmap fs: port xattr to mnt_idmap fs: port ->permission() to pass mnt_idmap fs: port ->fileattr_set() to pass mnt_idmap fs: port ->set_acl() to pass mnt_idmap fs: port ->get_acl() to pass mnt_idmap fs: port ->tmpfile() to pass mnt_idmap fs: port ->rename() to pass mnt_idmap fs: port ->mknod() to pass mnt_idmap fs: port ->mkdir() to pass mnt_idmap ...	2023-02-20 11:53:11 -08:00
Yangtao Li	ddf1eca4fc	f2fs: drop unnecessary arg for f2fs_ioc_*() They are not used, let's remove them. Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-15 09:49:04 -08:00
Jaegeuk Kim	c7dbc06688	f2fs: Revert "f2fs: truncate blocks in batch in __complete_revoke_list()" We should not truncate replaced blocks, and were supposed to truncate the first part as well. This reverts commit `78a99fe625`. Cc: stable@vger.kernel.org Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-15 09:48:28 -08:00
Daeho Jeong	a46bebd502	f2fs: synchronize atomic write aborts To fix a race condition between atomic write aborts, I use the inode lock and make COW inode to be re-usable thoroughout the whole atomic file inode lifetime. Reported-by: syzbot+823000d23b3400619f7c@syzkaller.appspotmail.com Fixes: `3db1de0e58` ("f2fs: change the current atomic write way") Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-14 10:08:59 -08:00
Jaegeuk Kim	7e986855fe	f2fs: fix wrong segment count MAIN_SEGS is for data area, while TOTAL_SEGS includes data and metadata. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-14 10:07:59 -08:00
Yangtao Li	dda7d77bcd	f2fs: replace si->sbi w/ sbi in stat_show() For each loop add a local f2fs_sb_info pointer insted of looking it up. Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-14 10:07:55 -08:00
Yangtao Li	f2e357893c	f2fs: export ipu policy in debugfs Export ipu_policy as a string in debugfs for better readability and it can help us better understand some strategies of the file system. Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-14 10:07:45 -08:00
Thomas Weißschuh	273a51e552	f2fs: make kobj_type structures constant Since commit `ee6d3dd4ed` ("driver core: make kobj_type constant.") the driver core allows the usage of const struct kobj_type. Take advantage of this to constify the structure definitions to prevent modification at runtime. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-13 09:52:47 -08:00
Chao Yu	269d119481	f2fs: fix to do sanity check on extent cache correctly In do_read_inode(), sanity check for extent cache should be called after f2fs_init_read_extent_tree(), fix it. Fixes: `72840cccc0` ("f2fs: allocate the extent_cache by default") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-09 09:17:30 -08:00
Eric Biggers	1ad2a62676	f2fs: stop calling fscrypt_add_test_dummy_key() Now that fs/crypto/ adds the test dummy encryption key on-demand when it's needed, there's no need for individual filesystems to call fscrypt_add_test_dummy_key(). Remove the call to it from f2fs. Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20230208062107.199831-4-ebiggers@kernel.org	2023-02-07 22:30:30 -08:00
Yangtao Li	c5bf834833	f2fs: fix to set ipu policy For LFS mode, it should update outplace and no need inplace update. When using LFS mode for small-volume devices, IPU will not be used, and the OPU writing method is actually used, but F2FS_IPU_FORCE can be read from the ipu_policy node, which is different from the actual situation. And remount to lfs mode should be disallowed when f2fs ipu is enabled, let's fix it. Fixes: `84b89e5d94` ("f2fs: add auto tuning for small devices") Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-07 10:45:00 -08:00
Jinyoung CHOI	146949defd	f2fs: fix typos in comments This patch is to fix typos in f2fs files. Signed-off-by: Jinyoung Choi <j-young.choi@samsung.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-07 10:39:28 -08:00
Jaegeuk Kim	267c159f9c	f2fs: fix kernel crash due to null io->bio We should return when io->bio is null before doing anything. Otherwise, panic. BUG: kernel NULL pointer dereference, address: 0000000000000010 RIP: 0010:__submit_merged_write_cond+0x164/0x240 [f2fs] Call Trace: <TASK> f2fs_submit_merged_write+0x1d/0x30 [f2fs] commit_checkpoint+0x110/0x1e0 [f2fs] f2fs_write_checkpoint+0x9f7/0xf00 [f2fs] ? __pfx_issue_checkpoint_thread+0x10/0x10 [f2fs] __checkpoint_and_complete_reqs+0x84/0x190 [f2fs] ? preempt_count_add+0x82/0xc0 ? __pfx_issue_checkpoint_thread+0x10/0x10 [f2fs] issue_checkpoint_thread+0x4c/0xf0 [f2fs] ? __pfx_autoremove_wake_function+0x10/0x10 kthread+0xff/0x130 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2c/0x50 </TASK> Cc: stable@vger.kernel.org # v5.18+ Fixes: `64bf0eef01` ("f2fs: pass the bio operation to bio_alloc_bioset") Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-07 10:39:28 -08:00
Yangtao Li	d9bac032ac	f2fs: use iostat_lat_type directly as a parameter in the iostat_update_and_unbind_ctx() Convert to use iostat_lat_type as parameter instead of raw number. BTW, move NUM_PREALLOC_IOSTAT_CTXS to the header file, adjust iostat_lat[{0,1,2}] to iostat_lat[{READ_IO,WRITE_SYNC_IO,WRITE_ASYNC_IO}] in tracepoint function, and rename iotype to page_type to match the definition. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-07 10:39:28 -08:00
qixiaoyu1	d23be468ea	f2fs: add sysfs nodes to set last_age_weight Signed-off-by: qixiaoyu1 <qixiaoyu1@xiaomi.com> Signed-off-by: xiongping1 <xiongping1@xiaomi.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-07 10:39:19 -08:00
Yangtao Li	04d7a7ae43	f2fs: fix f2fs_show_options to show nogc_merge mount option Commit `5911d2d1d1` ("f2fs: introduce gc_merge mount option") forgot to show nogc_merge option, let's fix it. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-05 19:38:28 -08:00
Eric Biggers	844545c51a	f2fs: fix cgroup writeback accounting with fs-layer encryption When writing a page from an encrypted file that is using filesystem-layer encryption (not inline encryption), f2fs encrypts the pagecache page into a bounce page, then writes the bounce page. It also passes the bounce page to wbc_account_cgroup_owner(). That's incorrect, because the bounce page is a newly allocated temporary page that doesn't have the memory cgroup of the original pagecache page. This makes wbc_account_cgroup_owner() not account the I/O to the owner of the pagecache page as it should. Fix this by always passing the pagecache page to wbc_account_cgroup_owner(). Fixes: `578c647879` ("f2fs: implement cgroup writeback support") Cc: stable@vger.kernel.org Reported-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-05 19:34:21 -08:00
qixiaoyu1	b03a41a495	f2fs: fix wrong calculation of block age Currently we wrongly calculate the new block age to old * LAST_AGE_WEIGHT / 100. Fix it to new * (100 - LAST_AGE_WEIGHT) / 100 + old * LAST_AGE_WEIGHT / 100. Signed-off-by: qixiaoyu1 <qixiaoyu1@xiaomi.com> Signed-off-by: xiongping1 <xiongping1@xiaomi.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-05 19:30:05 -08:00
Vishal Moola (Oracle)	580e7a4926	f2fs: convert f2fs_sync_meta_pages() to use filemap_get_folios_tag() Convert function to use folios throughout. This is in preparation for the removal of find_get_pages_range_tag(). This change removes 5 calls to compound_head(). Initially the function was checking if the previous page index is truly the previous page i.e. 1 index behind the current page. To convert to folios and maintain this check we need to make the check folio->index != prev + folio_nr_pages(previous folio) since we don't know how many pages are in a folio. At index i == 0 the check is guaranteed to succeed, so to workaround indexing bounds we can simply ignore the check for that specific index. This makes the initial assignment of prev trivial, so I removed that as well. Also modify a comment in commit_checkpoint for consistency. Link: https://lkml.kernel.org/r/20230104211448.4804-17-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:16 -08:00
Vishal Moola (Oracle)	4f4a4f0feb	f2fs: convert last_fsync_dnode() to use filemap_get_folios_tag() Convert to use a folio_batch instead of pagevec. This is in preparation for the removal of find_get_pages_range_tag(). Link: https://lkml.kernel.org/r/20230104211448.4804-16-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:16 -08:00
Vishal Moola (Oracle)	1cd98ee747	f2fs: convert f2fs_write_cache_pages() to use filemap_get_folios_tag() Convert the function to use a folio_batch instead of pagevec. This is in preparation for the removal of find_get_pages_range_tag(). Also modified f2fs_all_cluster_page_ready to take in a folio_batch instead of pagevec. This does NOT support large folios. The function currently only utilizes folios of size 1 so this shouldn't cause any issues right now. This version of the patch limits the number of pages fetched to F2FS_ONSTACK_PAGES. If that ever happens, update the start index here since filemap_get_folios_tag() updates the index to be after the last found folio, not necessarily the last used page. Link: https://lkml.kernel.org/r/20230104211448.4804-15-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:16 -08:00
Vishal Moola (Oracle)	7525486aff	f2fs: convert f2fs_sync_node_pages() to use filemap_get_folios_tag() Convert function to use a folio_batch instead of pagevec. This is in preparation for the removal of find_get_pages_range_tag(). Link: https://lkml.kernel.org/r/20230104211448.4804-14-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:16 -08:00
Vishal Moola (Oracle)	a40a4ad118	f2fs: convert f2fs_flush_inline_data() to use filemap_get_folios_tag() Convert function to use a folio_batch instead of pagevec. This is in preparation for the removal of find_get_pages_tag(). Link: https://lkml.kernel.org/r/20230104211448.4804-13-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:15 -08:00
Vishal Moola (Oracle)	e6e46e1eb7	f2fs: convert f2fs_fsync_node_pages() to use filemap_get_folios_tag() Convert function to use a folio_batch instead of pagevec. This is in preparation for the removal of find_get_pages_range_tag(). Link: https://lkml.kernel.org/r/20230104211448.4804-12-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:15 -08:00
Chao Yu	a84153f939	f2fs: fix to update age extent in f2fs_do_zero_range() We should update age extent in f2fs_do_zero_range() like we did in f2fs_truncate_data_blocks_range(). Fixes: `71644dff48` ("f2fs: add block_age-based extent cache") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-02 13:37:17 -08:00
Chao Yu	8c0ed062ce	f2fs: fix to update age extent correctly during truncation nr_free may be less than len, we should update age extent cache w/ range [fofs, len] rather than [fofs, nr_free]. Fixes: `71644dff48` ("f2fs: add block_age-based extent cache") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-02 13:37:17 -08:00
Yangtao Li	0dbbf0fb38	f2fs: fix to avoid potential memory corruption in __update_iostat_latency() Add iotype sanity check to avoid potential memory corruption. This is to fix the compile error below: fs/f2fs/iostat.c:231 __update_iostat_latency() error: buffer overflow 'io_lat->peak_lat[type]' 3 <= 3 vim +228 fs/f2fs/iostat.c 211 static inline void __update_iostat_latency(struct bio_iostat_ctx iostat_ctx, 212 enum iostat_lat_type type) 213 { 214 unsigned long ts_diff; 215 unsigned int page_type = iostat_ctx->type; 216 struct f2fs_sb_info sbi = iostat_ctx->sbi; 217 struct iostat_lat_info *io_lat = sbi->iostat_io_lat; 218 unsigned long flags; 219 220 if (!sbi->iostat_enable) 221 return; 222 223 ts_diff = jiffies - iostat_ctx->submit_ts; 224 if (page_type >= META_FLUSH) ^^^^^^^^^^ 225 page_type = META; 226 227 spin_lock_irqsave(&sbi->iostat_lat_lock, flags); @228 io_lat->sum_lat[type][page_type] += ts_diff; ^^^^^^^^^ Mixup between META_FLUSH and NR_PAGE_TYPE leads to memory corruption. Fixes: `a4b6817625` ("f2fs: introduce periodic iostat io latency traces") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Suggested-by: Chao Yu <chao@kernel.org> Suggested-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-02 13:37:17 -08:00
Jaegeuk Kim	3aa51c61cb	f2fs: retry to update the inode page given data corruption If the storage gives a corrupted node block due to short power failure and reset, f2fs stops the entire operations by setting the checkpoint failure flag. Let's give more chances to live by re-issuing IOs for a while in such critical path. Cc: stable@vger.kernel.org Suggested-by: Randall Huang <huangrandall@google.com> Suggested-by: Chao Yu <chao@kernel.org> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-02 13:37:17 -08:00
Chao Yu	933141e4eb	f2fs: fix to handle F2FS_IOC_START_ATOMIC_REPLACE in f2fs_compat_ioctl() Otherwise, 32-bits binary call ioctl(F2FS_IOC_START_ATOMIC_REPLACE) will fail in 64-bits kernel. Fixes: `41e8f85a75` ("f2fs: introduce F2FS_IOC_START_ATOMIC_REPLACE") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-02 13:37:17 -08:00
Chao Yu	b90e5086df	f2fs: clean up i_compress_flag and i_compress_level usage .i_compress_level was introduced by commit `3fde13f817` ("f2fs: compress: support compress level"), but never be used. This patch updates as below: - load high 8-bits of on-disk .i_compress_flag to in-memory .i_compress_level - load low 8-bits of on-disk .i_compress_flag to in-memory .i_compress_flag - change type of in-memory .i_compress_flag from unsigned short to unsigned char. w/ above changes, we can avoid unneeded bit shift whenever during .init_compress_ctx(), and shrink size of struct f2fs_inode_info. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-02 13:37:17 -08:00
Chao Yu	2eae077e6e	f2fs: reduce stack memory cost by using bitfield in struct f2fs_io_info This patch tries to use bitfield in struct f2fs_io_info to improve memory usage. struct f2fs_io_info { ... unsigned int need_lock:8; /* indicate we need to lock cp_rwsem / unsigned int version:8; / version of the node / unsigned int submitted:1; / indicate IO submission / unsigned int in_list:1; / indicate fio is in io_list / unsigned int is_por:1; / indicate IO is from recovery or not / unsigned int retry:1; / need to reallocate block address / unsigned int encrypted:1; / indicate file is encrypted / unsigned int post_read:1; / require post read */ ... }; After this patch, size of struct f2fs_io_info reduces from 136 to 120. [Nathan: fix a compile warning (single-bit-bitfield-constant-conversion)] Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-02 13:37:16 -08:00
Christoph Hellwig	a28bca0f47	f2fs: factor the read/write tracing logic into a helper Factor the logic to log a path for reads and writs into a helper shared between the read_iter and write_iter methods. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-02 13:37:16 -08:00
Christoph Hellwig	88c9edfd3c	f2fs: remove __has_curseg_space Just open code the logic in the only caller, where it is more obvious. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-02 13:37:16 -08:00
Christoph Hellwig	4a20958873	f2fs: refactor next blk selection Remove __refresh_next_blkoff by opencoding the SSR vs LFS segment check in the only caller, and then add helpers for SSR block selection and blkoff randomization instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-02 13:37:16 -08:00
Christoph Hellwig	dede3525ed	f2fs: remove __allocate_new_section Just fold this trivial wrapper into the only caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-02 13:37:16 -08:00
Christoph Hellwig	2df79573ef	f2fs: refactor __allocate_new_segment Simplify the check whether to allocate a new segment or reuse an open one. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-02 13:37:16 -08:00
Christoph Hellwig	6392e9ff8b	f2fs: add a f2fs_curseg_valid_blocks helper Add a helper to return the valid blocks on log and SSR segments, and replace the last two uses of curseg_blkoff with it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-02 13:37:15 -08:00
Christoph Hellwig	5a4fed7cd9	f2fs: simplify do_checkpoint For each loop add a local curseg_info pointer insted of looking it up for each of the three fields. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-02-02 13:37:14 -08:00
Christoph Hellwig	2163a691c5	f2fs: remove __add_sum_entry This function just assigns a summary entry. This can be done entirely typesafe with an open code struct assignment that relies on array indexing. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-31 10:49:30 -08:00
Chao Yu	ae267fc1cf	f2fs: fix to abort atomic write only during do_exist() Commit `7a10f0177e` ("f2fs: don't give partially written atomic data from process crash") attempted to drop atomic write data after process crash, however, f2fs_abort_atomic_write() may be called from noncrash case, fix it by adding missed PF_EXITING check condition f2fs_file_flush(). - application crashs - do_exit - exit_signals -- sets PF_EXITING - exit_files - put_files_struct - close_files - filp_close - flush (f2fs_file_flush) - check atomic_write_task && PF_EXITING - f2fs_abort_atomic_write Fixes: `7a10f0177e` ("f2fs: don't give partially written atomic data from process crash") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-31 10:48:13 -08:00
Yangtao Li	e6261beb0c	f2fs: allow set compression option of files without blocks Files created by truncate have a size but no blocks, so they can be allowed to set compression option. Fixes: `e1e8debec6` ("f2fs: add F2FS_IOC_SET_COMPRESS_OPTION ioctl") Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-31 10:47:46 -08:00
Eric Biggers	9a5571cff4	f2fs: fix information leak in f2fs_move_inline_dirents() When converting an inline directory to a regular one, f2fs is leaking uninitialized memory to disk because it doesn't initialize the entire directory block. Fix this by zero-initializing the block. This bug was introduced by commit `4ec17d688d` ("f2fs: avoid unneeded initializing when converting inline dentry"), which didn't consider the security implications of leaking uninitialized memory to disk. This was found by running xfstest generic/435 on a KMSAN-enabled kernel. Fixes: `4ec17d688d` ("f2fs: avoid unneeded initializing when converting inline dentry") Cc: <stable@vger.kernel.org> # v4.3+ Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-31 10:47:46 -08:00
Alexander Potapenko	b1b9896718	fs: f2fs: initialize fsdata in pagecache_write() When aops->write_begin() does not initialize fsdata, KMSAN may report an error passing the latter to aops->write_end(). Fix this by unconditionally initializing fsdata. Suggested-by: Eric Biggers <ebiggers@kernel.org> Fixes: `95ae251fe8` ("f2fs: add fs-verity support") Signed-off-by: Alexander Potapenko <glider@google.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-31 10:47:45 -08:00
Yangtao Li	9b13a8662e	f2fs: fix to check warm_data_age_threshold hot_data_age_threshold is a non-zero positive number, and condition 2 includes condition 1, so there is no need to additionally judge whether t is 0. And let's remove it. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-31 10:47:45 -08:00
Yangtao Li	b1c5ef26e4	f2fs: return true if all cmd were issued or no cmd need to be issued for f2fs_issue_discard_timeout() f2fs_issue_discard_timeout() returns whether discard cmds are dropped, which does not match the meaning of the function. Let's change it to return whether all discard cmd are issued. After commit `4d67490498` ("f2fs: Don't create discard thread when device doesn't support realtime discard"), f2fs_issue_discard_timeout() is alse called by f2fs_remount(). Since the comments of f2fs_issue_discard_timeout() doesn't make much sense, let's update it. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-31 10:47:45 -08:00
Yangtao Li	2381a68556	f2fs: fix to show discard_unit mount opt Convert to show discard_unit only when has DISCARD opt. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-31 10:47:45 -08:00
Chao Yu	d48a7b3a72	f2fs: fix to do sanity check on extent cache correctly In do_read_inode(), sanity_check_inode() should be called after f2fs_init_read_extent_tree(), fix it. Fixes: `72840cccc0` ("f2fs: allocate the extent_cache by default") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-31 10:47:45 -08:00
Chao Yu	71a298ca20	f2fs: remove unneeded f2fs_cp_error() in f2fs_create_whiteout() f2fs_rename() has checked CP_ERROR_FLAG, so remove redundant check in f2fs_create_whiteout(). Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-31 10:47:44 -08:00
Chao Yu	0e8d040bfa	f2fs: clear atomic_write_task in f2fs_abort_atomic_write() Otherwise, last .atomic_write_task will be remained in structure f2fs_inode_info, resulting in aborting atomic_write accidentally in race case. Meanwhile, clear original_i_size as well. Fixes: `7a10f0177e` ("f2fs: don't give partially written atomic data from process crash") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-30 14:46:20 -08:00
Chao Yu	2f3a9ae990	f2fs: introduce trace_f2fs_replace_atomic_write_block Commit `3db1de0e58` ("f2fs: change the current atomic write way") removed old tracepoints, but it missed to add new one, this patch fixes to introduce trace_f2fs_replace_atomic_write_block to trace atomic_write commit flow. Fixes: `3db1de0e58` ("f2fs: change the current atomic write way") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-30 14:46:20 -08:00
Yangtao Li	120e0ea12d	f2fs: introduce discard_io_aware_gran sysfs node The current discard_io_aware_gran is a fixed value, change it to be configurable through the sys node. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-30 13:23:51 -08:00
Yangtao Li	c5f9db2548	f2fs: drop useless initializer and unneeded local variable No need to initialize idx twice. BTW, remove the unnecessary cnt variable. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-30 13:23:51 -08:00
Yangtao Li	193a639fed	f2fs: add iostat support for flush In this patch, it adds to account flush count. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-30 13:23:51 -08:00
Christian Brauner	e67fe63341	fs: port i_{g,u}id_into_vfs{g,u}id() to mnt_idmap Convert to struct mnt_idmap. Remove legacy file_mnt_user_ns() and mnt_user_ns(). Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:29 +01:00
Christian Brauner	0dbe12f2e4	fs: port i_{g,u}id_{needs_}update() to mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:29 +01:00
Christian Brauner	f861646a65	quota: port to mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:29 +01:00
Christian Brauner	9452e93e6d	fs: port privilege checking helpers to mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:29 +01:00
Christian Brauner	01beba7957	fs: port inode_owner_or_capable() to mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:29 +01:00
Christian Brauner	f2d40141d5	fs: port inode_init_owner() to mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:28 +01:00
Christian Brauner	39f60c1cce	fs: port xattr to mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:28 +01:00
Christian Brauner	8782a9aea3	fs: port ->fileattr_set() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:27 +01:00
Christian Brauner	13e83a4923	fs: port ->set_acl() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:27 +01:00
Christian Brauner	011e2b717b	fs: port ->tmpfile() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:27 +01:00
Christian Brauner	e18275ae55	fs: port ->rename() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:26 +01:00
Christian Brauner	5ebb29bee8	fs: port ->mknod() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:26 +01:00
Christian Brauner	c54bd91e9e	fs: port ->mkdir() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:26 +01:00
Christian Brauner	7a77db9551	fs: port ->symlink() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:25 +01:00
Christian Brauner	6c960e68aa	fs: port ->create() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:25 +01:00
Christian Brauner	b74d24f7a7	fs: port ->getattr() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:25 +01:00
Christian Brauner	c1632a0f11	fs: port ->setattr() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in `256c8aed2b` ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-19 09:24:02 +01:00
Christian Brauner	64b4cdf22f	f2fs: project ids aren't idmapped Project ids are only settable filesystem wide in the initial namespace. They don't take the mount's idmapping into account. Note, that after we converted everything over to struct mnt_idmap mistakes such as the one here aren't possible anymore as struct mnt_idmap cannot be passed to functions that operate on k{g,u}ids. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2023-01-18 17:51:45 +01:00
Yangtao Li	7a2b15cfa8	f2fs: support accounting iostat count and avg_bytes Previously, we supported to account iostat io_bytes, in this patch, it adds to account iostat count and avg_bytes: time: 1671648667 io_bytes count avg_bytes [WRITE] app buffered data: 31 2 15 Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-11 11:15:19 -08:00
Yangtao Li	45c98f5a58	f2fs: convert discard_wake and gc_wake to bool type discard_wake and gc_wake have only two values, 0 or 1. So there is no need to use int type to store them. BTW, move discard_wake to the end of the discard_cmd_control structure. Before: - sizeof(struct discard_cmd_control): 8392 After move: - sizeof(struct discard_cmd_control): 8384 Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-11 11:15:19 -08:00
Yangtao Li	f08142bc3a	f2fs: convert to use MIN_DISCARD_GRANULARITY macro Commit `1cd2e6d544` ("f2fs: define MIN_DISCARD_GRANULARITY macro") introduce it, let's convert to use MIN_DISCARD_GRANULARITY macro. Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-11 11:15:19 -08:00
Yangtao Li	c40e15a9a5	f2fs: merge f2fs_show_injection_info() into time_to_inject() There is no need to additionally use f2fs_show_injection_info() to output information. Concatenate time_to_inject() and __time_to_inject() via a macro. In the new __time_to_inject() function, pass in the caller function name and parent function. In this way, we no longer need the f2fs_show_injection_info() function, and let's remove it. Suggested-by: Chao Yu <chao@kernel.org> Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-11 11:15:19 -08:00
Yangtao Li	1cd7565449	f2fs: add a f2fs_ prefix to punch_hole() and expand_inode_data() For example, f2fs_collapse_range(), f2fs_collapse_range(), f2fs_insert_range(), the functions used in f2fs_fallocate() are all prefixed with f2fs_, so let's keep the name consistent. Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-11 11:15:18 -08:00
Yangtao Li	390d0b9921	f2fs: remove unnecessary blank lines Just cleanup. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-11 11:15:18 -08:00
Yangtao Li	a1357a91ec	f2fs: mark f2fs_init_compress_mempool w/ __init f2fs_init_compress_mempool() only initializes the memory pool during the f2fs module init phase. Let's mark it as __init like any other function. Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-11 11:15:18 -08:00
Yangtao Li	b5a711acab	f2fs: judge whether discard_unit is section only when have CONFIG_BLK_DEV_ZONED The current logic, regardless of whether CONFIG_BLK_DEV_ZONED is enabled or not, will judge whether discard_unit is SECTION, when f2fs_sb_has_blkzoned. In fact, when CONFIG_BLK_DEV_ZONED is not enabled, this judgment is a path that will never be accessed. At this time, -EINVAL will be returned in the parse_options function, accompanied by the message "Zoned block device support is not enabled". Let's wrap this discard_unit judgment with CONFIG_BLK_DEV_ZONED. Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-11 11:15:18 -08:00
Zhang Qilong	ebaaec351e	f2fs: start freeing cluster pages from the unused number We can start freeing cluster page(s) from which compression is not used. It will get better performance. Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-11 11:15:18 -08:00
Eric Biggers	feb0576a36	f2fs: simplify f2fs_readpage_limit() Now that the implementation of FS_IOC_ENABLE_VERITY has changed to not involve reading back Merkle tree blocks that were previously written, there is no need for f2fs_readpage_limit() to allow for this case. Signed-off-by: Eric Biggers <ebiggers@google.com> Tested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://lore.kernel.org/r/20221223203638.41293-10-ebiggers@kernel.org	2023-01-09 19:06:09 -08:00
Yuwei Guan	185a453bf1	f2fs: deliver the accumulated 'issued' to __issue_discard_cmd_orderly() Any of the following scenarios will send more than the number of max_requests at a time, which will not meet the design of the max_requests limit. - Set max_ordered_discard larger than discard_granularity from userspace. - It is a small size device, discard_granularity can be tuned to 1 in f2fs_tuning_parameters(). We need to deliver the accumulated @issued to __issue_discard_cmd_orderly() to meet the max_requests limit. BTW, convert the parameter type of @issued in __submit_discard_cmd(). Signed-off-by: Yuwei Guan <Yuwei.Guan@zeekrlife.com> Cc: Bagas Sanjaya <bagasdotme@gmail.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-06 15:13:42 -08:00
Chao Yu	8358014d6b	f2fs: avoid to check PG_error flag After below changes: commit `14db0b3c7b` ("fscrypt: stop using PG_error to track error status") commit `98dc08bae6` ("fsverity: stop using PG_error to track error status") There is no place in f2fs we will set PG_error flag in page, let's remove other PG_error usage in f2fs, as a step towards freeing the PG_error flag for other uses. Cc: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-06 15:13:41 -08:00
Chao Yu	5eaac835f2	f2fs: fix to avoid potential deadlock There is a potential deadlock reported by syzbot as below: F2FS-fs (loop2): invalid crc value F2FS-fs (loop2): Found nat_bits in checkpoint F2FS-fs (loop2): Mounted with checkpoint version = 48b305e4 ====================================================== WARNING: possible circular locking dependency detected 6.1.0-rc8-syzkaller-33330-ga5541c0811a0 #0 Not tainted ------------------------------------------------------ syz-executor.2/32123 is trying to acquire lock: ffff0000c0e1a608 (&mm->mmap_lock){++++}-{3:3}, at: __might_fault+0x54/0xb4 mm/memory.c:5644 but task is already holding lock: ffff0001317c6088 (&sbi->sb_lock){++++}-{3:3}, at: f2fs_down_write fs/f2fs/f2fs.h:2205 [inline] ffff0001317c6088 (&sbi->sb_lock){++++}-{3:3}, at: f2fs_ioc_get_encryption_pwsalt fs/f2fs/file.c:2334 [inline] ffff0001317c6088 (&sbi->sb_lock){++++}-{3:3}, at: __f2fs_ioctl+0x1370/0x3318 fs/f2fs/file.c:4151 which lock already depends on the new lock. Chain exists of: &mm->mmap_lock --> &nm_i->nat_tree_lock --> &sbi->sb_lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&sbi->sb_lock); lock(&nm_i->nat_tree_lock); lock(&sbi->sb_lock); lock(&mm->mmap_lock); Let's try to avoid above deadlock condition by moving __might_fault() out of sbi->sb_lock coverage. Fixes: `95fa90c9e5` ("f2fs: support recording errors into superblock") Link: https://lore.kernel.org/linux-f2fs-devel/000000000000cd5fe305ef617fe2@google.com/T/#u Reported-by: syzbot+4793f6096d174c90b4f7@syzkaller.appspotmail.com Signed-off-by: Chao Yu <chao@kernel.org> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-06 15:13:41 -08:00
Yangtao Li	fdb7ccc3f9	f2fs: introduce IS_F2FS_IPU_* macro IS_F2FS_IPU_* macro can be used to identify whether f2fs ipu related policies are enabled. BTW, convert to use BIT() instead of open code. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-06 15:13:41 -08:00
Christoph Hellwig	fdbf69a7f5	f2fs: refactor the hole reporting and allocation logic in f2fs_map_blocks Add a is_hole local variable to figure out if the block number might need allocation, and untangle to logic to report the hole or fill it with a block allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-06 15:13:38 -08:00
Christoph Hellwig	817c968b79	f2fs: factor out a f2fs_map_no_dnode Factor out a helper to return a hole when no dnode was found. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-06 15:13:36 -08:00
Christoph Hellwig	0094e98bd1	f2fs: factor a f2fs_map_blocks_cached helper Add a helper to deal with everything needed to return a f2fs_map_blocks structure based on a lookup in the extent cache. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-06 15:13:31 -08:00
Christoph Hellwig	cd8fc5226b	f2fs: remove the create argument to f2fs_map_blocks The create argument is always identicaly to map->m_may_create, so use that consistently. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-06 15:13:29 -08:00
Christoph Hellwig	ffdeab71d5	f2fs: remove f2fs_get_block Fold f2fs_get_block into the two remaining callers to simplify the call chain a bit. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-06 15:13:26 -08:00
Christoph Hellwig	3cf684f2f8	f2fs: simplify __allocate_data_block Just use a simple if block for the conditional call to inc_valid_block_count. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-06 15:13:13 -08:00
Christoph Hellwig	44b0dfebbd	f2fs: reflow prepare_write_begin Reflow prepare_write_begin so that it reads more straight forward, and so that there is one place that does an extent cache lookup instead of three, two of which are hidden in f2fs_get_block calls. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-06 15:13:10 -08:00
Christoph Hellwig	2f51ade952	f2fs: f2fs_do_map_lock Split f2fs_do_map_lock into a lock and unlock helper to make the code using it easier to read. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-06 15:13:07 -08:00
Christoph Hellwig	cf342d3bed	f2fs: add a f2fs_get_block_locked helper This allows to keep the f2fs_do_map_lock based locking scheme private to data.c. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-06 15:13:03 -08:00
Christoph Hellwig	04a91ab016	f2fs: add a f2fs_lookup_extent_cache_block helper All but three callers of f2fs_lookup_extent_cache just want the block address. Add a small helper to simplify them. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-06 15:13:00 -08:00
Christoph Hellwig	bc29835a9d	f2fs: split __submit_bio Split __submit_bio into one function each for reads and writes, and a helper for aligning writes. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-06 15:12:56 -08:00
Christoph Hellwig	da8c7fecc9	f2fs: rename F2FS_MAP_UNWRITTEN to F2FS_MAP_DELALLOC NEW_ADDR blocks are purely in-memory preallocated blocks, and thus equivalent to what the core FS code calls delayed allocations, and not unwritten extents which do have on-disk blocks allocated from which reads always return zeroes until they are converted to written status. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-06 15:12:53 -08:00
Christoph Hellwig	62a134bd89	f2fs: decouple F2FS_MAP_ from buffer head flags m_flags is never interchanged with the buffer_heads b_flags directly, so use separate codepoints from that. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-06 15:12:49 -08:00
Christoph Hellwig	8d3c1fa3fa	f2fs: don't rely on F2FS_MAP_* in f2fs_iomap_begin When testing with a mixed zoned / convention device combination, there are regular but not 100% reproducible failures in xfstests generic/113 where the __is_valid_data_blkaddr assert hits due to finding a hole. This seems to be because f2fs_map_blocks can set this flag on a hole when it was found in the extent cache. Rework f2fs_iomap_begin to just check the special block numbers directly. This has the added benefits of the WARN_ON showing which invalid block address we found, and being properly error out on delalloc blocks that are confusingly called unwritten but not actually suitable for direct I/O. Fixes: `1517c1a7a4` ("f2fs: implement iomap operations") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-06 15:12:28 -08:00
Chao Yu	6779b5db90	f2fs: fix to call clear_page_private_reference in .{release,invalid}_folio `b763f3bedc` ("f2fs: restructure f2fs page.private layout") missed to call clear_page_private_reference() in .{release,invalid}_folio, fix it, though it's not a big deal since folio_detach_private() was called to clear all privae info and reference count in the page. BTW, remove page_private_reference() definition as it never be used. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-04 12:56:04 -08:00
Chao Yu	b3107b3854	f2fs: remove unused PAGE_PRIVATE_ATOMIC_WRITE Commit `3db1de0e58` ("f2fs: change the current atomic write way") has removed all users of PAGE_PRIVATE_ATOMIC_WRITE, remove its definition and related functions. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-04 12:56:04 -08:00
Chao Yu	f35474ec00	f2fs: fix to support .migrate_folio for compressed inode Add missed .migrate_folio for compressed inode, in order to support migration of compressed inode's page. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-04 12:56:04 -08:00
Sergey Shtylyov	39bee2e6ac	f2fs: file: drop useless initializer in expand_inode_data() In expand_inode_data(), the 'new_size' local variable is initialized to the result of i_size_read(), however this value isn't ever used, so we can drop this initializer... Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool. Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-04 12:56:04 -08:00
Jaegeuk Kim	df9d44b645	f2fs: let's avoid panic if extent_tree is not created This patch avoids the below panic. pc : __lookup_extent_tree+0xd8/0x760 lr : f2fs_do_write_data_page+0x104/0x87c sp : ffffffc010cbb3c0 x29: ffffffc010cbb3e0 x28: 0000000000000000 x27: ffffff8803e7f020 x26: ffffff8803e7ed40 x25: ffffff8803e7f020 x24: ffffffc010cbb460 x23: ffffffc010cbb480 x22: 0000000000000000 x21: 0000000000000000 x20: ffffffff22e90900 x19: 0000000000000000 x18: ffffffc010c5d080 x17: 0000000000000000 x16: 0000000000000020 x15: ffffffdb1acdbb88 x14: ffffff888759e2b0 x13: 0000000000000000 x12: ffffff802da49000 x11: 000000000a001200 x10: ffffff8803e7ed40 x9 : ffffff8023195800 x8 : ffffff802da49078 x7 : 0000000000000001 x6 : 0000000000000000 x5 : 0000000000000006 x4 : ffffffc010cbba28 x3 : 0000000000000000 x2 : ffffffc010cbb480 x1 : 0000000000000000 x0 : ffffff8803e7ed40 Call trace: __lookup_extent_tree+0xd8/0x760 f2fs_do_write_data_page+0x104/0x87c f2fs_write_single_data_page+0x420/0xb60 f2fs_write_cache_pages+0x418/0xb1c __f2fs_write_data_pages+0x428/0x58c f2fs_write_data_pages+0x30/0x40 do_writepages+0x88/0x190 __writeback_single_inode+0x48/0x448 writeback_sb_inodes+0x468/0x9e8 __writeback_inodes_wb+0xb8/0x2a4 wb_writeback+0x33c/0x740 wb_do_writeback+0x2b4/0x400 wb_workfn+0xe4/0x34c process_one_work+0x24c/0x5bc worker_thread+0x3e8/0xa50 kthread+0x150/0x1b4 Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-03 08:59:06 -08:00
Jaegeuk Kim	22a341b430	f2fs: should use a temp extent_info for lookup Otherwise, __lookup_extent_tree() will override the given extent_info which will be used by caller. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-03 08:59:06 -08:00
Jaegeuk Kim	ed2724765e	f2fs: don't mix to use union values in extent_info Let's explicitly use the defined values in block_age case only. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-03 08:59:06 -08:00
Jaegeuk Kim	fe59109ae5	f2fs: initialize extent_cache parameter This can avoid confusing tracepoint values. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-03 08:59:05 -08:00
Chao Yu	b3d83066cb	f2fs: fix to avoid NULL pointer dereference in f2fs_issue_flush() With below two cases, it will cause NULL pointer dereference when accessing SM_I(sbi)->fcc_info in f2fs_issue_flush(). a) If kthread_run() fails in f2fs_create_flush_cmd_control(), it will release SM_I(sbi)->fcc_info, - mount -o noflush_merge /dev/vda /mnt/f2fs - mount -o remount,flush_merge /dev/vda /mnt/f2fs -- kthread_run() fails - dd if=/dev/zero of=/mnt/f2fs/file bs=4k count=1 conv=fsync b) we will never allocate memory for SM_I(sbi)->fcc_info w/ below testcase, - mount -o ro /dev/vda /mnt/f2fs - mount -o rw,remount /dev/vda /mnt/f2fs - dd if=/dev/zero of=/mnt/f2fs/file bs=4k count=1 conv=fsync In order to fix this issue, let change as below: - fix error path handling in f2fs_create_flush_cmd_control(). - allocate SM_I(sbi)->fcc_info even if readonly is on. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2023-01-03 08:58:47 -08:00
Eric Biggers	72ea15f0dd	fsverity: pass pos and size to ->write_merkle_tree_block fsverity_operations::write_merkle_tree_block is passed the index of the block to write and the log base 2 of the block size. However, all implementations of it use these parameters only to calculate the position and the size of the block, in bytes. Therefore, make ->write_merkle_tree_block take 'pos' and 'size' parameters instead of 'index' and 'log_blocksize'. Suggested-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Dave Chinner <dchinner@redhat.com> Link: https://lore.kernel.org/r/20221214224304.145712-5-ebiggers@kernel.org	2023-01-01 15:46:48 -08:00
Linus Torvalds	041fae9c10	f2fs-for-6.2-rc1 In this round, we've added two features: 1) F2FS_IOC_START_ATOMIC_REPLACE and 2) per-block age-based extent cache. 1) is a variant of the previous atomic write feature which guarantees a per-file atomicity. It would be more efficient than AtomicFile implementation in Android framework. 2) implements another type of extent cache in memory which keeps the per-block age in a file, so that block allocator could split the hot and cold data blocks more accurately. Enhancement: - introduce F2FS_IOC_START_ATOMIC_REPLACE - refactor extent_cache to add a new per-block-age-based extent cache support - introduce discard_urgent_util, gc_mode, max_ordered_discard sysfs knobs - add proc entry to show discard_plist info - optimize iteration over sparse directories - add barrier mount option Bug fix - avoid victim selection from previous victim section - fix to enable compress for newly created file if extension matches - set zstd compress level correctly - initialize locks early in f2fs_fill_super() to fix bugs reported by syzbot - correct i_size change for atomic writes - allow to read node block after shutdown - allow to set compression for inlined file - fix gc mode when gc_urgent_high_remaining is 1 - should put a page when checking the summary info Minor fixes and various clean-ups in GC, discard, debugfs, sysfs, and doc. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAmOaTNUACgkQQBSofoJI UNIQnw//V7Q8DUHw5YNj04jutwXH2DNMLAmn/NJh5S6dIzy/LiywlSzVg53/0/FP 4K577urUkIhgilRO+yncUMSnSQk7BluQvGSx4ja2AV+dpDomjxM3GwIacGzSvr7D VfVf8Vig10UEFrrtEEKtv1VFlYHAmo8lLpubzrZHV8aZFLHHYO2fakQhPu8BYsaz eGCJwxjvTZcQUPkaeG9tWto3ChI3F6PzreiQ5TztHhLWSEgw/o0qijpsc+2SthaV my7uGjeBY8EGPeSYbeCxRtdx8g8Qu11K3ISuDj8zBybmjG3IWOGt1CVcrY6tZbal aL70CMtHkMqMn03VqbpCTqBtdWNMrrw5sYSL3qXIUdXlX/2yJBh9fLAeNxKNs5Nu 6veSb2WgYMHqIsClkAAcP0xJ8g6kodGoG60wVr4ek0Vdt4osaQqwq+bnffpwwxtQ F+7aRuinv+rdrHJ4CuFXAmHPKh2lBe2lTTWZEKg2RptTxZ5DhD2Qn6x1khPD2GFA mG2Aeiq6PVxxEeIO+w/VBCuAgpGTFV2N/ZIF8VfjFNdWiN5OGLWQNHC2KGj2G2uV +fA+B91txQWtjY9h72YJb2+aGIixcnLY24ni4mDgDItqtpCB4PW56W8cbnbv9Pl+ aXAWdADqJdDyllHoVB/JQ24gr2fATJGRIDeYDnw+vPP4f5ZT5vg= =f00t -----END PGP SIGNATURE----- Merge tag 'f2fs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this round, we've added two features: F2FS_IOC_START_ATOMIC_REPLACE and a per-block age-based extent cache. F2FS_IOC_START_ATOMIC_REPLACE is a variant of the previous atomic write feature which guarantees a per-file atomicity. It would be more efficient than AtomicFile implementation in Android framework. The per-block age-based extent cache implements another type of extent cache in memory which keeps the per-block age in a file, so that block allocator could split the hot and cold data blocks more accurately. Enhancements: - introduce F2FS_IOC_START_ATOMIC_REPLACE - refactor extent_cache to add a new per-block-age-based extent cache support - introduce discard_urgent_util, gc_mode, max_ordered_discard sysfs knobs - add proc entry to show discard_plist info - optimize iteration over sparse directories - add barrier mount option Bug fixes: - avoid victim selection from previous victim section - fix to enable compress for newly created file if extension matches - set zstd compress level correctly - initialize locks early in f2fs_fill_super() to fix bugs reported by syzbot - correct i_size change for atomic writes - allow to read node block after shutdown - allow to set compression for inlined file - fix gc mode when gc_urgent_high_remaining is 1 - should put a page when checking the summary info Minor fixes and various clean-ups in GC, discard, debugfs, sysfs, and doc" * tag 'f2fs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (63 commits) f2fs: reset wait_ms to default if any of the victims have been selected f2fs: fix some format WARNING in debug.c and sysfs.c f2fs: don't call f2fs_issue_discard_timeout() when discard_cmd_cnt is 0 in f2fs_put_super() f2fs: fix iostat parameter for discard f2fs: Fix spelling mistake in label: free_bio_enrty_cache -> free_bio_entry_cache f2fs: add block_age-based extent cache f2fs: allocate the extent_cache by default f2fs: refactor extent_cache to support for read and more f2fs: remove unnecessary __init_extent_tree f2fs: move internal functions into extent_cache.c f2fs: specify extent cache for read explicitly f2fs: introduce f2fs_is_readonly() for readability f2fs: remove F2FS_SET_FEATURE() and F2FS_CLEAR_FEATURE() macro f2fs: do some cleanup for f2fs module init MAINTAINERS: Add f2fs bug tracker link f2fs: remove the unused flush argument to change_curseg f2fs: open code allocate_segment_by_default f2fs: remove struct segment_allocation default_salloc_ops f2fs: introduce discard_urgent_util sysfs node f2fs: define MIN_DISCARD_GRANULARITY macro ...	2022-12-14 15:27:57 -08:00
Linus Torvalds	ad0d9da164	fsverity updates for 6.2 The main change this cycle is to stop using the PG_error flag to track verity failures, and instead just track failures at the bio level. This follows a similar fscrypt change that went into 6.1, and it is a step towards freeing up PG_error for other uses. There's also one other small cleanup. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCY5anyRQcZWJpZ2dlcnNA Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK1IPAP0SMSKJRgehpXHKp5QZxHSpAjkFlcGa 2y8Lc+DlHOrfLQEAmpGAxewowkMzpYVXmlAVVHRgUPWLjoMQQELEUQ8mWgU= =M+pB -----END PGP SIGNATURE----- Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt Pull fsverity updates from Eric Biggers: "The main change this cycle is to stop using the PG_error flag to track verity failures, and instead just track failures at the bio level. This follows a similar fscrypt change that went into 6.1, and it is a step towards freeing up PG_error for other uses. There's also one other small cleanup" * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: fsverity: simplify fsverity_get_digest() fsverity: stop using PG_error to track error status	2022-12-12 20:06:35 -08:00
Linus Torvalds	6a518afcc2	fs.acl.rework.v6.2 -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCY5bwTgAKCRCRxhvAZXjc ovd2AQCK00NAtGjQCjQPQGyTa4GAPqvWgq1ef0lnhv+TL5US5gD9FncQ8UofeMXt pBfjtAD6ettTPCTxUQfnTwWEU4rc7Qg= =27Wm -----END PGP SIGNATURE----- Merge tag 'fs.acl.rework.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping Pull VFS acl updates from Christian Brauner: "This contains the work that builds a dedicated vfs posix acl api. The origins of this work trace back to v5.19 but it took quite a while to understand the various filesystem specific implementations in sufficient detail and also come up with an acceptable solution. As we discussed and seen multiple times the current state of how posix acls are handled isn't nice and comes with a lot of problems: The current way of handling posix acls via the generic xattr api is error prone, hard to maintain, and type unsafe for the vfs until we call into the filesystem's dedicated get and set inode operations. It is already the case that posix acls are special-cased to death all the way through the vfs. There are an uncounted number of hacks that operate on the uapi posix acl struct instead of the dedicated vfs struct posix_acl. And the vfs must be involved in order to interpret and fixup posix acls before storing them to the backing store, caching them, reporting them to userspace, or for permission checking. Currently a range of hacks and duct tape exist to make this work. As with most things this is really no ones fault it's just something that happened over time. But the code is hard to understand and difficult to maintain and one is constantly at risk of introducing bugs and regressions when having to touch it. Instead of continuing to hack posix acls through the xattr handlers this series builds a dedicated posix acl api solely around the get and set inode operations. Going forward, the vfs_get_acl(), vfs_remove_acl(), and vfs_set_acl() helpers must be used in order to interact with posix acls. They operate directly on the vfs internal struct posix_acl instead of abusing the uapi posix acl struct as we currently do. In the end this removes all of the hackiness, makes the codepaths easier to maintain, and gets us type safety. This series passes the LTP and xfstests suites without any regressions. For xfstests the following combinations were tested: - xfs - ext4 - btrfs - overlayfs - overlayfs on top of idmapped mounts - orangefs - (limited) cifs There's more simplifications for posix acls that we can make in the future if the basic api has made it. A few implementation details: - The series makes sure to retain exactly the same security and integrity module permission checks. Especially for the integrity modules this api is a win because right now they convert the uapi posix acl struct passed to them via a void pointer into the vfs struct posix_acl format to perform permission checking on the mode. There's a new dedicated security hook for setting posix acls which passes the vfs struct posix_acl not a void pointer. Basing checking on the posix acl stored in the uapi format is really unreliable. The vfs currently hacks around directly in the uapi struct storing values that frankly the security and integrity modules can't correctly interpret as evidenced by bugs we reported and fixed in this area. It's not necessarily even their fault it's just that the format we provide to them is sub optimal. - Some filesystems like 9p and cifs need access to the dentry in order to get and set posix acls which is why they either only partially or not even at all implement get and set inode operations. For example, cifs allows setxattr() and getxattr() operations but doesn't allow permission checking based on posix acls because it can't implement a get acl inode operation. Thus, this patch series updates the set acl inode operation to take a dentry instead of an inode argument. However, for the get acl inode operation we can't do this as the old get acl method is called in e.g., generic_permission() and inode_permission(). These helpers in turn are called in various filesystem's permission inode operation. So passing a dentry argument to the old get acl inode operation would amount to passing a dentry to the permission inode operation which we shouldn't and probably can't do. So instead of extending the existing inode operation Christoph suggested to add a new one. He also requested to ensure that the get and set acl inode operation taking a dentry are consistently named. So for this version the old get acl operation is renamed to ->get_inode_acl() and a new ->get_acl() inode operation taking a dentry is added. With this we can give both 9p and cifs get and set acl inode operations and in turn remove their complex custom posix xattr handlers. In the future I hope to get rid of the inode method duplication but it isn't like we have never had this situation. Readdir is just one example. And frankly, the overall gain in type safety and the more pleasant api wise are simply too big of a benefit to not accept this duplication for a while. - We've done a full audit of every codepaths using variant of the current generic xattr api to get and set posix acls and surprisingly it isn't that many places. There's of course always a chance that we might have missed some and if so I'm sure we'll find them soon enough. The crucial codepaths to be converted are obviously stacking filesystems such as ecryptfs and overlayfs. For a list of all callers currently using generic xattr api helpers see [2] including comments whether they support posix acls or not. - The old vfs generic posix acl infrastructure doesn't obey the create and replace semantics promised on the setxattr(2) manpage. This patch series doesn't address this. It really is something we should revisit later though. The patches are roughly organized as follows: (1) Change existing set acl inode operation to take a dentry argument (Intended to be a non-functional change) (2) Rename existing get acl method (Intended to be a non-functional change) (3) Implement get and set acl inode operations for filesystems that couldn't implement one before because of the missing dentry. That's mostly 9p and cifs (Intended to be a non-functional change) (4) Build posix acl api, i.e., add vfs_get_acl(), vfs_remove_acl(), and vfs_set_acl() including security and integrity hooks (Intended to be a non-functional change) (5) Implement get and set acl inode operations for stacking filesystems (Intended to be a non-functional change) (6) Switch posix acl handling in stacking filesystems to new posix acl api now that all filesystems it can stack upon support it. (7) Switch vfs to new posix acl api (semantical change) (8) Remove all now unused helpers (9) Additional regression fixes reported after we merged this into linux-next Thanks to Seth for a lot of good discussion around this and encouragement and input from Christoph" * tag 'fs.acl.rework.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: (36 commits) posix_acl: Fix the type of sentinel in get_acl orangefs: fix mode handling ovl: call posix_acl_release() after error checking evm: remove dead code in evm_inode_set_acl() cifs: check whether acl is valid early acl: make vfs_posix_acl_to_xattr() static acl: remove a slew of now unused helpers 9p: use stub posix acl handlers cifs: use stub posix acl handlers ovl: use stub posix acl handlers ecryptfs: use stub posix acl handlers evm: remove evm_xattr_acl_change() xattr: use posix acl api ovl: use posix acl api ovl: implement set acl method ovl: implement get acl method ecryptfs: implement set acl method ecryptfs: implement get acl method ksmbd: use vfs_remove_acl() acl: add vfs_remove_acl() ...	2022-12-12 18:46:39 -08:00
Yuwei Guan	26a8057a1a	f2fs: reset wait_ms to default if any of the victims have been selected In non-foreground gc mode, if no victim is selected, the gc process will wait for no_gc_sleep_time before waking up again. In this subsequent time, even though a victim will be selected, the gc process still waits for no_gc_sleep_time before waking up. The configuration of wait_ms is not reasonable. After any of the victims have been selected, we need to reset wait_ms to default sleep time from no_gc_sleep_time. Signed-off-by: Yuwei Guan <Yuwei.Guan@zeekrlife.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-12-12 15:18:36 -08:00
Yangtao Li	7411143f20	f2fs: fix some format WARNING in debug.c and sysfs.c To fix: WARNING: function definition argument 'struct f2fs_attr ' should also have an identifier name + ssize_t (show)(struct f2fs_attr , struct f2fs_sb_info , char *); WARNING: return sysfs_emit(...) formats should include a terminating newline + return sysfs_emit(buf, "(none)"); WARNING: Prefer 'unsigned int' to bare use of 'unsigned' + unsigned npages = NODE_MAPPING(sbi)->nrpages; WARNING: Missing a blank line after declarations + unsigned npages = COMPRESS_MAPPING(sbi)->nrpages; + si->page_mem += (unsigned long long)npages << PAGE_SHIFT; WARNING: quoted string split across lines + seq_printf(s, "CP merge (Queued: %4d, Issued: %4d, Total: %4d, " + "Cur time: %4d(ms), Peak time: %4d(ms))\n", Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-12-12 14:59:39 -08:00
Yangtao Li	25547439f1	f2fs: don't call f2fs_issue_discard_timeout() when discard_cmd_cnt is 0 in f2fs_put_super() No need to call f2fs_issue_discard_timeout() in f2fs_put_super, when no discard command requires issue. Since the caller of f2fs_issue_discard_timeout() usually judges the number of discard commands before using it. Let's move this logic to f2fs_issue_discard_timeout(). By the way, use f2fs_realtime_discard_enable to simplify the code. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-12-12 14:59:38 -08:00
Yangtao Li	15e38ee44d	f2fs: fix iostat parameter for discard Just like other data we count uses the number of bytes as the basic unit, but discard uses the number of cmds as the statistical unit. In fact the discard command contains the number of blocks, so let's change to the number of bytes as the base unit. Fixes: `b0af6d491a` ("f2fs: add app/fs io stat") Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-12-12 14:59:38 -08:00
Colin Ian King	db8dcd25ec	f2fs: Fix spelling mistake in label: free_bio_enrty_cache -> free_bio_entry_cache There is a spelling mistake in a label name. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-12-12 14:59:32 -08:00
Jaegeuk Kim	71644dff48	f2fs: add block_age-based extent cache This patch introduces a runtime hot/cold data separation method for f2fs, in order to improve the accuracy for data temperature classification, reduce the garbage collection overhead after long-term data updates. Enhanced hot/cold data separation can record data block update frequency as "age" of the extent per inode, and take use of the age info to indicate better temperature type for data block allocation: - It records total data blocks allocated since mount; - When file extent has been updated, it calculate the count of data blocks allocated since last update as the age of the extent; - Before the data block allocated, it searches for the age info and chooses the suitable segment for allocation. Test and result: - Prepare: create about 30000 files * 3% for cold files (with cold file extension like .apk, from 3M to 10M) * 50% for warm files (with random file extension like .FcDxq, from 1K to 4M) * 47% for hot files (with hot file extension like .db, from 1K to 256K) - create(5%)/random update(90%)/delete(5%) the files * total write amount is about 70G * fsync will be called for .db files, and buffered write will be used for other files The storage of test device is large enough(128G) so that it will not switch to SSR mode during the test. Benefit: dirty segment count increment reduce about 14% - before: Dirty +21110 - after: Dirty +18286 Signed-off-by: qixiaoyu1 <qixiaoyu1@xiaomi.com> Signed-off-by: xiongping1 <xiongping1@xiaomi.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-12-12 14:53:56 -08:00
Jaegeuk Kim	72840cccc0	f2fs: allocate the extent_cache by default Let's allocate it to remove the runtime complexity. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-12-12 14:53:56 -08:00
Jaegeuk Kim	e7547daccd	f2fs: refactor extent_cache to support for read and more This patch prepares extent_cache to be ready for addition. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-12-12 14:53:56 -08:00
Jaegeuk Kim	749d543c0d	f2fs: remove unnecessary __init_extent_tree Added into the caller. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-12-12 14:53:56 -08:00
Jaegeuk Kim	3bac20a8f0	f2fs: move internal functions into extent_cache.c No functional change. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-12-12 14:53:55 -08:00
Jaegeuk Kim	12607c1ba7	f2fs: specify extent cache for read explicitly Let's descrbie it's read extent cache. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-12-12 14:53:55 -08:00
Yangtao Li	ed8ac22b6b	f2fs: introduce f2fs_is_readonly() for readability Introduce f2fs_is_readonly() and use it to simplify code. Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-12-12 14:53:48 -08:00
Yangtao Li	e480751970	f2fs: remove F2FS_SET_FEATURE() and F2FS_CLEAR_FEATURE() macro F2FS_SET_FEATURE() and F2FS_CLEAR_FEATURE() have never been used since they were introduced by this commit 76f105a2dbcd("f2fs: add feature facility in superblock"). So let's remove them. BTW, convert f2fs_sb_has_##name to return bool. Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-12-12 14:53:20 -08:00
Yangtao Li	870af777da	f2fs: do some cleanup for f2fs module init Just for cleanup, no functional changes. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-12-08 09:32:20 -08:00
Christoph Hellwig	5bcd655fff	f2fs: remove the unused flush argument to change_curseg Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-12-08 09:32:16 -08:00
Christoph Hellwig	8442d94b8a	f2fs: open code allocate_segment_by_default allocate_segment_by_default has just two callers, which use very different code pathes inside it based on the force paramter. Just open code the logic in the two callers using a new helper to decided if a new segment should be allocated. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-12-08 09:32:13 -08:00
Christoph Hellwig	1c8a8ec0a0	f2fs: remove struct segment_allocation default_salloc_ops There is only single instance of these ops, so remove the indirection and call allocate_segment_by_default directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-12-08 09:32:10 -08:00
Eric Biggers	98dc08bae6	fsverity: stop using PG_error to track error status As a step towards freeing the PG_error flag for other uses, change ext4 and f2fs to stop using PG_error to track verity errors. Instead, if a verity error occurs, just mark the whole bio as failed. The coarser granularity isn't really a problem since it isn't any worse than what the block layer provides, and errors from a multi-page readahead aren't reported to applications unless a single-page read fails too. f2fs supports compression, which makes the f2fs changes a bit more complicated than desired, but the basic premise still works. Note: there are still a few uses of PageError in f2fs, but they are on the write path, so they are unrelated and this patch doesn't touch them. Reviewed-by: Chao Yu <chao@kernel.org> Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20221129070401.156114-1-ebiggers@kernel.org	2022-11-28 23:15:10 -08:00
Yangtao Li	8a47d228de	f2fs: introduce discard_urgent_util sysfs node Through this node, you can control the background discard to run more aggressively or not aggressively when reach the utilization rate of the space. Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-11-28 12:51:01 -08:00
Yangtao Li	1cd2e6d544	f2fs: define MIN_DISCARD_GRANULARITY macro Do cleanup in f2fs_tuning_parameters() and __init_discard_policy(), let's use macro instead of number. Suggested-by: Chao Yu <chao@kernel.org> Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-11-28 12:50:58 -08:00
Yangtao Li	48c08c51f9	f2fs: init discard policy after thread wakeup Under the current logic, after the discard thread wakes up, it will not run according to the expected policy, but will use the expected policy before sleep. Move the strategy selection to after the thread wakes up, so that the running state of the thread meets expectations. Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-11-28 12:50:19 -08:00
Yonggil Song	e219aecfd4	f2fs: avoid victim selection from previous victim section When f2fs chooses GC victim in large section & LFS mode, next_victim_seg[gc_type] is referenced first. After segment is freed, next_victim_seg[gc_type] has the next segment number. However, next_victim_seg[gc_type] still has the last segment number even after the last segment of section is freed. In this case, when f2fs chooses a victim for the next GC round, the last segment of previous victim section is chosen as a victim. Initialize next_victim_seg[gc_type] to NULL_SEGNO for the last segment in large section. Fixes: `e3080b0120` ("f2fs: support subsectional garbage collection") Signed-off-by: Yonggil Song <yonggil.song@samsung.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2022-11-28 12:49:00 -08:00

1 2 3 4 5 ...

3839 Commits