linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Al Viro	09bf4f5b6e	ufs: avoid grabbing ->truncate_mutex if possible tail unpacking is done in a wrong place; the deadlocks galore is best dealt with by doing that in ->write_iter() (and switching to iomap, while we are at it), but that's rather painful to backport. The trouble comes from grabbing pages that cover the beginning of tail from inside of ufs_new_fragments(); ongoing pageout of any of those is going to deadlock on ->truncate_mutex with process that got around to extending the tail holding that and waiting for page to get unlocked, while ->writepage() on that page is waiting on ->truncate_mutex. The thing is, we don't need ->truncate_mutex when the fragment we are trying to map is within the tail - the damn thing is allocated (tail can't contain holes). Let's do a plain lookup and if the fragment is present, we can just pretend that we'd won the race in almost all cases. The only exception is a fragment between the end of tail and the end of block containing tail. Protect ->i_lastfrag with ->meta_lock - read_seqlock_excl() is sufficient. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-06-15 00:41:18 -04:00
Al Viro	67a70017fa	ufs: we need to sync inode before freeing it Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-06-10 12:02:28 -04:00
Al Viro	babef37dcc	excessive checks in ufs_write_failed() and ufs_evict_inode() As it is, short copy in write() to append-only file will fail to truncate the excessive allocated blocks. As the matter of fact, all checks in ufs_truncate_blocks() are either redundant or wrong for that caller. As for the only other caller (ufs_evict_inode()), we only need the file type checks there. Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-06-09 16:28:01 -04:00
Al Viro	006351ac8e	ufs_getfrag_block(): we only grab ->truncate_mutex on block creation path Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-06-09 16:28:01 -04:00
Al Viro	940ef1a0ed	ufs_extend_tail(): fix the braino in calling conventions of ufs_new_fragments() ... and it really needs splitting into "new" and "extend" cases, but that's for later Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-06-09 16:28:01 -04:00
Al Viro	8785d84d00	ufs: restore proper tail allocation Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-06-09 16:28:01 -04:00
Linus Torvalds	7c0f6ba682	Replace <asm/uaccess.h> with <linux/uaccess.h> globally This was entirely automated, using the script by Al: PATT='^[[:blank:]]#[[:blank:]]include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"\|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-24 11:46:01 -08:00
Jeff Layton	f698cccbc8	ufs: fix function declaration for ufs_truncate_blocks sparse says: fs/ufs/inode.c:1195:6: warning: symbol 'ufs_truncate_blocks' was not declared. Should it be static? Note that the forward declaration in the file is already marked static. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-12-22 23:03:41 -05:00
Jan Kara	e64855c6cf	fs: Add helper to clean bdev aliases under a bh and use it Add a helper function that clears buffer heads from a block device aliasing passed bh. Use this helper function from filesystems instead of the original unmap_underlying_metadata() to save some boiler plate code and also have a better name for the functionalily since it is not unmapping anything for a long time. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>	2016-11-04 14:34:47 -06:00
Linus Torvalds	101105b171	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more vfs updates from Al Viro: ">rename2() work from Miklos + current_time() from Deepa" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: Replace current_fs_time() with current_time() fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps fs: Replace CURRENT_TIME with current_time() for inode timestamps fs: proc: Delete inode time initializations in proc_alloc_inode() vfs: Add current_time() api vfs: add note about i_op->rename changes to porting fs: rename "rename2" i_op to "rename" vfs: remove unused i_op->rename fs: make remaining filesystems use .rename2 libfs: support RENAME_NOREPLACE in simple_rename() fs: support RENAME_NOREPLACE for local filesystems ncpfs: fix unused variable warning	2016-10-10 20:16:43 -07:00
Deepa Dinamani	02027d42c3	fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps CURRENT_TIME_SEC is not y2038 safe. current_time() will be transitioned to use 64 bit time along with vfs in a separate patch. There is no plan to transistion CURRENT_TIME_SEC to use y2038 safe time interfaces. current_time() will also be extended to use superblock range checking parameters when range checking is introduced. This works because alloc_super() fills in the the s_time_gran in super block to NSEC_PER_SEC. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-09-27 21:06:22 -04:00
Jan Kara	31051c85b5	fs: Give dentry to inode_change_ok() instead of inode inode_change_ok() will be resposible for clearing capabilities and IMA extended attributes and as such will need dentry. Give it as an argument to inode_change_ok() instead of an inode. Also rename inode_change_ok() to setattr_prepare() to better relect that it does also some modifications in addition to checks. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>	2016-09-22 10:56:19 +02:00
Kirill A. Shutemov	09cbfeaf1a	mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced long time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-04-04 10:41:08 -07:00
Al Viro	21fc61c73c	don't put symlink bodies in pagecache into highmem kmap() in page_follow_link_light() needed to go - allowing to hold an arbitrary number of kmaps for long is a great way to deadlocking the system. new helper (inode_nohighmem(inode)) needs to be used for pagecache symlinks inodes; done for all in-tree cases. page_follow_link_light() instrumented to yell about anything missed. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-12-08 22:41:36 -05:00
Al Viro	9cdce3c074	ufs: get rid of ->setattr() for symlinks It was to needed for a couple of months in 2010, until UFS quota support got dropped. Since then it's equivalent to simple_setattr() (i.e. the default) for everything except the regular files. And dropping it there allows to convert all UFS symlinks to {page,simple}_symlink_inode_operations, getting rid of fs/ufs/symlink.c completely. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-12-06 20:43:26 -05:00
Al Viro	4e317ce73a	ufs_inode_get{frag,block}(): get rid of 'phys' argument Just pass NULL as locked_page in case of first block in the indirect chain. Old calling conventions aside, a reason for having 'phys' was that ufs_inode_getfrag() used to be able to do _two_ allocations - indirect block and extending/reallocating a tail. We needed locked_page for the latter (it's a data), but we also needed to figure out that indirect block is metadata. So we used to pass non-NULL locked_page in all cases and used NULL phys as indication of being asked to allocate an indirect. With tail unpacking taken into a separate function we don't need those convolutions anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:40:05 -04:00
Al Viro	0385f1f9e3	ufs_getfrag_block(): tidy up a bit Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:40:04 -04:00
Al Viro	5fbfb238f7	ufs_inode_getblock(): failure to read an indirect block is -EIO ... and not "write to beginning of the disk", TYVM... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:40:03 -04:00
Al Viro	4eeff4c932	ufs_getfrag_block(): turn following indirects into a loop Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:40:02 -04:00
Al Viro	5336970be0	ufs_inode_getfrag(): pass index instead of 'fragment' same story as with ufs_inode_getblock() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:40:01 -04:00
Al Viro	0f3c1294be	ufs_inode_getfrag(): split extending the partial blocks off ufs_extend_tail() is handling that now. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:40:00 -04:00
Al Viro	619cfac091	ufs_inode_getblock(): pass indirect block number and full index ... instead of messing with buffer_head. We can bloody well do sb_bread() in there. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:59 -04:00
Al Viro	721435a767	ufs_inode_getblock(): pass index instead of 'fragment' The value passed to ufs_inode_getblock() as the 3rd argument had lower bits ignored; the upper bits were shifted down and used and they actually make sense - those are _lower_ bits of index in indirect block (i.e. they form the index within a fragment within an indirect block). Pass those as argument. Upper bits of index (i.e. the number of fragment within indirect block) will join them shortly. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:58 -04:00
Al Viro	177848a018	ufs_inode_get{frag,block}(): leave sb_getblk() to caller just return the damn block number Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:57 -04:00
Al Viro	8d9dcf1436	ufs_getfrag_block(): get rid of macro jungles Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:56 -04:00
Al Viro	bbb3eb9d34	ufs_inode_get{frag,block}(): consolidate success exits These calling conventions are rudiments of pre-2.3 times; they really need to be sanitized. This is the first step; next will be _always_ returning a block number, instead of this "return a pointer to buffer_head, except when we get to the actual data" crap. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:55 -04:00
Al Viro	71dd42846f	ufs: use the branch depth in ufs_getfrag_block() we'd already calculated it... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:54 -04:00
Al Viro	4b7068c8b1	ufs: move calculation of offsets into ufs_getfrag_block() ... and massage ufs_frag_map() to take those instead of fragment number. As it is, we duplicate the damn thing on the write side, open-coded and bloody hard to follow. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:53 -04:00
Al Viro	5a39c25562	ufs_inode_get{frag,block}(): get rid of retries We are holding ->truncate_mutex, so nobody else can alter our block pointers. Rechecks/retries were needed back when we only held BKL there, and had to cope with write_begin/writepage and writepage/truncate races. Can't happen anymore... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:52 -04:00
Al Viro	f53bd1421b	__ufs_truncate_blocks(): avoid excessive dirtying of indirect blocks There's a case when an indirect block gets dirtied for no good reason - when there's a hole starting in the middle of area covered by it and spanning past its end, and truncate() is done precisely to the beginning of the hole. The block is obviously not modified at all - all removals happen beyond it. However, existing code ends up dirtying it just in case. It's trivial to fix and while it's not a real bug by any stretch of imagination, it makes the damn thing harder to follow. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:51 -04:00
Al Viro	cc7231e309	free_full_branch(): don't bother modifying the block we are going to free Note that it's already made unreachable from the inode, so we don't have to worry about ufs_frag_map() walking into something already freed. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:50 -04:00
Al Viro	b6eede0ec6	move marking inode dirty to the end of __ufs_truncate_blocks() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:49 -04:00
Al Viro	163073db51	free_full_branch(): saner calling conventions Have caller fetch the block number and remove it from wherever it was. Pass the block number instead. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:48 -04:00
Al Viro	7b4e4f7f81	ufs_trunc_branch(): kill recursion turn recursion into a pair of loops Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:47 -04:00
Al Viro	6aab6dd379	ufs_trunc_branch(): massage towards killing recursion We always have 0 < depth2 <= depth in there, so if (--depth) { if (--depth2) A B } else { C // not using depth2 } D // not using depth2 is equivalent to if (--depth2) A with s/depth/depth - 1/ if (--depth) B else C D Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:46 -04:00
Al Viro	6d1ebbca2b	split ufs_truncate_branch() into full- and partial-branch variants Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:45 -04:00
Al Viro	a138b4b688	ufs: unify the logics for collecting adjacent data blocks to free open-coded in several places... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:44 -04:00
Al Viro	a96574233c	ufs_trunc_branch(): separate the calls with non-NULL offsets Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:43 -04:00
Al Viro	97e0f8f87c	ufs_trunc_branch(): never call with offsets != NULL && depth2 == 0 For calls in __ufs_truncate_blocks() it's just a matter of not incrementing offsets[0] and not making that call - immediately following loop will be executed one extra time and we'll be just fine. For recursive call in ufs_trunc_branch() itself, just assing NULL to offsets if we would be about to make such call. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:42 -04:00
Al Viro	42432739b5	__ufs_trunc_blocks(): turn the part after switch into a loop ... and turn the switch into if (), since all cases with depth != 1 have just become identical. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:41 -04:00
Al Viro	ef3a315d4c	__ufs_truncate_blocks(): unify freeing the full branches Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:40 -04:00
Al Viro	9e0fbbde27	unify ufs_trunc_..indirect() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:39 -04:00
Al Viro	6775e24d9c	ufs_trunc_..indirect(): more massage towards unifying Instead of manually checking that the array contains only zeroes, find the position of the last non-zero (in __ufs_truncate(), where we can conveniently do that) and use that to tell if there's any non-zero in the array tail passed to ufs_trunc_...indirect(). The goal of all that clumsiness is to get fold these functions together. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:38 -04:00
Al Viro	85416288bf	ufs_trunc_...indirect(): pass the array of indices instead of offsets rather than bitslicing the offset just formed as sum of shifted indices, pass the array of those indices itself. NULL is used as equivalent of "all zeroes" (== free the entire branch). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:37 -04:00
Al Viro	7a4fdda724	__ufs_truncate(); find cutoff distances into branches by offsets[] array Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:36 -04:00
Al Viro	7bad5939fc	ufs_trunc_dindirect(): pass the number of blocks to keep same as the previous two. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:35 -04:00
Al Viro	6ac36b8777	ufs_trunc_indirect(): pass the index of the first pointer to free ... instead of file offset. Same cleanups as in the tindirect conversion in previous commit. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:34 -04:00
Al Viro	18ca51d821	ufs_trunc_tindirect(): pass the number of blocks to keep IOW, the distance of cutoff from the begining of the branch (in blocks). That (and the fact that block just prior to cutoff is guaranteed to be present) allows to tell whether to free triple indirect block just by looking at the offset. While we are at it, using u64 for index in the block is wrong - those should be unsigned int. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:33 -04:00
Al Viro	31cd043e1a	ufs: beginning of __ufs_truncate_block() massage Use ufs_block_to_path() to find the cutoff path in the block pointers' tree. For now just use the information about the depth (to bypass the fully preserved subtrees); subsequent commits will use the information about actual path. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:32 -04:00
Al Viro	4e3911f3d7	ufs: the offsets ufs_block_to_path() puts into array are not sector_t type makes no sense - those are indices in block number arrays, not block numbers. And no, UFS is not likely to grow indirect blocks with 4Gpointers in them... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:31 -04:00

1 2 3

117 Commits