OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Al Viro	babef37dcc	excessive checks in ufs_write_failed() and ufs_evict_inode() As it is, short copy in write() to append-only file will fail to truncate the excessive allocated blocks. As the matter of fact, all checks in ufs_truncate_blocks() are either redundant or wrong for that caller. As for the only other caller (ufs_evict_inode()), we only need the file type checks there. Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-06-09 16:28:01 -04:00
Al Viro	006351ac8e	ufs_getfrag_block(): we only grab ->truncate_mutex on block creation path Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-06-09 16:28:01 -04:00
Al Viro	940ef1a0ed	ufs_extend_tail(): fix the braino in calling conventions of ufs_new_fragments() ... and it really needs splitting into "new" and "extend" cases, but that's for later Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-06-09 16:28:01 -04:00
Al Viro	6b0d144fa7	ufs: set correct ->s_maxsize Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-06-09 16:28:01 -04:00
Al Viro	eb315d2ae6	ufs: restore maintaining ->i_blocks Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-06-09 16:28:01 -04:00
Al Viro	414cf7186d	fix ufs_isblockset() Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-06-09 16:28:01 -04:00
Al Viro	8785d84d00	ufs: restore proper tail allocation Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-06-09 16:28:01 -04:00
Richard Narron	239e250e4a	fs/ufs: Set UFS default maximum bytes per file This fixes a problem with reading files larger than 2GB from a UFS-2 file system: https://bugzilla.kernel.org/show_bug.cgi?id=195721 The incorrect UFS s_maxsize limit became a problem as of commit `c2a9737f45` ("vfs,mm: fix a dead loop in truncate_inode_pages_range()") which started using s_maxbytes to avoid a page index overflow in do_generic_file_read(). That caused files to be truncated on UFS-2 file systems because the default maximum file size is 2GB (MAX_NON_LFS) and UFS didn't update it. Here I simply increase the default to a common value used by other file systems. Signed-off-by: Richard Narron <comet.berkeley@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Will B <will.brokenbourgh2877@gmail.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: <stable@vger.kernel.org> # v4.9 and backports of `c2a9737f45` Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-06-04 16:33:54 -07:00
Deepa Dinamani	a88e99e976	fs: ufs: use ktime_get_real_ts64() for birthtime CURRENT_TIME is not y2038 safe. Replace it with ktime_get_real_ts64(). Inode time formats are already 64 bit long and accommodates time64_t. Link: http://lkml.kernel.org/r/1491613030-11599-6-git-send-email-deepa.kernel@gmail.com Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Cc: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-05-08 17:15:15 -07:00
Linus Torvalds	7c0f6ba682	Replace <asm/uaccess.h> with <linux/uaccess.h> globally This was entirely automated, using the script by Al: PATT='^[[:blank:]]#[[:blank:]]include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"\|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-12-24 11:46:01 -08:00
Jeff Layton	f698cccbc8	ufs: fix function declaration for ufs_truncate_blocks sparse says: fs/ufs/inode.c:1195:6: warning: symbol 'ufs_truncate_blocks' was not declared. Should it be static? Note that the forward declaration in the file is already marked static. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-12-22 23:03:41 -05:00
Jan Kara	e64855c6cf	fs: Add helper to clean bdev aliases under a bh and use it Add a helper function that clears buffer heads from a block device aliasing passed bh. Use this helper function from filesystems instead of the original unmap_underlying_metadata() to save some boiler plate code and also have a better name for the functionalily since it is not unmapping anything for a long time. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>	2016-11-04 14:34:47 -06:00
Christoph Hellwig	2f8b544477	block,fs: untangle fs.h and blk_types.h Nothing in fs.h should require blk_types.h to be included. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2016-11-01 09:43:26 -06:00
Linus Torvalds	101105b171	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more vfs updates from Al Viro: ">rename2() work from Miklos + current_time() from Deepa" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: Replace current_fs_time() with current_time() fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps fs: Replace CURRENT_TIME with current_time() for inode timestamps fs: proc: Delete inode time initializations in proc_alloc_inode() vfs: Add current_time() api vfs: add note about i_op->rename changes to porting fs: rename "rename2" i_op to "rename" vfs: remove unused i_op->rename fs: make remaining filesystems use .rename2 libfs: support RENAME_NOREPLACE in simple_rename() fs: support RENAME_NOREPLACE for local filesystems ncpfs: fix unused variable warning	2016-10-10 20:16:43 -07:00
Al Viro	3873691e5a	Merge remote-tracking branch 'ovl/rename2' into for-linus	2016-10-10 23:02:51 -04:00
Deepa Dinamani	02027d42c3	fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps CURRENT_TIME_SEC is not y2038 safe. current_time() will be transitioned to use 64 bit time along with vfs in a separate patch. There is no plan to transistion CURRENT_TIME_SEC to use y2038 safe time interfaces. current_time() will also be extended to use superblock range checking parameters when range checking is introduced. This works because alloc_super() fills in the the s_time_gran in super block to NSEC_PER_SEC. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-09-27 21:06:22 -04:00
Miklos Szeredi	2773bf00ae	fs: rename "rename2" i_op to "rename" Generated patch: sed -i "s/\.rename2\t/\.rename\t\t/" `git grep -wl rename2` sed -i "s/\brename2\b/rename/g" `git grep -wl rename2` Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2016-09-27 11:03:58 +02:00
Miklos Szeredi	f03b8ad8d3	fs: support RENAME_NOREPLACE for local filesystems This is trivial to do: - add flags argument to foo_rename() - check if flags doesn't have any other than RENAME_NOREPLACE - assign foo_rename() to .rename2 instead of .rename Filesystems converted: affs, bfs, exofs, ext2, hfs, hfsplus, jffs2, jfs, logfs, minix, msdos, nilfs2, omfs, reiserfs, sysvfs, ubifs, udf, ufs, vfat. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Acked-by: Boaz Harrosh <ooo@electrozaur.com> Acked-by: Richard Weinberger <richard@nod.at> Acked-by: Bob Copeland <me@bobcopeland.com> Acked-by: Jan Kara <jack@suse.cz> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Dave Kleikamp <shaggy@kernel.org> Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: Christoph Hellwig <hch@infradead.org>	2016-09-27 11:03:57 +02:00
Jan Kara	31051c85b5	fs: Give dentry to inode_change_ok() instead of inode inode_change_ok() will be resposible for clearing capabilities and IMA extended attributes and as such will need dentry. Give it as an argument to inode_change_ok() instead of an inode. Also rename inode_change_ok() to setattr_prepare() to better relect that it does also some modifications in addition to checks. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>	2016-09-22 10:56:19 +02:00
Linus Torvalds	6784725ab0	Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "Assorted cleanups and fixes. Probably the most interesting part long-term is ->d_init() - that will have a bunch of followups in (at least) ceph and lustre, but we'll need to sort the barrier-related rules before it can get used for really non-trivial stuff. Another fun thing is the merge of ->d_iput() callers (dentry_iput() and dentry_unlink_inode()) and a bunch of ->d_compare() ones (all except the one in __d_lookup_lru())" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits) fs/dcache.c: avoid soft-lockup in dput() vfs: new d_init method vfs: Update lookup_dcache() comment bdev: get rid of ->bd_inodes Remove last traces of ->sync_page new helper: d_same_name() dentry_cmp(): use lockless_dereference() instead of smp_read_barrier_depends() vfs: clean up documentation vfs: document ->d_real() vfs: merge .d_select_inode() into .d_real() unify dentry_iput() and dentry_unlink_inode() binfmt_misc: ->s_root is not going anywhere drop redundant ->owner initializations ufs: get rid of redundant checks orangefs: constify inode_operations missed comment updates from ->direct_IO() prototype change file_inode(f)->i_mapping is f->f_mapping trim fsnotify hooks a bit 9p: new helper - v9fs_parent_fid() debugfs: ->d_parent is never NULL or negative ...	2016-07-28 12:59:05 -07:00
Mike Christie	dfec8a14fc	fs: have ll_rw_block users pass in op and flags separately This has ll_rw_block users pass in the operation and flags separately, so ll_rw_block can setup the bio op and bi_rw flags on the bio that is submitted. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2016-06-07 13:41:38 -06:00
Mike Christie	2a222ca992	fs: have submit_bh users pass in op and flags separately This has submit_bh users pass in the operation and flags separately, so submit_bh_wbc can setup the bio op and bi_rw flags on the bio that is submitted. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2016-06-07 13:41:38 -06:00
Al Viro	e0d508f109	ufs: get rid of redundant checks ufs_check_page() makes sure there's no entries with zero ->reclen Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-29 19:07:07 -04:00
Al Viro	3b0a3c1ac1	simple local filesystems: switch to ->iterate_shared() no changes needed (XFS isn't simple, but it has the same parallelism in the interesting parts exercised from CXFS). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:49:32 -04:00
Al Viro	be5b82dbfe	make ext2_get_page() and friends work without external serialization Right now ext2_get_page() (and its analogues in a bunch of other filesystems) relies upon the directory being locked - the way it sets and tests Checked and Error bits would be racy without that. Switch to a slightly different scheme, _not_ setting Checked in case of failure. That way the logics becomes if Checked => OK else if Error => fail else if !validate => fail else => OK with validation setting Checked or Error on success and failure resp. and returning which one had happened. Equivalent to the current logics, but unlike the current logics not sensitive to the order of set_bit, test_bit getting reordered by CPU, etc. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-05-02 19:47:25 -04:00
Al Viro	84695ffee7	Merge getxattr prototype change into work.lookups The rest of work.xattr stuff isn't needed for this branch	2016-05-02 19:45:47 -04:00
Al Viro	fc64005c93	don't bother with ->d_inode->i_sb - it's always equal to ->d_sb ... and neither can ever be NULL Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-04-10 17:11:51 -04:00
Kirill A. Shutemov	09cbfeaf1a	mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced long time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-04-04 10:41:08 -07:00
Vladimir Davydov	5d097056c9	kmemcg: account certain kmem allocations to memcg Mark those kmem allocations that are known to be easily triggered from userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to memcg. For the list, see below: - threadinfo - task_struct - task_delay_info - pid - cred - mm_struct - vm_area_struct and vm_region (nommu) - anon_vma and anon_vma_chain - signal_struct - sighand_struct - fs_struct - files_struct - fdtable and fdtable->full_fds_bits - dentry and external_name - inode for all filesystems. This is the most tedious part, because most filesystems overwrite the alloc_inode method. The list is far from complete, so feel free to add more objects. Nevertheless, it should be close to "account everything" approach and keep most workloads within bounds. Malevolent users will be able to breach the limit, but this was possible even with the former "account everything" approach (simply because it did not account everything in fact). [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Tejun Heo <tj@kernel.org> Cc: Greg Thelen <gthelen@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-01-14 16:00:49 -08:00
Al Viro	21fc61c73c	don't put symlink bodies in pagecache into highmem kmap() in page_follow_link_light() needed to go - allowing to hold an arbitrary number of kmaps for long is a great way to deadlocking the system. new helper (inode_nohighmem(inode)) needs to be used for pagecache symlinks inodes; done for all in-tree cases. page_follow_link_light() instrumented to yell about anything missed. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-12-08 22:41:36 -05:00
Al Viro	9cdce3c074	ufs: get rid of ->setattr() for symlinks It was to needed for a couple of months in 2010, until UFS quota support got dropped. Since then it's equivalent to simple_setattr() (i.e. the default) for everything except the regular files. And dropping it there allows to convert all UFS symlinks to {page,simple}_symlink_inode_operations, getting rid of fs/ufs/symlink.c completely. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-12-06 20:43:26 -05:00
Al Viro	bd2843fe1f	fix ufs write vs readpage race when writing into a hole Followup to the UFS series - with the way we clear the new blocks (via buffer cache, possibly on more than a page worth of file) we really should not insert a reference to new block into inode block tree until after we'd cleared it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-09-09 10:43:12 -07:00
Al Viro	4e317ce73a	ufs_inode_get{frag,block}(): get rid of 'phys' argument Just pass NULL as locked_page in case of first block in the indirect chain. Old calling conventions aside, a reason for having 'phys' was that ufs_inode_getfrag() used to be able to do _two_ allocations - indirect block and extending/reallocating a tail. We needed locked_page for the latter (it's a data), but we also needed to figure out that indirect block is metadata. So we used to pass non-NULL locked_page in all cases and used NULL phys as indication of being asked to allocate an indirect. With tail unpacking taken into a separate function we don't need those convolutions anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:40:05 -04:00
Al Viro	0385f1f9e3	ufs_getfrag_block(): tidy up a bit Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:40:04 -04:00
Al Viro	5fbfb238f7	ufs_inode_getblock(): failure to read an indirect block is -EIO ... and not "write to beginning of the disk", TYVM... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:40:03 -04:00
Al Viro	4eeff4c932	ufs_getfrag_block(): turn following indirects into a loop Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:40:02 -04:00
Al Viro	5336970be0	ufs_inode_getfrag(): pass index instead of 'fragment' same story as with ufs_inode_getblock() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:40:01 -04:00
Al Viro	0f3c1294be	ufs_inode_getfrag(): split extending the partial blocks off ufs_extend_tail() is handling that now. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:40:00 -04:00
Al Viro	619cfac091	ufs_inode_getblock(): pass indirect block number and full index ... instead of messing with buffer_head. We can bloody well do sb_bread() in there. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:59 -04:00
Al Viro	721435a767	ufs_inode_getblock(): pass index instead of 'fragment' The value passed to ufs_inode_getblock() as the 3rd argument had lower bits ignored; the upper bits were shifted down and used and they actually make sense - those are _lower_ bits of index in indirect block (i.e. they form the index within a fragment within an indirect block). Pass those as argument. Upper bits of index (i.e. the number of fragment within indirect block) will join them shortly. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:58 -04:00
Al Viro	177848a018	ufs_inode_get{frag,block}(): leave sb_getblk() to caller just return the damn block number Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:57 -04:00
Al Viro	8d9dcf1436	ufs_getfrag_block(): get rid of macro jungles Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:56 -04:00
Al Viro	bbb3eb9d34	ufs_inode_get{frag,block}(): consolidate success exits These calling conventions are rudiments of pre-2.3 times; they really need to be sanitized. This is the first step; next will be _always_ returning a block number, instead of this "return a pointer to buffer_head, except when we get to the actual data" crap. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:55 -04:00
Al Viro	71dd42846f	ufs: use the branch depth in ufs_getfrag_block() we'd already calculated it... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:54 -04:00
Al Viro	4b7068c8b1	ufs: move calculation of offsets into ufs_getfrag_block() ... and massage ufs_frag_map() to take those instead of fragment number. As it is, we duplicate the damn thing on the write side, open-coded and bloody hard to follow. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:53 -04:00
Al Viro	5a39c25562	ufs_inode_get{frag,block}(): get rid of retries We are holding ->truncate_mutex, so nobody else can alter our block pointers. Rechecks/retries were needed back when we only held BKL there, and had to cope with write_begin/writepage and writepage/truncate races. Can't happen anymore... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:52 -04:00
Al Viro	f53bd1421b	__ufs_truncate_blocks(): avoid excessive dirtying of indirect blocks There's a case when an indirect block gets dirtied for no good reason - when there's a hole starting in the middle of area covered by it and spanning past its end, and truncate() is done precisely to the beginning of the hole. The block is obviously not modified at all - all removals happen beyond it. However, existing code ends up dirtying it just in case. It's trivial to fix and while it's not a real bug by any stretch of imagination, it makes the damn thing harder to follow. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:51 -04:00
Al Viro	cc7231e309	free_full_branch(): don't bother modifying the block we are going to free Note that it's already made unreachable from the inode, so we don't have to worry about ufs_frag_map() walking into something already freed. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:50 -04:00
Al Viro	b6eede0ec6	move marking inode dirty to the end of __ufs_truncate_blocks() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:49 -04:00
Al Viro	163073db51	free_full_branch(): saner calling conventions Have caller fetch the block number and remove it from wherever it was. Pass the block number instead. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:48 -04:00
Al Viro	7b4e4f7f81	ufs_trunc_branch(): kill recursion turn recursion into a pair of loops Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:47 -04:00
Al Viro	6aab6dd379	ufs_trunc_branch(): massage towards killing recursion We always have 0 < depth2 <= depth in there, so if (--depth) { if (--depth2) A B } else { C // not using depth2 } D // not using depth2 is equivalent to if (--depth2) A with s/depth/depth - 1/ if (--depth) B else C D Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:46 -04:00
Al Viro	6d1ebbca2b	split ufs_truncate_branch() into full- and partial-branch variants Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:45 -04:00
Al Viro	a138b4b688	ufs: unify the logics for collecting adjacent data blocks to free open-coded in several places... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:44 -04:00
Al Viro	a96574233c	ufs_trunc_branch(): separate the calls with non-NULL offsets Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:43 -04:00
Al Viro	97e0f8f87c	ufs_trunc_branch(): never call with offsets != NULL && depth2 == 0 For calls in __ufs_truncate_blocks() it's just a matter of not incrementing offsets[0] and not making that call - immediately following loop will be executed one extra time and we'll be just fine. For recursive call in ufs_trunc_branch() itself, just assing NULL to offsets if we would be about to make such call. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:42 -04:00
Al Viro	42432739b5	__ufs_trunc_blocks(): turn the part after switch into a loop ... and turn the switch into if (), since all cases with depth != 1 have just become identical. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:41 -04:00
Al Viro	ef3a315d4c	__ufs_truncate_blocks(): unify freeing the full branches Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:40 -04:00
Al Viro	9e0fbbde27	unify ufs_trunc_..indirect() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:39 -04:00
Al Viro	6775e24d9c	ufs_trunc_..indirect(): more massage towards unifying Instead of manually checking that the array contains only zeroes, find the position of the last non-zero (in __ufs_truncate(), where we can conveniently do that) and use that to tell if there's any non-zero in the array tail passed to ufs_trunc_...indirect(). The goal of all that clumsiness is to get fold these functions together. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:38 -04:00
Al Viro	85416288bf	ufs_trunc_...indirect(): pass the array of indices instead of offsets rather than bitslicing the offset just formed as sum of shifted indices, pass the array of those indices itself. NULL is used as equivalent of "all zeroes" (== free the entire branch). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:37 -04:00
Al Viro	7a4fdda724	__ufs_truncate(); find cutoff distances into branches by offsets[] array Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:36 -04:00
Al Viro	7bad5939fc	ufs_trunc_dindirect(): pass the number of blocks to keep same as the previous two. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:35 -04:00
Al Viro	6ac36b8777	ufs_trunc_indirect(): pass the index of the first pointer to free ... instead of file offset. Same cleanups as in the tindirect conversion in previous commit. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:34 -04:00
Al Viro	18ca51d821	ufs_trunc_tindirect(): pass the number of blocks to keep IOW, the distance of cutoff from the begining of the branch (in blocks). That (and the fact that block just prior to cutoff is guaranteed to be present) allows to tell whether to free triple indirect block just by looking at the offset. While we are at it, using u64 for index in the block is wrong - those should be unsigned int. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:33 -04:00
Al Viro	31cd043e1a	ufs: beginning of __ufs_truncate_block() massage Use ufs_block_to_path() to find the cutoff path in the block pointers' tree. For now just use the information about the depth (to bypass the fully preserved subtrees); subsequent commits will use the information about actual path. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:32 -04:00
Al Viro	4e3911f3d7	ufs: the offsets ufs_block_to_path() puts into array are not sector_t type makes no sense - those are indices in block number arrays, not block numbers. And no, UFS is not likely to grow indirect blocks with 4Gpointers in them... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:31 -04:00
Al Viro	010d331fc3	ufs: move truncate code into inode.c It is closely tied to block pointers handling there, can benefit from existing helpers, etc. - no point keeping them apart. Trimmed the trailing whitespaces in inode.c at the same time. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:30 -04:00
Al Viro	0d23cf7616	ufs: no retries are needed on truncate Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:29 -04:00
Al Viro	687857930d	ufs: ufs_trunc_...() has exclusion with everything that might cause allocations Currently - on lock_ufs(), eventually - on per-inode mutex. lock_ufs() used to be mere BKL, which is much weaker, so it needed those rechecks. BKL doesn't provide any exclusion once we lose CPU; its blind replacement, OTOH, _does_. Making that per-filesystem was an atrocity, but at least we can simplify life here. And yes, we certainly need to make that sucker per-inode - these days inode.c and truncate.c uses are needed only to protect the block pointers. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:28 -04:00
Al Viro	6a799d3514	ufs: ufs_trunc_direct() always returns 0 make it return void Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:27 -04:00
Al Viro	dff7cfd36e	ufs: kill lock_ufs() There were 3 remaining users; in two of them we took ->s_lock immediately after lock_ufs() and held it until just before unlock_ufs(); the third one (statfs) could not be called from itself or from other two (remount and sync_fs). Just use ->s_lock in statfs and don't bother with lock_ufs at all. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:26 -04:00
Al Viro	724bb09fdc	ufs: don't use lock_ufs() for block pointers tree protection * stores to block pointers are under per-inode seqlock (meta_lock) and mutex (truncate_mutex) * fetches of block pointers are either under truncate_mutex, or wrapped into seqretry loop on meta_lock * all changes of ->i_size are under truncate_mutex and i_mutex * all changes of ->i_lastfrag are under truncate_mutex It's similar to what ext2 is doing; the main difference is that unlike ext2 we can't rely upon the atomicity of stores into block pointers - on UFS2 they are 64bit. So we can't cut the corner when switching a pointer from NULL to non-NULL as we could in ext2_splice_branch() and need to use meta_lock on all modifications. We use seqlock where ext2 uses rwlock; ext2 could probably also benefit from such change... Another non-trivial difference is that with UFS we cannot have reader grab truncate_mutex in case of race - it has to keep retrying. That might be possible to change, but not until we lift tail unpacking several levels up in call chain. After that commit we do NOT hold fs-wide serialization on accesses to block pointers anymore. Moreover, lock_ufs() can become a normal mutex now - it's only used on statfs, remount and sync_fs and none of those uses are recursive. As the matter of fact, now it can be collapsed with ->s_lock, and be eventually replaced with saner per-cylinder-group spinlocks, but that's a separate story. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:25 -04:00
Al Viro	4af7b2c080	ufs: bforget() indirect blocks before freeing them right now it doesn't matter (lock_ufs() serializes everything), but when we switch to per-inode locking, it will be needed. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:24 -04:00
Al Viro	493b4537a2	ufs: move lock_ufs() down into __ufs_truncate_blocks() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:23 -04:00
Al Viro	2401aa29ab	ufs: move truncate_setsize() down into ufs_truncate() just prior to __ufs_truncate_blocks(), with matching change of calling conventions Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:22 -04:00
Al Viro	3b7a3a05e8	ufs: free excessive blocks upon ->write_begin() failure/short copy Broken in "[PATCH] ufs: truncate should allocate block for last byte"; all way back in 2006. ufs_setattr() hadn't been the only user of vmtruncate() and eliminating ->truncate() method required corrections in a bunch of places. Eventually those places had migrated into ->write_begin() failure exit and ->write_end() after short copy... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:21 -04:00
Al Viro	d622f167b8	ufs: switch ufs_evict_inode() to trimmed-down variant of ufs_truncate() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:20 -04:00
Al Viro	f3e0f3da1b	ufs: kill more lock_ufs() calls a) move it inside ufs_truncate() b) ufs_free_inode() doesn't need it - it's serialized on ->s_lock c) ufs_write_inode() doesn't need it either (and can be called without it anyway). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-07-06 17:39:19 -04:00
Linus Torvalds	1dc51b8288	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more vfs updates from Al Viro: "Assorted VFS fixes and related cleanups (IMO the most interesting in that part are f_path-related things and Eric's descriptor-related stuff). UFS regression fixes (it got broken last cycle). 9P fixes. fs-cache series, DAX patches, Jan's file_remove_suid() work" [ I'd say this is much more than "fixes and related cleanups". The file_table locking rule change by Eric Dumazet is a rather big and fundamental update even if the patch isn't huge. - Linus ] * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (49 commits) 9p: cope with bogus responses from server in p9_client_{read,write} p9_client_write(): avoid double p9_free_req() 9p: forgetting to cancel request on interrupted zero-copy RPC dax: bdev_direct_access() may sleep block: Add support for DAX reads/writes to block devices dax: Use copy_from_iter_nocache dax: Add block size note to documentation fs/file.c: __fget() and dup2() atomicity rules fs/file.c: don't acquire files->file_lock in fd_install() fs:super:get_anon_bdev: fix race condition could cause dev exceed its upper limitation vfs: avoid creation of inode number 0 in get_next_ino namei: make set_root_rcu() return void make simple_positive() public ufs: use dir_pages instead of ufs_dir_pages() pagemap.h: move dir_pages() over there remove the pointless include of lglock.h fs: cleanup slight list_entry abuse xfs: Correctly lock inode when removing suid and file capabilities fs: Call security_ops->inode_killpriv on truncate fs: Provide function telling whether file_remove_privs() will do anything ...	2015-07-04 19:36:06 -07:00
Linus Torvalds	e4bc13adfd	Merge branch 'for-4.2/writeback' of git://git.kernel.dk/linux-block Pull cgroup writeback support from Jens Axboe: "This is the big pull request for adding cgroup writeback support. This code has been in development for a long time, and it has been simmering in for-next for a good chunk of this cycle too. This is one of those problems that has been talked about for at least half a decade, finally there's a solution and code to go with it. Also see last weeks writeup on LWN: http://lwn.net/Articles/648292/" * 'for-4.2/writeback' of git://git.kernel.dk/linux-block: (85 commits) writeback, blkio: add documentation for cgroup writeback support vfs, writeback: replace FS_CGROUP_WRITEBACK with SB_I_CGROUPWB writeback: do foreign inode detection iff cgroup writeback is enabled v9fs: fix error handling in v9fs_session_init() bdi: fix wrong error return value in cgwb_create() buffer: remove unusued 'ret' variable writeback: disassociate inodes from dying bdi_writebacks writeback: implement foreign cgroup inode bdi_writeback switching writeback: add lockdep annotation to inode_to_wb() writeback: use unlocked_inode_to_wb transaction in inode_congested() writeback: implement unlocked_inode_to_wb transaction and use it for stat updates writeback: implement [locked_]inode_to_wb_and_lock_list() writeback: implement foreign cgroup inode detection writeback: make writeback_control track the inode being written back writeback: relocate wb[_try]_get(), wb_put(), inode_{attach\|detach}_wb() mm: vmscan: disable memcg direct reclaim stalling if cgroup writeback support is in use writeback: implement memcg writeback domain based throttling writeback: reset wb_domain->dirty_limit[_tstmp] when memcg domain size changes writeback: implement memcg wb_domain writeback: update wb_over_bg_thresh() to use wb_domain aware operations ...	2015-06-25 16:00:17 -07:00
Fabian Frederick	5d754ced15	ufs: use dir_pages instead of ufs_dir_pages() dir_pages was declared in a lot of filesystems. Use newly dir_pages() from pagemap.h Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-06-23 18:02:01 -04:00
Al Viro	4ef51e8b7a	Merge branch 'for-linus' into for-next	2015-06-17 14:44:05 -04:00
Fabian Frederick	e4f95517f1	fs/ufs: restore s_lock mutex_init() Add last missing line in commit "cdd9eefdf905" ("fs/ufs: restore s_lock mutex") Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-06-17 14:43:02 -04:00
Al Viro	70d45cdb66	ufs: don't touch mtime/ctime of directory being moved See "ext2: Do not update mtime of a moved directory" (and followup in "ext2: fix unbalanced kmap()/kunmap()") for background; this is UFS equivalent - the same problem exists here. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-06-16 02:08:34 -04:00
Al Viro	a50e4a02ad	ufs: don't bother with lock_ufs()/unlock_ufs() for directory access We are already serialized by ->i_mutex and operations on different directories are independent. These calls are just rudiments of blind BKL conversion and they should've been removed back then. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-06-16 02:08:31 -04:00
Jan Kara	514d748f69	ufs: Fix possible deadlock when looking up directories Commit `e4502c63f5` (ufs: deal with nfsd/iget races) made ufs create inodes with I_NEW flag set. However ufs_mkdir() never cleared this flag. Thus if someone ever tried to lookup the directory by inode number, he would deadlock waiting for I_NEW to be cleared. Luckily this mostly happens only if the filesystem is exported over NFS since otherwise we have the inode attached to dentry and don't look it up by inode number. In rare cases dentry can get freed without inode being freed and then we'd hit the deadlock even without NFS export. Fix the problem by clearing I_NEW before instantiating new directory inode. Fixes: `e4502c63f5` Reported-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-06-16 02:08:12 -04:00
Jan Kara	12ecbb4b1d	ufs: Fix warning from unlock_new_inode() Commit `e4502c63f5` (ufs: deal with nfsd/iget races) introduced unlock_new_inode() call into ufs_add_nondir(). However that function gets called also from ufs_link() which hands it already initialized inode and thus unlock_new_inode() complains. The problem is harmless but annoying. Fix the problem by opencoding necessary stuff in ufs_link() Fixes: `e4502c63f5` Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-06-16 02:08:07 -04:00
Fabian Frederick	cdd9eefdf9	fs/ufs: restore s_lock mutex Commit `0244756edc` ("ufs: sb mutex merge + mutex_destroy") generated deadlocks in read/write mode on mkdir. This patch partially reverts it keeping fixes by Andrew Morton and mutex_destroy() [AV: fixed a missing bit in ufs_remount()] Signed-off-by: Fabian Frederick <fabf@skynet.be> Reported-by: Ian Campbell <ian.campbell@citrix.com> Suggested-by: Jan Kara <jack@suse.cz> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Evgeniy Dushistov <dushistov@mail.ru> Cc: Alexey Khoroshilov <khoroshilov@ispras.ru> Cc: Roger Pau Monne <roger.pau@citrix.com> Cc: Ian Jackson <Ian.Jackson@eu.citrix.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-06-16 02:07:38 -04:00
Fabian Frederick	13b987ea27	fs/ufs: revert "ufs: fix deadlocks introduced by sb mutex merge" This reverts commit `9ef7db7f38` ("ufs: fix deadlocks introduced by sb mutex merge") That patch tried to solve commit `0244756edc` ("ufs: sb mutex merge + mutex_destroy") which is itself partially reverted due to multiple deadlocks. Signed-off-by: Fabian Frederick <fabf@skynet.be> Suggested-by: Jan Kara <jack@suse.cz> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Evgeniy Dushistov <dushistov@mail.ru> Cc: Alexey Khoroshilov <khoroshilov@ispras.ru> Cc: Roger Pau Monne <roger.pau@citrix.com> Cc: Ian Jackson <Ian.Jackson@eu.citrix.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2015-06-14 11:31:51 -04:00
Tejun Heo	66114cad64	writeback: separate out include/linux/backing-dev-defs.h With the planned cgroup writeback support, backing-dev related declarations will be more widely used across block and cgroup; unfortunately, including backing-dev.h from include/linux/blkdev.h makes cyclic include dependency quite likely. This patch separates out backing-dev-defs.h which only has the essential definitions and updates blkdev.h to include it. c files which need access to more backing-dev details now include backing-dev.h directly. This takes backing-dev.h off the common include dependency chain making it a lot easier to use it across block and cgroup. v2: fs/fat build failure fixed. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-06-02 08:33:34 -06:00
Al Viro	4b8061a67f	ufs: switch to simple_follow_link() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:18:25 -04:00
David Howells	2b0143b5c9	VFS: normal filesystems (and lustre): d_inode() annotations that's the bulk of filesystem drivers dealing with inodes of their own Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-15 15:06:57 -04:00
Al Viro	5d5d568975	make new_sync_{read,write}() static All places outside of core VFS that checked ->read and ->write for being NULL or called the methods directly are gone now, so NULL {read,write} with non-NULL {read,write}_iter will do the right thing in all cases. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-11 22:29:40 -04:00
Fabian Frederick	ed3ad79f87	fs/ufs/super.c: fix potential race condition Let locking subsystem decide on mutex management. As reported by Andrew Morton this patch fixes a bug: : lock_ufs() is assuming that on non-preempt uniprocessor, the calling : code will run atomically up to the matching unlock_ufs(). : : But that isn't true. The very first site I looked at (ufs_frag_map) : does sb_bread() under lock_ufs(). And sb_bread() will call schedule(), : very commonly. : : The ->mutex_owner stuff is a bit hacky but should work OK. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-17 14:34:51 -08:00
Fabian Frederick	61da3ae241	fs/ufs/super.c: remove unnecessary casting Fix the following coccinelle warning: fs/ufs/super.c:1418:7-28: WARNING: casting value returned by memory allocation function to (struct ufs_inode_info *) is useless. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Evgeniy Dushistov <dushistov@mail.ru> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-17 14:34:51 -08:00
Fabian Frederick	35c0b380d8	fs/ufs/balloc.c: remove unused variable ucg is defined and set in ufs_bitmap_search but never used. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-14 02:18:20 +02:00
Al Viro	e4502c63f5	ufs: deal with nfsd/iget races Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-09-26 21:17:52 -04:00
Alexey Khoroshilov	9ef7db7f38	ufs: fix deadlocks introduced by sb mutex merge Commit `0244756edc` ("ufs: sb mutex merge + mutex_destroy") introduces deadlocks in ufs_new_inode() and ufs_free_inode(). Most callers of that functions acqure the mutex by themselves and ufs_{new,free}_inode() do that via lock_ufs(), i.e we have an unavoidable double lock. The patch proposes to resolve the issue by making sure that ufs_{new,free}_inode() are not called with the mutex held. Found by Linux Driver Verification project (linuxtesting.org). Cc: stable@vger.kernel.org # 3.16 Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-09-07 13:26:39 -04:00
Fabian Frederick	edc023caf4	fs/ufs/inode.c: kernel-doc warning fixes Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-08 15:57:21 -07:00

1 2 3 4 5 ...

388 Commits