OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Daniel Mack	381bf7cad9	fuse: mark variables uninitialized gcc 4.6.3 complains about uninitialized variables in fs/fuse/control.c: CC fs/fuse/control.o fs/fuse/control.c: In function 'fuse_conn_congestion_threshold_write': fs/fuse/control.c:165:29: warning: 'val' may be used uninitialized in this function [-Wuninitialized] fs/fuse/control.c: In function 'fuse_conn_max_background_write': fs/fuse/control.c:128:23: warning: 'val' may be used uninitialized in this function [-Wuninitialized] fuse_conn_limit_write() will always return non-zero unless the &val is modified, so the warning is misleading. Let the compiler know about it by marking 'val' with 'uninitialized_var'. Signed-off-by: Daniel Mack <zonque@gmail.com> Cc: Brian Foster <bfoster@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2012-09-03 17:44:06 +02:00
Miklos Szeredi	8d39d801d6	cuse: kill connection on initialization error Luca Risolia reported that a CUSE daemon will continue to run even if initialization of the emulated device failes for some reason (e.g. the device number is already registered by another driver). This patch disconnects the fuse device on error, which will make the userspace CUSE daemon exit, albeit without indication about what the problem was. Reported-by: Luca Risolia <luca.risolia@studio.unibo.it> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2012-08-30 19:24:35 +02:00
Miklos Szeredi	bbd9979797	cuse: fix fuse_conn_kill() fuse_conn_kill() removed fc->entry, called fuse_ctl_remove_conn() and fuse_bdi_destroy(). None of which is appropriate for cuse cleanup. The fuse_ctl_remove_conn() decrements the nlink on the control filesystem, which is totally bogus. The others are harmless but unnecessary. So move these out from fuse_conn_kill() to fuse_put_super() where they belong. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2012-08-30 19:24:34 +02:00
Linus Torvalds	20fb1936de	Merge branch 'vfs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull vfs fixes from Miklos Szeredi. This mainly fixes some confusion about whether the open 'mode' variable passed around should contain the full file type (S_IFREG etc) information or just the permission mode. In particular, the lack of proper file type information had confused fuse. * 'vfs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: vfs: fix propagation of atomic_open create error on negative dentry fuse: check create mode in atomic open vfs: pass right create mode to may_o_create() vfs: atomic_open(): fix create mode usage vfs: canonicalize create mode in build_open_flags()	2012-08-18 10:02:17 -07:00
Linus Torvalds	2eac9eb8a2	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse updates from Miklos Szeredi. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: verify all ioctl retry iov elements fuse: add missing INIT flag descriptions fuse: add missing INIT flags fuse: update attributes on aio_read fuse: invalidate inode mapping if mtime changes fuse: add FUSE_AUTO_INVAL_DATA init flag	2012-08-16 11:46:31 -07:00
Miklos Szeredi	af109bca94	fuse: check create mode in atomic open Verify that the VFS is passing us a complete create mode with the S_IFREG to atomic open. Reported-by: Steve <steveamigauk@yahoo.co.uk> Reported-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Tested-by: Richard W.M. Jones <rjones@redhat.com>	2012-08-15 13:01:24 +02:00
Zach Brown	fb6ccff667	fuse: verify all ioctl retry iov elements Commit `7572777eef` attempted to verify that the total iovec from the client doesn't overflow iov_length() but it only checked the first element. The iovec could still overflow by starting with a small element. The obvious fix is to check all the elements. The overflow case doesn't look dangerous to the kernel as the copy is limited by the length after the overflow. This fix restores the intention of returning an error instead of successfully copying less than the iovec represented. I found this by code inspection. I built it but don't have a test case. I'm cc:ing stable because the initial commit did as well. Signed-off-by: Zach Brown <zab@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: <stable@vger.kernel.org> [2.6.37+]	2012-08-06 18:19:24 +02:00
Jan Kara	58ef6a75c3	fuse: Convert to new freezing mechanism Convert check in fuse_file_aio_write() to using new freeze protection. CC: fuse-devel@lists.sourceforge.net CC: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-31 09:45:50 +04:00
Miklos Szeredi	69fe05c90e	fuse: add missing INIT flags Add missing flags that userspace derived from the protocol version number. This makes the protocol more flexible. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2012-07-18 16:09:40 +02:00
Brian Foster	a8894274a3	fuse: update attributes on aio_read A fuse-based network filesystem might allow for the inode and/or file data to change unexpectedly. A local client that opens and repeatedly reads a file might never pick up on such changes and indefinitely return stale data. Always invoke fuse_update_attributes() in the read path to cause an attr revalidation when the attributes expire. This leads to a page cache invalidation if necessary and ensures fuse issues new read requests to the fuse client. The original logic (reval only on reads beyond EOF) is preserved unless the client specifies FUSE_AUTO_INVAL_DATA on init. Signed-off-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2012-07-18 16:09:40 +02:00
Brian Foster	eed2179efe	fuse: invalidate inode mapping if mtime changes We currently invalidate the inode address space mapping if the file size changes unexpectedly. In the case of a fuse network filesystem, a portion of a file could be overwritten remotely without changing the file size. Compare the old mtime as well to detect this condition and invalidate the mapping if the file has been updated. The original logic (to ignore changes in mtime) is preserved unless the client specifies FUSE_AUTO_INVAL_DATA on init. Signed-off-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2012-07-18 16:09:40 +02:00
Brian Foster	72d0d248ca	fuse: add FUSE_AUTO_INVAL_DATA init flag FUSE_AUTO_INVAL_DATA is provided to enable updated/auto cache invalidation logic. Signed-off-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2012-07-18 16:09:40 +02:00
Al Viro	ebfc3b49a7	don't pass nameidata to ->create() boolean "does it have to be exclusive?" flag is passed instead; Local filesystem should just ignore it - the object is guaranteed not to be there yet. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-14 16:34:47 +04:00
Al Viro	00cd8dd3bf	stop passing nameidata to ->lookup() Just the flags; only NFS cares even about that, but there are legitimate uses for such argument. And getting rid of that completely would require splitting ->lookup() into a couple of methods (at least), so let's leave that alone for now... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-14 16:34:32 +04:00
Al Viro	0b728e1911	stop passing nameidata * to ->d_revalidate() Just the lookup flags. Die, bastard, die... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-14 16:34:14 +04:00
Al Viro	e45198a6ac	make finish_no_open() return int namely, 1 ;-) That's what we want to return from ->atomic_open() instances after finish_no_open(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-14 16:33:45 +04:00
Al Viro	30d9049474	kill struct opendata Just pass struct file . Methods are happier that way... There's no need to return struct file from finish_open() now, so let it return int. Next: saner prototypes for parts in namei.c Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-14 16:33:39 +04:00
Al Viro	d95852777b	make ->atomic_open() return int Change of calling conventions: old new NULL 1 file 0 ERR_PTR(-ve) -ve Caller knows that struct file *; no need to return it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-14 16:33:35 +04:00
Al Viro	47237687d7	->atomic_open() prototype change - pass int * instead of bool * ... and let finish_open() report having opened the file via that sucker. Next step: don't modify od->filp at all. [AV: FILE_CREATE was already used by cifs; Miklos' fix folded] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-14 16:33:31 +04:00
Miklos Szeredi	c8ccbe032f	fuse: implement i_op->atomic_open() Add an ->atomic_open implementation which replaces the atomic open+create operation implemented via ->create. No functionality is changed. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-14 16:33:14 +04:00
Al Viro	b3d9b7a3c7	vfs: switch i_dentry/d_alias to hlist Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-14 16:32:55 +04:00
Linus Torvalds	f9ba7179ce	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse updates from Miklos Szeredi. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: fix blksize calculation fuse: fix stat call on 32 bit platforms fuse: optimize fallocate on permanent failure fuse: add FALLOCATE operation fuse: Convert to kstrtoul_from_user	2012-06-05 10:11:11 -07:00
Josef Bacik	c3b2da3148	fs: introduce inode operation ->update_time Btrfs has to make sure we have space to allocate new blocks in order to modify the inode, so updating time can fail. We've gotten around this by having our own file_update_time but this is kind of a pain, and Christoph has indicated he would like to make xfs do something different with atime updates. So introduce ->update_time, where we will deal with i_version an a/m/c time updates and indicate which changes need to be made. The normal version just does what it has always done, updates the time and marks the inode dirty, and then filesystems can choose to do something different. I've gone through all of the users of file_update_time and made them check for errors with the exception of the fault code since it's complicated and I wasn't quite sure what to do there, also Jan is going to be pushing the file time updates into page_mkwrite for those who have it so that should satisfy btrfs and make it not a big deal to check the file_update_time() return code in the generic fault path. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com>	2012-06-01 12:07:25 -04:00
Al Viro	b0b0382bb4	->encode_fh() API change pass inode + parent's inode or NULL instead of dentry + bool saying whether we want the parent or not. NOTE: that needs ceph fix folded in. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-29 23:28:33 -04:00
Linus Torvalds	90324cc1b1	avoid iput() from flusher thread -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJPw2J/AAoJECvKgwp+S8Ja5jkP/3uMxkhf8XQpXCI3O1QVfaQr uZFfM8sINqIPDVm1dtFjFj7f8Bw9mhE2KAnnJ1rKT8tQwqq9yAse1QPlhCG1ZqoP +AnMDDXHtx7WmQZXhBvS9b+unpZ7Jr6r6pO5XrmTL2kRL3YJPUhZ2+xbTT5belTB KoAu4WqORZRxfXoC76S7U8K+D4NcAGhAOxCClsIjmY+oocCiCag4FZOyzYIFViqc ghUN/+rLQ3fqGGv2yO7Ylx1gUM7sxIwkZQ/h962jFAtxz9czImr2NmRoMliOaOkS tvcnIf+E3u0n/zIjzFvzhxKgHJPP8PkcPMk60d3jKmFngBkqFTzNUeVTP8md7HrV 4DlXisWr+z7YVyWUCFaNcJLmjiWSwQ8DV/clRLobeBf9EJKan5F1PjFgl6PLJM5F Qr1+LHMNaetdulBwMRTyveZTzYqw9RmDnD9dWMo4mX/kTpvtC4jTPVV7hkRD+Qlv 5vTRR+VXL3Q50yClLf0AQMSKTnH2gBuepM/b+7cShLGfsMln8DtUjmbigv+niL63 BibcCIbIlP2uWGnl37VhsC34AT+RKt3lggrBOpn/7XJMq/wKR7IRP/7V9TfYgaUN NBa+wtnLDa1pZEn/X7izdcQP62PzDtmB+ObvYT0Yb40A4+2ud3qF/lB53c1A1ewF /9c4zxxekjHZnn2oooEa =oLXf -----END PGP SIGNATURE----- Merge tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux Pull writeback tree from Wu Fengguang: "Mainly from Jan Kara to avoid iput() in the flusher threads." * tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux: writeback: Avoid iput() from flusher thread vfs: Rename end_writeback() to clear_inode() vfs: Move waiting for inode writeback from end_writeback() to evict_inode() writeback: Refactor writeback_single_inode() writeback: Remove wb->list_lock from writeback_single_inode() writeback: Separate inode requeueing after writeback writeback: Move I_DIRTY_PAGES handling writeback: Move requeueing when I_SYNC set to writeback_sb_inodes() writeback: Move clearing of I_SYNC into inode_sync_complete() writeback: initialize global_dirty_limit fs: remove 8 bytes of padding from struct writeback_control on 64 bit builds mm: page-writeback.c: local functions should not be exposed globally	2012-05-28 09:54:45 -07:00
Miklos Szeredi	203627bbc9	fuse: fix blksize calculation Don't use inode->i_blkbits which might be stale, instead calculate the blksize information from the freshly obtained attributes. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2012-05-14 17:12:56 +02:00
Pavel Shilovsky	45c72cd73c	fuse: fix stat call on 32 bit platforms Now we store attr->ino at inode->i_ino, return attr->ino at the first time and then return inode->i_ino if the attribute timeout isn't expired. That's wrong on 32 bit platforms because attr->ino is 64 bit and inode->i_ino is 32 bit in this case. Fix this by saving 64 bit ino in fuse_inode structure and returning it every time we call getattr. Also squash attr->ino into inode->i_ino explicitly. Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2012-05-14 17:06:42 +02:00
Jan Kara	dbd5768f87	vfs: Rename end_writeback() to clear_inode() After we moved inode_sync_wait() from end_writeback() it doesn't make sense to call the function end_writeback() anymore. Rename it to clear_inode() which well says what the function really does - set I_CLEAR flag. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>	2012-05-06 13:43:41 +08:00
Miklos Szeredi	519c6040ce	fuse: optimize fallocate on permanent failure If userspace filesystem doesn't support fallocate, remember this and don't send request next time. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2012-04-26 10:56:36 +02:00
Anatol Pomozov	05ba1f0823	fuse: add FALLOCATE operation fallocate filesystem operation preallocates media space for the given file. If fallocate returns success then any subsequent write to the given range never fails with 'not enough space' error. Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2012-04-25 12:25:05 +02:00
Peter Huewe	e2690695ce	fuse: Convert to kstrtoul_from_user This patch replaces the code for getting an number from a userspace buffer by a simple call to kstroul_from_user. This makes it easier to read and less error prone. Signed-off-by: Peter Huewe <peterhuewe@gmx.de> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2012-04-25 12:25:05 +02:00
Linus Torvalds	dbfad21422	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse updates from Miklos Szeredi. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: use flexible array in fuse.h fuse: allow nanosecond granularity fuse: O_DIRECT support for files fuse: fix nlink after unlink	2012-04-18 17:29:05 -07:00
Miklos Szeredi	0a2da9b2ef	fuse: allow nanosecond granularity Derrik Pates reports that an utimensat with a NULL argument results in the current time being sent from the kernel with 1 second granularity. Reported-by: Derrik Pates <demon@now.ai> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2012-04-11 11:45:06 +02:00
Linus Torvalds	e2a0883e40	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile 1 from Al Viro: "This is _not_ all; in particular, Miklos' and Jan's stuff is not there yet." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (64 commits) ext4: initialization of ext4_li_mtx needs to be done earlier debugfs-related mode_t whack-a-mole hfsplus: add an ioctl to bless files hfsplus: change finder_info to u32 hfsplus: initialise userflags qnx4: new helper - try_extent() qnx4: get rid of qnx4_bread/qnx4_getblk take removal of PF_FORKNOEXEC to flush_old_exec() trim includes in inode.c um: uml_dup_mmap() relies on ->mmap_sem being held, but activate_mm() doesn't hold it um: embed ->stub_pages[] into mmu_context gadgetfs: list_for_each_safe() misuse ocfs2: fix leaks on failure exits in module_init ecryptfs: make register_filesystem() the last potential failure exit ntfs: forgets to unregister sysctls on register_filesystem() failure logfs: missing cleanup on register_filesystem() failure jfs: mising cleanup on register_filesystem() failure make configfs_pin_fs() return root dentry on success configfs: configfs_create_dir() has parent dentry in dentry->d_parent configfs: sanitize configfs_create() ...	2012-03-21 13:36:41 -07:00
Al Viro	48fde701af	switch open-coded instances of d_make_root() to new helper Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-03-20 21:29:35 -04:00
Cong Wang	2408f6ef6b	fuse: remove the second argument of k[un]map_atomic() Signed-off-by: Cong Wang <amwang@redhat.com>	2012-03-20 21:48:22 +08:00
Anand Avati	4273b793ec	fuse: O_DIRECT support for files Implement ->direct_IO() method in aops. The ->direct_IO() method combines the existing fuse_direct_read/fuse_direct_write methods to implement O_DIRECT functionality. Reaching ->direct_IO() in the read path via generic_file_aio_read ensures proper synchronization with page cache with its existing framework. Reaching ->direct_IO() in the write path via fuse_file_aio_write is made to come via generic_file_direct_write() which makes it play nice with the page cache w.r.t other mmap pages etc. On files marked 'direct_io' by the filesystem server, IO always follows the fuse_direct_read/write path. There is no effect of fcntl(O_DIRECT) and it always succeeds. On files not marked with 'direct_io' by the filesystem server, the IO path depends on O_DIRECT flag by the application. This can be passed at the time of open() as well as via fcntl(). Note that asynchronous O_DIRECT iocb jobs are completed synchronously always (this has been the case with FUSE even before this patch) Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2012-03-05 15:48:11 +01:00
Miklos Szeredi	ac45d61357	fuse: fix nlink after unlink Anand Avati reports that the following sequence of system calls fail on a fuse filesystem: create("filename") => 0 link("filename", "linkname") => 0 unlink("filename") => 0 link("linkname", "filename") => -ENOENT ### BUG ### vfs_link() fails with ENOENT if i_nlink is zero, this is done to prevent resurrecting already deleted files. Fuse clears i_nlink on unlink even if there are other links pointing to the file. Reported-by: Anand Avati <avati@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2012-03-05 15:48:11 +01:00
Linus Torvalds	6733e54b66	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: FUSE: Notifying the kernel of deletion. fuse: support ioctl on directories fuse: Use kcalloc instead of kzalloc to allocate array fuse: llseek optimize SEEK_CUR and SEEK_SET	2012-01-12 12:39:21 -08:00
Al Viro	34c80b1d93	vfs: switch ->show_options() to struct dentry * Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-06 23:19:54 -05:00
Al Viro	541af6a074	fuse: propagate umode_t Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:55:07 -05:00
Al Viro	1a67aafb5f	switch ->mknod() to umode_t Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:54:54 -05:00
Al Viro	4acdaf27eb	switch ->create() to umode_t vfs_create() ignores everything outside of 16bit subset of its mode argument; switching it to umode_t is obviously equivalent and it's the only caller of the method Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:54:53 -05:00
Al Viro	18bb1db3e7	switch vfs_mkdir() and ->mkdir() to umode_t vfs_mkdir() gets int, but immediately drops everything that might not fit into umode_t and that's the only caller of ->mkdir()... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:54:53 -05:00
Al Viro	6b520e0565	vfs: fix the stupidity with i_dentry in inode destructors Seeing that just about every destructor got that INIT_LIST_HEAD() copied into it, there is no point whatsoever keeping this INIT_LIST_HEAD in inode_init_once(); the cost of taking it into inode_init_always() will be negligible for pipes and sockets and negative for everything else. Not to mention the removal of boilerplate code from ->destroy_inode() instances... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:52:40 -05:00
Linus Torvalds	30aaca4582	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: llseek fix race fuse: fix llseek bug fuse: fix fuse_retrieve	2011-12-14 18:23:35 -08:00
Al Viro	988f032567	fuse: register_filesystem() called too early same story as with ubifs Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-12-13 12:35:14 -05:00
John Muir	451d0f5999	FUSE: Notifying the kernel of deletion. Allows a FUSE file-system to tell the kernel when a file or directory is deleted. If the specified dentry has the specified inode number, the kernel will unhash it. The current 'fuse_notify_inval_entry' does not cause the kernel to clean up directories that are in use properly, and as a result the users of those directories see incorrect semantics from the file-system. The error condition seen when 'fuse_notify_inval_entry' is used to notify of a deleted directory is avoided when 'fuse_notify_delete' is used instead. The following scenario demonstrates the difference: 1. User A chdirs into 'testdir' and starts reading 'testfile'. 2. User B rm -rf 'testdir'. 3. User B creates 'testdir'. 4. User C chdirs into 'testdir'. If you run the above within the same machine on any file-system (including fuse file-systems), there is no problem: user C is able to chdir into the new testdir. The old testdir is removed from the dentry tree, but still open by user A. If operations 2 and 3 are performed via the network such that the fuse file-system uses one of the notify functions to tell the kernel that the nodes are gone, then the following error occurs for user C while user A holds the original directory open: muirj@empacher:~> ls /test/testdir ls: cannot access /test/testdir: No such file or directory The issue here is that the kernel still has a dentry for testdir, and so it is requesting the attributes for the old directory, while the file-system is responding that the directory no longer exists. If on the other hand, if the file-system can notify the kernel that the directory is deleted using the new 'fuse_notify_delete' function, then the above ls will find the new directory as expected. Signed-off-by: John Muir <john@jmuir.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2011-12-13 11:58:49 +01:00
Miklos Szeredi	b18da0c56e	fuse: support ioctl on directories Multiplexing filesystems may want to support ioctls on the underlying files and directores (e.g. FS_IOC_{GET,SET}FLAGS). Ioctl support on directories was missing so add it now. Reported-by: Antonio SJ Musumeci <bile@landofbile.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2011-12-13 11:58:49 +01:00
Thomas Meyer	c411cc88d8	fuse: Use kcalloc instead of kzalloc to allocate array The advantage of kcalloc is, that will prevent integer overflows which could result from the multiplication of number of elements and size and it is also a bit nicer to read. The semantic patch that makes this change is available in https://lkml.org/lkml/2011/11/25/107 Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2011-12-13 11:58:49 +01:00
Miklos Szeredi	c07c3d1934	fuse: llseek optimize SEEK_CUR and SEEK_SET Use generic_file_llseek() instead of open coding the seek function. i_mutex protection is only necessary for SEEK_END (and SEEK_HOLE, SEEK_DATA), so move SEEK_CUR and SEEK_SET out from under i_mutex. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2011-12-13 11:58:48 +01:00
Miklos Szeredi	73104b6e37	fuse: llseek fix race Fix race between lseek(fd, 0, SEEK_CUR) and read/write. This was fixed in generic code by commit `5b6f1eb97d` (vfs: lseek(fd, 0, SEEK_CUR) race condition). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2011-12-13 11:40:59 +01:00
Roel Kluin	b48c6af208	fuse: fix llseek bug The test in fuse_file_llseek() "not SEEK_CUR or not SEEK_SET" always evaluates to true. This was introduced in 3.1 by commit `06222e49` (fs: handle SEEK_HOLE/SEEK_DATA properly in all fs's that define their own llseek) and changed the behavior of SEEK_CUR and SEEK_SET to always retrieve the file attributes. This is a performance regression. Fix the test so that it makes sense. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@vger.kernel.org CC: Josef Bacik <josef@redhat.com> CC: Al Viro <viro@zeniv.linux.org.uk>	2011-12-13 10:37:00 +01:00
Miklos Szeredi	48706d0a91	fuse: fix fuse_retrieve Fix two bugs in fuse_retrieve(): - retrieving more than one page would yield repeated instances of the first page - if more than FUSE_MAX_PAGES_PER_REQ pages were requested than the request page array would overflow fuse_retrieve() was added in 2.6.36 and these bugs had been there since the beginning. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@vger.kernel.org	2011-12-13 10:36:59 +01:00
Linus Torvalds	32aaeffbd4	Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits) Revert "tracing: Include module.h in define_trace.h" irq: don't put module.h into irq.h for tracking irqgen modules. bluetooth: macroize two small inlines to avoid module.h ip_vs.h: fix implicit use of module_get/module_put from module.h nf_conntrack.h: fix up fallout from implicit moduleparam.h presence include: replace linux/module.h with "struct module" wherever possible include: convert various register fcns to macros to avoid include chaining crypto.h: remove unused crypto_tfm_alg_modname() inline uwb.h: fix implicit use of asm/page.h for PAGE_SIZE pm_runtime.h: explicitly requires notifier.h linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h miscdevice.h: fix up implicit use of lists and types stop_machine.h: fix implicit use of smp.h for smp_processor_id of: fix implicit use of errno.h in include/linux/of.h of_platform.h: delete needless include <linux/module.h> acpi: remove module.h include from platform/aclinux.h miscdevice.h: delete unnecessary inclusion of module.h device_cgroup.h: delete needless include <linux/module.h> net: sch_generic remove redundant use of <linux/module.h> net: inet_timewait_sock doesnt need <linux/module.h> ... Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in - drivers/media/dvb/frontends/dibx000_common.c - drivers/media/video/{mt9m111.c,ov6650.c} - drivers/mfd/ab3550-core.c - include/linux/dmaengine.h	2011-11-06 19:44:47 -08:00
Miklos Szeredi	bfe8684869	filesystems: add set_nlink() Replace remaining direct i_nlink updates with a new set_nlink() updater function. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Tested-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2011-11-02 12:53:43 +01:00
Paul Gortmaker	143cb494cb	fs: add module.h to files that were implicitly using it Some files were using the complete module.h infrastructure without actually including the header at all. Fix them up in advance so once the implicit presence is removed, we won't get failures like this: CC [M] fs/nfsd/nfssvc.o fs/nfsd/nfssvc.c: In function 'nfsd_create_serv': fs/nfsd/nfssvc.c:335: error: 'THIS_MODULE' undeclared (first use in this function) fs/nfsd/nfssvc.c:335: error: (Each undeclared identifier is reported only once fs/nfsd/nfssvc.c:335: error: for each function it appears in.) fs/nfsd/nfssvc.c: In function 'nfsd': fs/nfsd/nfssvc.c:555: error: implicit declaration of function 'module_put_and_exit' make[3]: *** [fs/nfsd/nfssvc.o] Error 1 Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2011-10-31 19:30:31 -04:00
Miklos Szeredi	5dfcc87fd7	fuse: fix memory leak kmemleak is reporting that 32 bytes are being leaked by FUSE: unreferenced object 0xe373b270 (size 32): comm "fusermount", pid 1207, jiffies 4294707026 (age 2675.187s) hex dump (first 32 bytes): 01 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<b05517d7>] kmemleak_alloc+0x27/0x50 [<b0196435>] kmem_cache_alloc+0xc5/0x180 [<b02455be>] fuse_alloc_forget+0x1e/0x20 [<b0245670>] fuse_alloc_inode+0xb0/0xd0 [<b01b1a8c>] alloc_inode+0x1c/0x80 [<b01b290f>] iget5_locked+0x8f/0x1a0 [<b0246022>] fuse_iget+0x72/0x1a0 [<b02461da>] fuse_get_root_inode+0x8a/0x90 [<b02465cf>] fuse_fill_super+0x3ef/0x590 [<b019e56f>] mount_nodev+0x3f/0x90 [<b0244e95>] fuse_mount+0x15/0x20 [<b019d1bc>] mount_fs+0x1c/0xc0 [<b01b5811>] vfs_kern_mount+0x41/0x90 [<b01b5af9>] do_kern_mount+0x39/0xd0 [<b01b7585>] do_mount+0x2e5/0x660 [<b01b7966>] sys_mount+0x66/0xa0 This leak report is consistent and happens once per boot on 3.1.0-rc5-dirty. This happens if a FORGET request is queued after the fuse device was released. Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-09-12 11:47:10 -07:00
Miklos Szeredi	24114504c4	fuse: fix flock breakage Commit `37fb3a30b4` ("fuse: fix flock") added in 3.1-rc4 caused flock() to fail with ENOSYS with the kernel ABI version 7.16 or earlier. Fix by falling back to testing FUSE_POSIX_LOCKS for ABI versions 7.16 and earlier. Reported-by: Martin Ziegler <ziegler@email.mathematik.uni-freiburg.de> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Tested-by: Martin Ziegler <ziegler@email.mathematik.uni-freiburg.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-09-12 11:47:10 -07:00
Linus Torvalds	051732bcbe	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: check size of FUSE_NOTIFY_INVAL_ENTRY message fuse: mark pages accessed when written to fuse: delete dead .write_begin and .write_end aops fuse: fix flock fuse: fix non-ANSI void function notation	2011-08-24 09:14:42 -07:00
Miklos Szeredi	c2183d1e9b	fuse: check size of FUSE_NOTIFY_INVAL_ENTRY message FUSE_NOTIFY_INVAL_ENTRY didn't check the length of the write so the message processing could overrun and result in a "kernel BUG at fs/fuse/dev.c:629!" Reported-by: Han-Wen Nienhuys <hanwenn@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org	2011-08-24 10:20:17 +02:00
Johannes Weiner	478e0841b3	fuse: mark pages accessed when written to As fuse does not use the page cache library functions when userspace writes to a file, it did not benefit from 'c8236db mm: mark page accessed before we write_end()' that made sure pages are properly marked accessed when written to. Signed-off-by: Johannes Weiner <jweiner@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2011-08-08 16:08:08 +02:00
Johannes Weiner	b40cdd56df	fuse: delete dead .write_begin and .write_end aops Ever since 'ea9b990 fuse: implement perform_write', the .write_begin and .write_end aops have been dead code. Their task - acquiring a page from the page cache, sending out a write request and releasing the page again - is now done batch-wise to maximize the number of pages send per userspace request. Signed-off-by: Johannes Weiner <jweiner@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2011-08-08 16:08:08 +02:00
Miklos Szeredi	37fb3a30b4	fuse: fix flock Commit `a9ff4f87` "fuse: support BSD locking semantics" overlooked a number of issues with supporing flock locks over existing POSIX locking infrastructure: - it's not backward compatible, passing flock(2) calls to userspace unconditionally (if userspace sets FUSE_POSIX_LOCKS) - it doesn't cater for the fact that flock locks are automatically unlocked on file release - it doesn't take into account the fact that flock exclusive locks (write locks) don't need an fd opened for write. The last one invalidates the original premise of the patch that flock locks can be emulated with POSIX locks. This patch fixes the first two issues. The last one needs to be fixed in userspace if the filesystem assumed that a write lock will happen only on a file operned for write (as in the case of the current fuse library). Reported-by: Sebastian Pipping <webmaster@hartwork.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2011-08-08 16:08:08 +02:00
Linus Torvalds	2dad3206db	Merge branch 'for-3.1' of git://linux-nfs.org/~bfields/linux * 'for-3.1' of git://linux-nfs.org/~bfields/linux: nfsd: don't break lease on CLAIM_DELEGATE_CUR locks: rename lock-manager ops nfsd4: update nfsv4.1 implementation notes nfsd: turn on reply cache for NFSv4 nfsd4: call nfsd4_release_compoundargs from pc_release nfsd41: Deny new lock before RECLAIM_COMPLETE done fs: locks: remove init_once nfsd41: check the size of request nfsd41: error out when client sets maxreq_sz or maxresp_sz too small nfsd4: fix file leak on open_downgrade nfsd4: remember to put RW access on stateid destruction NFSD: Added TEST_STATEID operation NFSD: added FREE_STATEID operation svcrpc: fix list-corrupting race on nfsd shutdown rpc: allow autoloading of gss mechanisms svcauth_unix.c: quiet sparse noise svcsock.c: include sunrpc.h to quiet sparse noise nfsd: Remove deprecated nfsctl system call and related code. NFSD: allow OP_DESTROY_CLIENTID to be only op in COMPOUND Fix up trivial conflicts in Documentation/feature-removal-schedule.txt	2011-07-25 22:49:19 -07:00
Josef Bacik	02c24a8218	fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers Btrfs needs to be able to control how filemap_write_and_wait_range() is called in fsync to make it less of a painful operation, so push down taking i_mutex and the calling of filemap_write_and_wait() down into the ->fsync() handlers. Some file systems can drop taking the i_mutex altogether it seems, like ext3 and ocfs2. For correctness sake I just pushed everything down in all cases to make sure that we keep the current behavior the same for everybody, and then each individual fs maintainer can make up their mind about what to do from there. Thanks, Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-07-20 20:47:59 -04:00
Josef Bacik	06222e491e	fs: handle SEEK_HOLE/SEEK_DATA properly in all fs's that define their own llseek This converts everybody to handle SEEK_HOLE/SEEK_DATA properly. In some cases we just return -EINVAL, in others we do the normal generic thing, and in others we're simply making sure that the properly due-dilligence is done. For example in NFS/CIFS we need to make sure the file size is update properly for the SEEK_HOLE and SEEK_DATA case, but since it calls the generic llseek stuff itself that is all we have to do. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-07-20 20:47:58 -04:00
J. Bruce Fields	8fb47a4fbf	locks: rename lock-manager ops Both the filesystem and the lock manager can associate operations with a lock. Confusingly, one of them (fl_release_private) actually has the same name in both operation structures. It would save some confusion to give the lock-manager ops different names. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-07-20 20:23:19 -04:00
Al Viro	dd7dd556e4	no need to check for LOOKUP_OPEN in ->create() instances ... it will be set in nd->flag for all cases with non-NULL nd (i.e. when called from do_last()). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-07-20 01:43:56 -04:00
Al Viro	8a5e929dd2	don't transliterate lower bits of ->intent.open.flags to FMODE_... ->create() instances are much happier that way... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-07-20 01:43:52 -04:00
Al Viro	10556cb21a	->permission() sanitizing: don't pass flags to ->permission() not used by the instances anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-07-20 01:43:24 -04:00
Al Viro	2830ba7f34	->permission() sanitizing: don't pass flags to generic_permission() redundant; all callers get it duplicated in mask & MAY_NOT_BLOCK and none of them removes that bit. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-07-20 01:43:22 -04:00
Al Viro	178ea73521	kill check_acl callback of generic_permission() its value depends only on inode and does not change; we might as well store it in ->i_op->check_acl and be done with that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-07-20 01:43:16 -04:00
Al Viro	9e1f1de02c	more conservative S_NOSEC handling Caching "we have already removed suid/caps" was overenthusiastic as merged. On network filesystems we might have had suid/caps set on another client, silently picked by this client on revalidate, all of that without clearing the S_NOSEC flag. AFAICS, the only reasonably sane way to deal with that is * new superblock flag; unless set, S_NOSEC is not going to be set. * local block filesystems set it in their ->mount() (more accurately, mount_bdev() does, so does btrfs ->mount(), users of mount_bdev() other than local block ones clear it) * if any network filesystem (or a cluster one) wants to use S_NOSEC, it'll need to set MS_NOSEC in sb->s_flags AND take care to clear S_NOSEC when inode attribute changes are picked from other clients. It's not an earth-shattering hole (anybody that can set suid on another client will almost certainly be able to write to the file before doing that anyway), but it's a bug that needs fixing. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-06-03 18:24:58 -04:00
Randy Dunlap	a2daff6803	fuse: fix non-ANSI void function notation Fix void function parameter list sparse warning: fs/fuse/inode.c:74:44: warning: non-ANSI function declaration of function 'fuse_alloc_forget' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2011-06-01 16:09:32 +02:00
Sage Weil	526e7ce552	fuse: remove unnecessary dentry_unhash on rmdir, dir rename Fuse has no problems with references to unlinked directories. CC: Miklos Szeredi <miklos@szeredi.hu> CC: fuse-devel@lists.sourceforge.net Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-05-28 01:02:53 -04:00
Linus Torvalds	32e51f141f	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (25 commits) cifs: remove unnecessary dentry_unhash on rmdir/rename_dir ocfs2: remove unnecessary dentry_unhash on rmdir/rename_dir exofs: remove unnecessary dentry_unhash on rmdir/rename_dir nfs: remove unnecessary dentry_unhash on rmdir/rename_dir ext2: remove unnecessary dentry_unhash on rmdir/rename_dir ext3: remove unnecessary dentry_unhash on rmdir/rename_dir ext4: remove unnecessary dentry_unhash on rmdir/rename_dir btrfs: remove unnecessary dentry_unhash in rmdir/rename_dir ceph: remove unnecessary dentry_unhash calls vfs: clean up vfs_rename_other vfs: clean up vfs_rename_dir vfs: clean up vfs_rmdir vfs: fix vfs_rename_dir for FS_RENAME_DOES_D_MOVE filesystems libfs: drop unneeded dentry_unhash vfs: update dentry_unhash() comment vfs: push dentry_unhash on rename_dir into file systems vfs: push dentry_unhash on rmdir into file systems vfs: remove dget() from dentry_unhash() vfs: dentry_unhash immediately prior to rmdir vfs: Block mmapped writes while the fs is frozen ...	2011-05-26 09:52:14 -07:00
Sage Weil	e4eaac06bc	vfs: push dentry_unhash on rename_dir into file systems Only a few file systems need this. Start by pushing it down into each rename method (except gfs2 and xfs) so that it can be dealt with on a per-fs basis. Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-05-26 07:26:48 -04:00
Sage Weil	79bf7c732b	vfs: push dentry_unhash on rmdir into file systems Only a few file systems need this. Start by pushing it down into each fs rmdir method (except gfs2 and xfs) so it can be dealt with on a per-fs basis. This does not change behavior for any in-tree file systems. Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-05-26 07:26:47 -04:00
Miklos Szeredi	d24339059d	fuse: fix oops in revalidate when called with NULL nameidata Some cases (e.g. ecryptfs) can call ->dentry_revalidate with NULL nameidata. https://bugzilla.kernel.org/show_bug.cgi?id=34732 Tyler Hicks pointed out that this bug was introduced by commit `e7c0a16786` "fuse: make fuse_dentry_revalidate() RCU aware" Reported-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2011-05-10 17:35:58 +02:00
Lucas De Marchi	25985edced	Fix common misspellings Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>	2011-03-31 11:26:23 -03:00
Linus Torvalds	6c51038900	Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits) Documentation/iostats.txt: bit-size reference etc. cfq-iosched: removing unnecessary think time checking cfq-iosched: Don't clear queue stats when preempt. blk-throttle: Reset group slice when limits are changed blk-cgroup: Only give unaccounted_time under debug cfq-iosched: Don't set active queue in preempt block: fix non-atomic access to genhd inflight structures block: attempt to merge with existing requests on plug flush block: NULL dereference on error path in __blkdev_get() cfq-iosched: Don't update group weights when on service tree fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away block: Require subsystems to explicitly allocate bio_set integrity mempool jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging fs: make fsync_buffers_list() plug mm: make generic_writepages() use plugging blk-cgroup: Add unaccounted time to timeslice_used. block: fixup plugging stubs for !CONFIG_BLOCK block: remove obsolete comments for blkdev_issue_zeroout. blktrace: Use rq->cmd_flags directly in blk_add_trace_rq. ... Fix up conflicts in fs/{aio.c,super.c}	2011-03-24 10:16:26 -07:00
Miklos Szeredi	ef6a3c6311	mm: add replace_page_cache_page() function This function basically does: remove_from_page_cache(old); page_cache_release(old); add_to_page_cache_locked(new); Except it does this atomically, so there's no possibility for the "add" to fail because of a race. If memory cgroups are enabled, then the memory cgroup charge is also moved from the old page to the new. This function is currently used by fuse to move pages into the page cache on read, instead of copying the page contents. [minchan.kim@gmail.com: add freepage() hook to replace_page_cache_page()] Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Minchan Kim <minchan.kim@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-03-22 17:44:02 -07:00
Linus Torvalds	f741a79e98	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: make fuse_dentry_revalidate() RCU aware fuse: make fuse_permission() RCU aware fuse: wakeup pollers on connection release/abort fuse: reduce size of struct fuse_request	2011-03-22 10:42:43 -07:00
Miklos Szeredi	e7c0a16786	fuse: make fuse_dentry_revalidate() RCU aware Only bail out of fuse_dentry_revalidate() on LOOKUP_RCU when blocking is actually necessary. CC: Nick Piggin <npiggin@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2011-03-21 13:58:06 +01:00
Miklos Szeredi	19690ddb65	fuse: make fuse_permission() RCU aware Only bail out of fuse_permission() on IPERM_FLAG_RCU when blocking is actually necessary. CC: Nick Piggin <npiggin@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2011-03-21 13:58:06 +01:00
Bryan Green	357ccf2b69	fuse: wakeup pollers on connection release/abort If a fuse dev connection is broken, wake up any processes that are blocking, in a poll system call, on one of the files in the now defunct filesystem. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2011-03-21 13:58:05 +01:00
Miklos Szeredi	07d5f69b45	fuse: reduce size of struct fuse_request Reduce the size of struct fuse_request by removing cuse_init_out from the request structure and allocating it dinamically instead. CC: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2011-03-21 13:58:05 +01:00
Linus Torvalds	e16b396ce3	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (47 commits) doc: CONFIG_UNEVICTABLE_LRU doesn't exist anymore Update cpuset info & webiste for cgroups dcdbas: force SMI to happen when expected arch/arm/Kconfig: remove one to many l's in the word. asm-generic/user.h: Fix spelling in comment drm: fix printk typo 'sracth' Remove one to many n's in a word Documentation/filesystems/romfs.txt: fixing link to genromfs drivers:scsi Change printk typo initate -> initiate serial, pch uart: Remove duplicate inclusion of linux/pci.h header fs/eventpoll.c: fix spelling mm: Fix out-of-date comments which refers non-existent functions drm: Fix printk typo 'failled' coh901318.c: Change initate to initiate. mbox-db5500.c Change initate to initiate. edac: correct i82975x error-info reported edac: correct i82975x mci initialisation edac: correct commented info fs: update comments to point correct document target: remove duplicate include of target/target_core_device.h from drivers/target/target_core_hba.c ... Trivial conflict in fs/eventpoll.c (spelling vs addition)	2011-03-18 10:37:40 -07:00
Aneesh Kumar K.V	5fe0c23788	exportfs: Return the minimum required handle size The exportfs encode handle function should return the minimum required handle size. This helps user to find out the handle size by passing 0 handle size in the first step and then redoing to the call again with the returned handle size value. Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-03-14 09:15:28 -04:00
Al Viro	529c5f958f	fuse: fix d_revalidate oopsen on NFS exports can't blindly check nd->flags in ->d_revalidate() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-03-10 03:44:31 -05:00
Jens Axboe	4c63f5646e	Merge branch 'for-2.6.39/stack-plug' into for-2.6.39/core Conflicts: block/blk-core.c block/blk-flush.c drivers/md/raid1.c drivers/md/raid10.c drivers/md/raid5.c fs/nilfs2/btnode.c fs/nilfs2/mdt.c Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-03-10 08:58:35 +01:00
Jens Axboe	7eaceaccab	block: remove per-queue plugging Code has been converted over to the new explicit on-stack plugging, and delay users have been converted to use the new API for that. So lets kill off the old plugging along with aops->sync_page(). Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-03-10 08:52:07 +01:00
Miklos Szeredi	8d56addd70	fuse: fix truncate after open Commit `e1181ee6` "vfs: pass struct file to do_truncate on O_TRUNC opens" broke the behavior of open(O_TRUNC\|O_RDONLY) in fuse. Fuse assumed that when called from open, a truncate() will be done, not an ftruncate(). Fix by restoring the old behavior, based on the ATTR_OPEN flag. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2011-02-25 14:44:58 +01:00
Miklos Szeredi	5a18ec176c	fuse: fix hang of single threaded fuseblk filesystem Single threaded NTFS-3G could get stuck if a delayed RELEASE reply triggered a DESTROY request via path_put(). Fix this by a) making RELEASE requests synchronous, whenever possible, on fuseblk filesystems b) if not possible (triggered by an asynchronous read/write) then do the path_put() in a separate thread with schedule_work(). Reported-by: Oliver Neukum <oneukum@suse.de> Cc: stable@kernel.org Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2011-02-25 14:44:58 +01:00
Paul Bolle	8272f4c9c5	fuse/cuse: fix comment typo initilaization Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Reviewed-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-02-15 10:26:38 +01:00
Al Viro	c35eebe993	switch fuse Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-01-12 20:02:44 -05:00
Linus Torvalds	7d44b04401	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: fix ioctl ABI fuse: allow batching of FORGET requests fuse: separate queue for FORGET requests fuse: ioctl cleanup Fix up trivial conflict in fs/fuse/inode.c due to RCU lookup having done the RCU-freeing of the inode in fuse_destroy_inode().	2011-01-10 07:43:54 -08:00
Nick Piggin	b74c79e993	fs: provide rcu-walk aware permission i_ops Signed-off-by: Nick Piggin <npiggin@kernel.dk>	2011-01-07 17:50:29 +11:00
Nick Piggin	34286d6662	fs: rcu-walk aware d_revalidate method Require filesystems be aware of .d_revalidate being called in rcu-walk mode (nd->flags & LOOKUP_RCU). For now do a simple push down, returning -ECHILD from all implementations. Signed-off-by: Nick Piggin <npiggin@kernel.dk>	2011-01-07 17:50:29 +11:00
Nick Piggin	fb045adb99	fs: dcache reduce branches in lookup path Reduce some branches and memory accesses in dcache lookup by adding dentry flags to indicate common d_ops are set, rather than having to check them. This saves a pointer memory access (dentry->d_op) in common path lookup situations, and saves another pointer load and branch in cases where we have d_op but not the particular operation. Patched with: git grep -E '[.>]([[:space:]])d_op([[:space:]])=' \| xargs sed -e 's/$[^\t ]$->d_op = $.$;/d_set_d_op(\1, \2);/' -e 's/$[^\t ]$\.d_op = $.$;/d_set_d_op(\&\1, \2);/' -i Signed-off-by: Nick Piggin <npiggin@kernel.dk>	2011-01-07 17:50:28 +11:00
Nick Piggin	fa0d7e3de6	fs: icache RCU free inodes RCU free the struct inode. This will allow: - Subsequent store-free path walking patch. The inode must be consulted for permissions when walking, so an RCU inode reference is a must. - sb_inode_list_lock to be moved inside i_lock because sb list walkers who want to take i_lock no longer need to take sb_inode_list_lock to walk the list in the first place. This will simplify and optimize locking. - Could remove some nested trylock loops in dcache code - Could potentially simplify things a bit in VM land. Do not need to take the page lock to follow page->mapping. The downsides of this is the performance cost of using RCU. In a simple creat/unlink microbenchmark, performance drops by about 10% due to inability to reuse cache-hot slab objects. As iterations increase and RCU freeing starts kicking over, this increases to about 20%. In cases where inode lifetimes are longer (ie. many inodes may be allocated during the average life span of a single inode), a lot of this cache reuse is not applicable, so the regression caused by this patch is smaller. The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU, however this adds some complexity to list walking and store-free path walking, so I prefer to implement this at a later date, if it is shown to be a win in real situations. I haven't found a regression in any non-micro benchmark so I doubt it will be a problem. Signed-off-by: Nick Piggin <npiggin@kernel.dk>	2011-01-07 17:50:26 +11:00
Miklos Szeredi	1baa26b2be	fuse: fix ioctl ABI In kernel ABI version 7.16 and later FUSE_IOCTL_RETRY reply from a unrestricted IOCTL request shall return with an array of 'struct fuse_ioctl_iovec' instead of 'struct iovec'. This fixes the ABI ambiguity of 32bit vs. 64bit. Reported-by: "ccmail111" <ccmail111@yahoo.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: Tejun Heo <tj@kernel.org>	2010-12-07 20:16:56 +01:00
Miklos Szeredi	02c048b919	fuse: allow batching of FORGET requests Terje Malmedal reports that a fuse filesystem with 32 million inodes on a machine with lots of memory can take up to 30 minutes to process FORGET requests when all those inodes are evicted from the icache. To solve this, create a BATCH_FORGET request that allows up to about 8000 FORGET requests to be sent in a single message. This request is only sent if userspace supports interface version 7.16 or later, otherwise fall back to sending individual FORGET messages. Reported-by: Terje Malmedal <terje.malmedal@usit.uio.no> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-12-07 20:16:56 +01:00
Miklos Szeredi	07e77dca8a	fuse: separate queue for FORGET requests Terje Malmedal reports that a fuse filesystem with 32 million inodes on a machine with lots of memory can go unresponsive for up to 30 minutes when all those inodes are evicted from the icache. The reason is that FORGET messages, sent when the inode is evicted, are queued up together with regular filesystem requests, and while the huge queue of FORGET messages are processed no other filesystem operation can proceed. Since a full fuse request structure is allocated for each inode, these take up quite a bit of memory as well. To solve these issues, create a slim 'fuse_forget_link' structure containing just the minimum of information required to send the FORGET request and chain these on a separate queue. When userspace is asking for a request make sure that FORGET and non-FORGET requests are selected fairly: for each 8 non-FORGET allow 16 FORGET requests. This will make sure FORGETs do not pile up, yet other requests are also allowed to proceed while the queued FORGETs are processed. Reported-by: Terje Malmedal <terje.malmedal@usit.uio.no> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-12-07 20:16:56 +01:00
Miklos Szeredi	8ac835056c	fuse: ioctl cleanup Get rid of unnecessary page_address()-es. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: Tejun Heo <tj@kernel.org>	2010-12-07 20:16:56 +01:00
Miklos Szeredi	7572777eef	fuse: verify ioctl retries Verify that the total length of the iovec returned in FUSE_IOCTL_RETRY doesn't overflow iov_length(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: Tejun Heo <tj@kernel.org> CC: <stable@kernel.org> [2.6.31+]	2010-11-30 16:39:27 +01:00
Miklos Szeredi	d9d318d39d	fuse: fix ioctl when server is 32bit If a 32bit CUSE server is run on 64bit this results in EIO being returned to the caller. The reason is that FUSE_IOCTL_RETRY reply was defined to use 'struct iovec', which is different on 32bit and 64bit archs. Work around this by looking at the size of the reply to determine which struct was used. This is only needed if CONFIG_COMPAT is defined. A more permanent fix for the interface will be to use the same struct on both 32bit and 64bit. Reported-by: "ccmail111" <ccmail111@yahoo.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: Tejun Heo <tj@kernel.org> CC: <stable@kernel.org> [2.6.31+]	2010-11-30 16:39:27 +01:00
Ken Sumrall	a0822c5577	fuse: fix attributes after open(O_TRUNC) The attribute cache for a file was not being cleared when a file is opened with O_TRUNC. If the filesystem's open operation truncates the file ("atomic_o_trunc" feature flag is set) then the kernel should invalidate the cached st_mtime and st_ctime attributes. Also i_size should be explicitly be set to zero as it is used sometimes without refreshing the cache. Signed-off-by: Ken Sumrall <ksumrall@android.com> Cc: Anfei <anfei.zhou@gmail.com> Cc: "Anand V. Avati" <avati@gluster.com> Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-11-25 06:50:41 +09:00
Al Viro	3c26ff6e49	convert get_sb_nodev() users Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-10-29 04:16:31 -04:00
Al Viro	fc14f2fef6	convert get_sb_single() users Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-10-29 04:16:28 -04:00
Al Viro	152a083666	new helper: mount_bdev() ... and switch of the obvious get_sb_bdev() users to ->mount() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-10-29 04:16:13 -04:00
Miklos Szeredi	0be8557bcd	fuse: use release_pages() Replace iterated page_cache_release() with release_pages(), which is faster and shorter. Needs release_pages() to be exported to modules. Suggested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-27 18:03:17 -07:00
Linus Torvalds	426e1f5cec	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (52 commits) split invalidate_inodes() fs: skip I_FREEING inodes in writeback_sb_inodes fs: fold invalidate_list into invalidate_inodes fs: do not drop inode_lock in dispose_list fs: inode split IO and LRU lists fs: switch bdev inode bdi's correctly fs: fix buffer invalidation in invalidate_list fsnotify: use dget_parent smbfs: use dget_parent exportfs: use dget_parent fs: use RCU read side protection in d_validate fs: clean up dentry lru modification fs: split __shrink_dcache_sb fs: improve DCACHE_REFERENCED usage fs: use percpu counter for nr_dentry and nr_dentry_unused fs: simplify __d_free fs: take dcache_lock inside __d_path fs: do not assign default i_ino in new_inode fs: introduce a per-cpu last_ino allocator new helper: ihold() ...	2010-10-26 17:58:44 -07:00
Miklos Szeredi	b6777c40c7	fuse: use clear_highpage() and KM_USER0 instead of KM_USER1 Commit `7909b1c640` ("fuse: don't use atomic kmap") removed KM_USER0 usage from fuse/dev.c. Switch KM_USER1 uses to KM_USER0 for clarity. Also replace open coded clear_highpage(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-26 16:52:13 -07:00
Jan Beulich	3ecb01df32	use clear_page()/copy_page() in favor of memset()/memcpy() on whole pages After all that's what they are intended for. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-26 16:52:13 -07:00
Christoph Hellwig	85fe4025c6	fs: do not assign default i_ino in new_inode Instead of always assigning an increasing inode number in new_inode move the call to assign it into those callers that actually need it. For now callers that need it is estimated conservatively, that is the call is added to all filesystems that do not assign an i_ino by themselves. For a few more filesystems we can avoid assigning any inode number given that they aren't user visible, and for others it could be done lazily when an inode number is actually needed, but that's left for later patches. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-10-25 21:26:11 -04:00
Linus Torvalds	092e0e7e52	Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl * 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: vfs: make no_llseek the default vfs: don't use BKL in default_llseek llseek: automatically add .llseek fop libfs: use generic_file_llseek for simple_attr mac80211: disallow seeks in minstrel debug code lirc: make chardev nonseekable viotape: use noop_llseek raw: use explicit llseek file operations ibmasmfs: use generic_file_llseek spufs: use llseek in all file operations arm/omap: use generic_file_llseek in iommu_debug lkdtm: use generic_file_llseek in debugfs net/wireless: use generic_file_llseek in debugfs drm: use noop_llseek	2010-10-22 10:52:56 -07:00
Arnd Bergmann	6038f373a3	llseek: automatically add .llseek fop All file_operations should get a .llseek operation so we can make nonseekable_open the default for future file operations without a .llseek pointer. The three cases that we can automatically detect are no_llseek, seq_lseek and default_llseek. For cases where we can we can automatically prove that the file offset is always ignored, we use noop_llseek, which maintains the current behavior of not returning an error from a seek. New drivers should normally not use noop_llseek but instead use no_llseek and call nonseekable_open at open time. Existing drivers can be converted to do the same when the maintainer knows for certain that no user code relies on calling seek on the device file. The generated code is often incorrectly indented and right now contains comments that clarify for each added line why a specific variant was chosen. In the version that gets submitted upstream, the comments will be gone and I will manually fix the indentation, because there does not seem to be a way to do that using coccinelle. Some amount of new code is currently sitting in linux-next that should get the same modifications, which I will do at the end of the merge window. Many thanks to Julia Lawall for helping me learn to write a semantic patch that does all this. ===== begin semantic patch ===== // This adds an llseek= method to all file operations, // as a preparation for making no_llseek the default. // // The rules are // - use no_llseek explicitly if we do nonseekable_open // - use seq_lseek for sequential files // - use default_llseek if we know we access f_pos // - use noop_llseek if we know we don't access f_pos, // but we still want to allow users to call lseek // @ open1 exists @ identifier nested_open; @@ nested_open(...) { <+... nonseekable_open(...) ...+> } @ open exists@ identifier open_f; identifier i, f; identifier open1.nested_open; @@ int open_f(struct inode i, struct file f) { <+... ( nonseekable_open(...) \| nested_open(...) ) ...+> } @ read disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t read_f(struct file f, char p, size_t s, loff_t off) { <+... ( off = E \| off += E \| func(..., off, ...) \| E = off ) ...+> } @ read_no_fpos disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t read_f(struct file f, char p, size_t s, loff_t off) { ... when != off } @ write @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t write_f(struct file f, const char p, size_t s, loff_t off) { <+... ( off = E \| off += E \| func(..., off, ...) \| E = off ) ...+> } @ write_no_fpos @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t write_f(struct file f, const char p, size_t s, loff_t off) { ... when != off } @ fops0 @ identifier fops; @@ struct file_operations fops = { ... }; @ has_llseek depends on fops0 @ identifier fops0.fops; identifier llseek_f; @@ struct file_operations fops = { ... .llseek = llseek_f, ... }; @ has_read depends on fops0 @ identifier fops0.fops; identifier read_f; @@ struct file_operations fops = { ... .read = read_f, ... }; @ has_write depends on fops0 @ identifier fops0.fops; identifier write_f; @@ struct file_operations fops = { ... .write = write_f, ... }; @ has_open depends on fops0 @ identifier fops0.fops; identifier open_f; @@ struct file_operations fops = { ... .open = open_f, ... }; // use no_llseek if we call nonseekable_open //////////////////////////////////////////// @ nonseekable1 depends on !has_llseek && has_open @ identifier fops0.fops; identifier nso ~= "nonseekable_open"; @@ struct file_operations fops = { ... .open = nso, ... +.llseek = no_llseek, /* nonseekable / }; @ nonseekable2 depends on !has_llseek @ identifier fops0.fops; identifier open.open_f; @@ struct file_operations fops = { ... .open = open_f, ... +.llseek = no_llseek, / open uses nonseekable / }; // use seq_lseek for sequential files ///////////////////////////////////// @ seq depends on !has_llseek @ identifier fops0.fops; identifier sr ~= "seq_read"; @@ struct file_operations fops = { ... .read = sr, ... +.llseek = seq_lseek, / we have seq_read / }; // use default_llseek if there is a readdir /////////////////////////////////////////// @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier readdir_e; @@ // any other fop is used that changes pos struct file_operations fops = { ... .readdir = readdir_e, ... +.llseek = default_llseek, / readdir is present / }; // use default_llseek if at least one of read/write touches f_pos ///////////////////////////////////////////////////////////////// @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read.read_f; @@ // read fops use offset struct file_operations fops = { ... .read = read_f, ... +.llseek = default_llseek, / read accesses f_pos / }; @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, ... + .llseek = default_llseek, / write accesses f_pos / }; // Use noop_llseek if neither read nor write accesses f_pos /////////////////////////////////////////////////////////// @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; identifier write_no_fpos.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, .read = read_f, ... +.llseek = noop_llseek, / read and write both use no f_pos / }; @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write_no_fpos.write_f; @@ struct file_operations fops = { ... .write = write_f, ... +.llseek = noop_llseek, / write uses no f_pos / }; @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; @@ struct file_operations fops = { ... .read = read_f, ... +.llseek = noop_llseek, / read uses no f_pos / }; @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; @@ struct file_operations fops = { ... +.llseek = noop_llseek, / no read or write fn */ }; ===== End semantic patch ===== Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Julia Lawall <julia@diku.dk> Cc: Christoph Hellwig <hch@infradead.org>	2010-10-15 15:53:27 +02:00
Geert Uytterhoeven	0157443c56	fuse: Initialize total_len in fuse_retrieve() fs/fuse/dev.c:1357: warning: ‘total_len’ may be used uninitialized in this function Initialize total_len to zero, else its value will be undefined. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-10-04 10:45:32 +02:00
Miklos Szeredi	b9ca67b2dd	fuse: fix lock annotations Sparse doesn't understand lock annotations of the form __releases(&foo->lock). Change them to __releases(foo->lock). Same for __acquires(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-09-07 13:42:41 +02:00
Miklos Szeredi	595afaf9e6	fuse: flush background queue on connection close David Bartly reported that fuse can hang in fuse_get_req_nofail() when the connection to the filesystem server is no longer active. If bg_queue is not empty then flush_bg_queue() called from request_end() can put more requests on to the pending queue. If this happens while ending requests on the processing queue then those background requests will be queued to the pending list and never ended. Another problem is that fuse_dev_release() didn't wake up processes sleeping on blocked_waitq. Solve this by: a) flushing the background queue before calling end_requests() on the pending and processing queues b) setting blocked = 0 and waking up processes waiting on blocked_waitq() Thanks to David for an excellent bug report. Reported-by: David Bartley <andareed@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org	2010-09-07 13:42:41 +02:00
Linus Torvalds	5f248c9c25	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (96 commits) no need for list_for_each_entry_safe()/resetting with superblock list Fix sget() race with failing mount vfs: don't hold s_umount over close_bdev_exclusive() call sysv: do not mark superblock dirty on remount sysv: do not mark superblock dirty on mount btrfs: remove junk sb_dirt change BFS: clean up the superblock usage AFFS: wait for sb synchronization when needed AFFS: clean up dirty flag usage cifs: truncate fallout mbcache: fix shrinker function return value mbcache: Remove unused features add f_flags to struct statfs(64) pass a struct path to vfs_statfs update VFS documentation for method changes. All filesystems that need invalidate_inode_buffers() are doing that explicitly convert remaining ->clear_inode() to ->evict_inode() Make ->drop_inode() just return whether inode needs to be dropped fs/inode.c:clear_inode() is gone fs/inode.c:evict() doesn't care about delete vs. non-delete paths now ... Fix up trivial conflicts in fs/nilfs2/super.c	2010-08-10 11:26:52 -07:00
Al Viro	b57922d97f	convert remaining ->clear_inode() to ->evict_inode() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-08-09 16:48:37 -04:00
Christoph Hellwig	2c27c65ed0	check ATTR_SIZE contraints in inode_change_ok Make sure we check the truncate constraints early on in ->setattr by adding those checks to inode_change_ok. Also clean up and document inode_change_ok to make this obvious. As a fallout we don't have to call inode_newsize_ok from simple_setsize and simplify it down to a truncate_setsize which doesn't return an error. This simplifies a lot of setattr implementations and means we use truncate_setsize almost everywhere. Get rid of fat_setsize now that it's trivial and mark ext2_setsize static to make the calling convention obvious. Keep the inode_newsize_ok in vmtruncate for now as all callers need an audit for its removal anyway. Note: setattr code in ecryptfs doesn't call inode_change_ok at all and needs a deeper audit, but that is left for later. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-08-09 16:47:39 -04:00
Christoph Hellwig	db78b877f7	always call inode_change_ok early in ->setattr Make sure we call inode_change_ok before doing any changes in ->setattr, and make sure to call it even if our fs wants to ignore normal UNIX permissions, but use the ATTR_FORCE to skip those. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-08-09 16:47:38 -04:00
Linus Torvalds	fe21ea18c7	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: add retrieve request fuse: add store request fuse: don't use atomic kmap	2010-08-07 13:18:36 -07:00
Eric Paris	9cfcac810e	vfs: re-introduce MAY_CHDIR Currently MAY_ACCESS means that filesystems must check the permissions right then and not rely on cached results or the results of future operations on the object. This can be because of a call to sys_access() or because of a call to chdir() which needs to check search without relying on any future operations inside that dir. I plan to use MAY_ACCESS for other purposes in the security system, so I split the MAY_ACCESS and the MAY_CHDIR cases. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>	2010-08-02 15:35:06 +10:00
Miklos Szeredi	2d45ba381a	fuse: add retrieve request Userspace filesystem can request data to be retrieved from the inode's mapping. This request is synchronous and the retrieved data is queued as a new request. If the write to the fuse device returns an error then the retrieve request was not completed and a reply will not be sent. Only present pages are returned in the retrieve reply. Retrieving stops when it finds a non-present page and only data prior to that is returned. This request doesn't change the dirty state of pages. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-07-12 14:41:40 +02:00
Miklos Szeredi	a1d75f2582	fuse: add store request Userspace filesystem can request data to be stored in the inode's mapping. This request is synchronous and has no reply. If the write to the fuse device returns an error then the store request was not fully completed (but may have updated some pages). If the stored data overflows the current file size, then the size is extended, similarly to a write(2) on the filesystem. Pages which have been completely stored are marked uptodate. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-07-12 14:41:40 +02:00
Miklos Szeredi	7909b1c640	fuse: don't use atomic kmap Don't use atomic kmap for mapping userspace buffers in device read/write/splice. This is necessary because the next patch (adding store notify) requires that caller of fuse_copy_page() may sleep between invocations. The simplest way to ensure this is to change the atomic kmaps to non-atomic ones. Thankfully architectures where kmap() is not a no-op are going out of fashion, so we can ignore the (probably negligible) performance impact of this change. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-07-12 14:41:40 +02:00
Linus Torvalds	003386fff3	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: mm: export generic_pipe_buf_() to modules fuse: support splice() reading from fuse device fuse: allow splice to move pages mm: export remove_from_page_cache() to modules mm: export lru_cache_add_() to modules fuse: support splice() writing to fuse device fuse: get page reference for readpages fuse: use get_user_pages_fast() fuse: remove unneeded variable	2010-05-30 09:16:14 -07:00
Christoph Hellwig	7ea8085910	drop unused dentry argument to ->fsync Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-05-27 22:05:02 -04:00
Kay Sievers	578454ff7e	driver core: add devname module aliases to allow module on-demand auto-loading This adds: alias: devname:<name> to some common kernel modules, which will allow the on-demand loading of the kernel module when the device node is accessed. Ideally all these modules would be compiled-in, but distros seems too much in love with their modularization that we need to cover the common cases with this new facility. It will allow us to remove a bunch of pretty useless init scripts and modprobes from init scripts. The static device node aliases will be carried in the module itself. The program depmod will extract this information to a file in the module directory: $ cat /lib/modules/2.6.34-00650-g537b60d-dirty/modules.devname # Device nodes to trigger on-demand module loading. microcode cpu/microcode c10:184 fuse fuse c10:229 ppp_generic ppp c108:0 tun net/tun c10:200 dm_mod mapper/control c10:235 Udev will pick up the depmod created file on startup and create all the static device nodes which the kernel modules specify, so that these modules get automatically loaded when the device node is accessed: $ /sbin/udevd --debug ... static_dev_create_from_modules: mknod '/dev/cpu/microcode' c10:184 static_dev_create_from_modules: mknod '/dev/fuse' c10:229 static_dev_create_from_modules: mknod '/dev/ppp' c108:0 static_dev_create_from_modules: mknod '/dev/net/tun' c10:200 static_dev_create_from_modules: mknod '/dev/mapper/control' c10:235 udev_rules_apply_static_dev_perms: chmod '/dev/net/tun' 0666 udev_rules_apply_static_dev_perms: chmod '/dev/fuse' 0666 A few device nodes are switched to statically allocated numbers, to allow the static nodes to work. This might also useful for systems which still run a plain static /dev, which is completely unsafe to use with any dynamic minor numbers. Note: The devname aliases must be limited to the common and singleinstance* device nodes, like the misc devices, and never be used for conceptually limited systems like the loop devices, which should rather get fixed properly and get a control node for losetup to talk to, instead of creating a random number of device nodes in advance, regardless if they are ever used. This facility is to hide the mess distros are creating with too modualized kernels, and just to hide that these modules are not compiled-in, and not to paper-over broken concepts. Thanks! :) Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: David S. Miller <davem@davemloft.net> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Chris Mason <chris.mason@oracle.com> Cc: Alasdair G Kergon <agk@redhat.com> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Cc: Ian Kent <raven@themaw.net> Signed-Off-By: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-05-25 15:08:26 -07:00
Miklos Szeredi	c3021629a0	fuse: support splice() reading from fuse device Allow userspace filesystem implementation to use splice() to read from the fuse device. The userspace filesystem can now transfer data coming from a WRITE request to an arbitrary file descriptor (regular file, block device or socket) without having to go through a userspace buffer. The semantics of using splice() to read messages are: 1) with a single splice() call move the whole message from the fuse device to a temporary pipe 2) read the header from the pipe and determine the message type 3a) if message is a WRITE then splice data from pipe to destination 3b) else read rest of message to userspace buffer Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-05-25 15:06:07 +02:00
Miklos Szeredi	ce534fb052	fuse: allow splice to move pages When splicing buffers to the fuse device with SPLICE_F_MOVE, try to move pages from the pipe buffer into the page cache. This allows populating the fuse filesystem's cache without ever touching the page contents, i.e. zero copy read capability. The following steps are performed when trying to move a page into the page cache: - buf->ops->confirm() to make sure the new page is uptodate - buf->ops->steal() to try to remove the new page from it's previous place - remove_from_page_cache() on the old page - add_to_page_cache_locked() on the new page If any of the above steps fail (non fatally) then the code falls back to copying the page. In particular ->steal() will fail if there are external references (other than the page cache and the pipe buffer) to the page. Also since the remove_from_page_cache() + add_to_page_cache_locked() are non-atomic it is possible that the page cache is repopulated in between the two and add_to_page_cache_locked() will fail. This could be fixed by creating a new atomic replace_page_cache_page() function. fuse_readpages_end() needed to be reworked so it works even if page->mapping is NULL for some or all pages which can happen if the add_to_page_cache_locked() failed. A number of sanity checks were added to make sure the stolen pages don't have weird flags set, etc... These could be moved into generic splice/steal code. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-05-25 15:06:07 +02:00
Miklos Szeredi	dd3bb14f44	fuse: support splice() writing to fuse device Allow userspace filesystem implementation to use splice() to write to the fuse device. The semantics of using splice() are: 1) buffer the message header and data in a temporary pipe 2) with a single splice() call move the message from the temporary pipe to the fuse device The READ reply message has the most interesting use for this, since now the data from an arbitrary file descriptor (which could be a regular file, a block device or a socket) can be tranferred into the fuse device without having to go through a userspace buffer. It will also allow zero copy moving of pages. One caveat is that the protocol on the fuse device requires the length of the whole message to be written into the header. But the length of the data transferred into the temporary pipe may not be known in advance. The current library implementation works around this by using vmplice to write the header and modifying the header after splicing the data into the pipe (error handling omitted): struct fuse_out_header out; iov.iov_base = &out; iov.iov_len = sizeof(struct fuse_out_header); vmsplice(pip[1], &iov, 1, 0); len = splice(input_fd, input_offset, pip[1], NULL, len, 0); /* retrospectively modify the header: */ out.len = len + sizeof(struct fuse_out_header); splice(pip[0], NULL, fuse_chan_fd(req->ch), NULL, out.len, flags); This works since vmsplice only saves a pointer to the data, it does not copy the data itself. Since pipes are currently limited to 16 pages and messages need to be spliced atomically, the length of the data is limited to 15 pages (or 60kB for 4k pages). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-05-25 15:06:06 +02:00
Miklos Szeredi	b5dd328537	fuse: get page reference for readpages Acquire a page ref on pages in ->readpages() and release them when the read has finished. Not acquiring a reference didn't seem to cause any trouble since the page is locked and will not be kicked out of the page cache during the read. However the following patches will want to remove the page from the cache so a separate ref is needed. Making the reference in req->pages explicit also makes the code easier to understand. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-05-25 15:06:06 +02:00
Miklos Szeredi	1bf94ca73e	fuse: use get_user_pages_fast() Replace uses of get_user_pages() with get_user_pages_fast(). It looks nicer and should be faster in most cases. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-05-25 15:06:06 +02:00
Dan Carpenter	4aa0edd294	fuse: remove unneeded variable "map" isn't needed any more after: `0bd87182d3` "fuse: fix kunmap in fuse_ioctl_copy_user" Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-05-25 15:06:05 +02:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Jiri Kosina	318ae2edc3	Merge branch 'for-next' into for-linus Conflicts: Documentation/filesystems/proc.txt arch/arm/mach-u300/include/mach/debug-macro.S drivers/net/qlge/qlge_ethtool.c drivers/net/qlge/qlge_main.c drivers/net/typhoon.c	2010-03-08 16:55:37 +01:00
Linus Torvalds	60f8a8d4c6	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: fix large stack use fuse: cleanup in fuse_notify_inval_...()	2010-03-03 08:08:21 -08:00
Daniel Mack	3ad2f3fbb9	tree-wide: Assorted spelling fixes In particular, several occurances of funny versions of 'success', 'unknown', 'therefore', 'acknowledge', 'argument', 'achieve', 'address', 'beginning', 'desirable', 'separate' and 'necessary' are fixed. Signed-off-by: Daniel Mack <daniel@caiaq.de> Cc: Joe Perches <joe@perches.com> Cc: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-02-09 11:13:56 +01:00
Fang Wenqi	b2d82ee3c8	fuse: fix large stack use gcc 4.4 warns about: fs/fuse/dev.c: In function ‘fuse_notify_inval_entry’: fs/fuse/dev.c:925: warning: the frame size of 1060 bytes is larger than 1024 bytes The problem is we declare two structures and a large array on the stack, I move the array alway from the stack and allocate memory for it dynamically. Signed-off-by: Fang Wenqi <antonf@turbolinux.com.cn> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-02-05 12:08:31 +01:00
Miklos Szeredi	b21dda438b	fuse: cleanup in fuse_notify_inval_...() Small cleanup in fuse_notify_inval_inode() and fuse_notify_inval_entry(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2010-02-05 12:08:31 +01:00
anfei zhou	931e80e4b3	mm: flush dcache before writing into page to avoid alias The cache alias problem will happen if the changes of user shared mapping is not flushed before copying, then user and kernel mapping may be mapped into two different cache line, it is impossible to guarantee the coherence after iov_iter_copy_from_user_atomic. So the right steps should be: flush_dcache_page(page); kmap_atomic(page); write to page; kunmap_atomic(page); flush_dcache_page(page); More precisely, we might create two new APIs flush_dcache_user_page and flush_dcache_kern_page to replace the two flush_dcache_page accordingly. Here is a snippet tested on omap2430 with VIPT cache, and I think it is not ARM-specific: int val = 0x11111111; fd = open("abc", O_RDWR); addr = mmap(NULL, 4096, PROT_READ\|PROT_WRITE, MAP_SHARED, fd, 0); (addr+0) = 0x44444444; tmp = (addr+0); *(addr+1) = 0x77777777; write(fd, &val, sizeof(int)); close(fd); The results are not always 0x11111111 0x77777777 at the beginning as expected. Sometimes we see 0x44444444 0x77777777. Signed-off-by: Anfei <anfei.zhou@gmail.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: <linux-arch@vger.kernel.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-02-02 18:11:21 -08:00
Csaba Henk	1b7323965a	fuse: reject O_DIRECT flag also in fuse_create The comment in fuse_open about O_DIRECT: "VFS checks this, but only _after_ ->open()" also holds for fuse_create, however, the same kind of check was missing there. As an impact of this bug, open(newfile, O_RDWR\|O_CREAT\|O_DIRECT) fails, but a stub newfile will remain if the fuse server handled the implied FUSE_CREATE request appropriately. Other impact: in the above situation ima_file_free() will complain to open/free imbalance if CONFIG_IMA is set. Signed-off-by: Csaba Henk <csaba@gluster.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Harshavardhana <harsha@gluster.com> Cc: stable@kernel.org	2009-11-27 16:37:13 +01:00
Miklos Szeredi	5219f346b0	fuse: invalidate target of rename Invalidate the target's attributes, which may have changed (such as nlink, change time) so that they are refreshed on the next getattr(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-11-04 10:24:52 +01:00
Jens Axboe	0bd87182d3	fuse: fix kunmap in fuse_ioctl_copy_user Looks like another victim of the confusing kmap() vs kmap_atomic() API differences. Reported-by: Todor Gyumyushev <yodor1@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Tejun Heo <tj@kernel.org> Cc: stable@kernel.org	2009-11-04 10:24:51 +01:00
Anand V. Avati	f60311d5f7	fuse: prevent fuse_put_request on invalid pointer fuse_direct_io() has a loop where requests are allocated in each iteration. if allocation fails, the loop is broken out and follows into an unconditional fuse_put_request() on that invalid pointer. Signed-off-by: Anand V. Avati <avati@gluster.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: stable@kernel.org	2009-11-04 10:24:50 +01:00
Alexey Dobriyan	f0f37e2f77	const: mark struct vm_struct_operations * mark struct vm_area_struct::vm_ops as const * mark vm_ops in AGP code But leave TTM code alone, something is fishy there with global vm_ops being used. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-27 11:39:25 -07:00
npiggin@suse.de	c08d3b0e33	truncate: use new helpers Update some fs code to make use of new helper functions introduced in the previous patch. Should be no significant change in behaviour (except CIFS now calls send_sig under i_lock, via inode_newsize_ok). Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Miklos Szeredi <miklos@szeredi.hu> Cc: linux-nfs@vger.kernel.org Cc: Trond.Myklebust@netapp.com Cc: linux-cifs-client@lists.samba.org Cc: sfrench@samba.org Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-09-24 08:41:47 -04:00
Linus Torvalds	9eead2a811	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: add fusectl interface to max_background fuse: limit user-specified values of max background requests fuse: use drop_nlink() instead of direct nlink manipulation fuse: document protocol version negotiation fuse: make the number of max background requests and congestion threshold tunable	2009-09-18 09:23:03 -07:00
Jens Axboe	32a88aa1b6	fs: Assign bdi in super_block We do this automatically in get_sb_bdev() from the set_bdev_super() callback. Filesystems that have their own private backing_dev_info must assign that in ->fill_super(). Note that ->s_bdi assignment is required for proper writeback! Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-16 15:18:51 +02:00
Csaba Henk	79a9d99434	fuse: add fusectl interface to max_background Make the max_background and congestion_threshold parameters of a FUSE mount tunable at runtime by adding the respective knobs to its directory within the fusectl filesystem. Signed-off-by: Csaba Henk <csaba@gluster.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-09-16 14:15:29 +02:00
Csaba Henk	487ea5af63	fuse: limit user-specified values of max background requests An untrusted user could DoS the system if s/he were allowed to accumulate an arbitrary number of pending background requests by setting the above limits to extremely high values in INIT. This patch excludes this possibility by imposing global upper limits on the possible values of per-mount "max background requests" and "congestion threshold" parameters for unprivileged FUSE filesystems. These global limits are implemented as module parameters. Signed-off-by: Csaba Henk <csaba@gluster.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-09-16 14:15:29 +02:00
Csaba Henk	d6db07ded5	fuse: use drop_nlink() instead of direct nlink manipulation drop_nlink() is the API function to decrease the link count of an inode. However, at a place the control filesystem used the decrement operator on i_nlink directly. Fix this. Cc: Anand Avati <avati@gluster.com> Signed-off-by: Csaba Henk <csaba@gluster.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-09-16 14:15:28 +02:00
Jens Axboe	d993831fa7	writeback: add name to backing_dev_info This enables us to track who does what and print info. Its main use is catching dirty inodes on the default_backing_dev_info, so we can fix that up. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 09:20:26 +02:00
Linus Torvalds	81e4e1ba7e	Revert "fuse: Fix build error" as unnecessary This reverts commit `097041e576`. Trond had a better fix, which is the parent of this one ("Fix compile error due to congestion_wait() changes") Requested-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-07-11 11:22:34 -07:00
Larry Finger	097041e576	fuse: Fix build error When building v2.6.31-rc2-344-g69ca06c, the following build errors are found due to missing includes: CC [M] fs/fuse/dev.o fs/fuse/dev.c: In function ‘request_end’: fs/fuse/dev.c:289: error: ‘BLK_RW_SYNC’ undeclared (first use in this function) ... fs/nfs/write.c: In function ‘nfs_set_page_writeback’: fs/nfs/write.c:207: error: ‘BLK_RW_ASYNC’ undeclared (first use in this function) Signed-off-by: Larry Finger@lwfinger.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-07-10 19:09:46 -07:00
Jens Axboe	8aa7e847d8	Fix congestion_wait() sync/async vs read/write confusion Commit `1faa16d228` accidentally broke the bdi congestion wait queue logic, causing us to wait on congestion for WRITE (== 1) when we really wanted BLK_RW_ASYNC (== 0) instead. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-07-10 20:31:53 +02:00
Csaba Henk	7a6d3c8b30	fuse: make the number of max background requests and congestion threshold tunable The practical values for these limits depend on the design of the filesystem server so let userspace set them at initialization time. Signed-off-by: Csaba Henk <csaba@gluster.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-07-07 17:28:52 +02:00
John Muir	3b463ae0c6	fuse: invalidation reverse calls Add notification messages that allow the filesystem to invalidate VFS caches. Two notifications are added: 1) inode invalidation - invalidate cached attributes - invalidate a range of pages in the page cache (this is optional) 2) dentry invalidation - try to invalidate a subtree in the dentry cache Care must be taken while accessing the 'struct super_block' for the mount, as it can go away while an invalidation is in progress. To prevent this, introduce a rw-semaphore, that is taken for read during the invalidation and taken for write in the ->kill_sb callback. Cc: Csaba Henk <csaba@gluster.com> Cc: Anand Avati <avati@zresearch.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-06-30 20:12:24 +02:00
Miklos Szeredi	e0a43ddcc0	fuse: allow umask processing in userspace This patch lets filesystems handle masking the file mode on creation. This is needed if filesystem is using ACLs. - The CREATE, MKDIR and MKNOD requests are extended with a "umask" parameter. - A new FUSE_DONT_MASK flag is added to the INIT request/reply. With this the filesystem may request that the create mode is not masked. CC: Jean-Pierre André <jean-pierre.andre@wanadoo.fr> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-06-30 20:12:23 +02:00
Miklos Szeredi	201fa69a28	fuse: fix bad return value in fuse_file_poll() Fix fuse_file_poll() which returned a -errno value instead of a poll mask. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org	2009-06-30 20:06:24 +02:00
Csaba Henk	b4c458b3a2	fuse: fix return value of fuse_dev_write() On 64 bit systems -- where sizeof(ssize_t) > sizeof(int) -- the following test exposes a bug due to a non-careful return of an int or unsigned value: implement a FUSE filesystem which sends an unsolicited notification to the kernel with invalid opcode. The respective write to /dev/fuse will return (1 << 32) - EINVAL with errno == 0 instead of -1 with errno == EINVAL. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org	2009-06-30 20:06:23 +02:00
Al Viro	66c6af2e8b	fuse doesn't need BKL in ->umount_begin() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-17 00:36:36 -04:00
Linus Torvalds	c34752bc8b	Merge branch 'cuse' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'cuse' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: CUSE: implement CUSE - Character device in Userspace fuse: export symbols to be used by CUSE fuse: update fuse_conn_init() and separate out fuse_conn_kill() fuse: don't use inode in fuse_file_poll fuse: don't use inode in fuse_do_ioctl() helper fuse: don't use inode in fuse_sync_release() fuse: create fuse_do_open() helper for CUSE fuse: clean up args in fuse_finish_open() and fuse_release_fill() fuse: don't use inode in helpers called by fuse_direct_io() fuse: add members to struct fuse_file fuse: prepare fuse_direct_io() for CUSE fuse: clean up fuse_write_fill() fuse: use struct path in release structure fuse: misc cleanups	2009-06-12 09:31:20 -07:00
Tejun Heo	151060ac13	CUSE: implement CUSE - Character device in Userspace CUSE enables implementing character devices in userspace. With recent additions of ioctl and poll support, FUSE already has most of what's necessary to implement character devices. All CUSE has to do is bonding all those components - FUSE, chardev and the driver model - nicely. When client opens /dev/cuse, kernel starts conversation with CUSE_INIT. The client tells CUSE which device it wants to create. As the previous patch made fuse_file usable without associated fuse_inode, CUSE doesn't create super block or inodes. It attaches fuse_file to cdev file->private_data during open and set ff->fi to NULL. The rest of the operation is almost identical to FUSE direct IO case. Each CUSE device has a corresponding directory /sys/class/cuse/DEVNAME (which is symlink to /sys/devices/virtual/class/DEVNAME if SYSFS_DEPRECATED is turned off) which hosts "waiting" and "abort" among other things. Those two files have the same meaning as the FUSE control files. The only notable lacking feature compared to in-kernel implementation is mmap support. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-06-09 11:24:11 +02:00
Linus Torvalds	a6aeeebf51	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: destroy bdi on error	2009-05-13 16:32:57 -07:00
Alessio Igor Bogani	67e55205ec	vfs: umount_begin BKL pushdown Push BKL down into ->umount_begin() Signed-off-by: Alessio Igor Bogani <abogani@texware.it> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-05-09 10:49:38 -04:00
Tejun Heo	08cbf542bf	fuse: export symbols to be used by CUSE Export the following symbols for CUSE. fuse_conn_put() fuse_conn_get() fuse_conn_kill() fuse_send_init() fuse_do_open() fuse_sync_release() fuse_direct_io() fuse_do_ioctl() fuse_file_poll() fuse_request_alloc() fuse_get_req() fuse_put_request() fuse_request_send() fuse_abort_conn() fuse_dev_release() fuse_dev_operations Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:42 +02:00
Tejun Heo	a325f9b922	fuse: update fuse_conn_init() and separate out fuse_conn_kill() Update fuse_conn_init() such that it doesn't take @sb and move bdi registration into a separate function. Also separate out fuse_conn_kill() from fuse_put_super(). These will be used to implement cuse. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:41 +02:00
Miklos Szeredi	797759aaf3	fuse: don't use inode in fuse_file_poll Use ff->fc and ff->nodeid instead of file->f_dentry->d_inode in the fuse_file_poll() implementation. This prepares this function for use by CUSE, where the inode is not owned by a fuse filesystem. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:41 +02:00
Miklos Szeredi	d36f248710	fuse: don't use inode in fuse_do_ioctl() helper Create a helper for sending an IOCTL request that doesn't use a struct inode. This prepares this function for use by CUSE, where the inode is not owned by a fuse filesystem. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:39 +02:00
Miklos Szeredi	8b0797a498	fuse: don't use inode in fuse_sync_release() Make fuse_sync_release() a generic helper function that doesn't need a struct inode pointer. This makes it suitable for use by CUSE. Change return value of fuse_release_common() from int to void. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:39 +02:00
Miklos Szeredi	91fe96b403	fuse: create fuse_do_open() helper for CUSE Create a helper for sending an OPEN request that doesn't need a struct inode pointer. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:37 +02:00
Miklos Szeredi	c7b7143c63	fuse: clean up args in fuse_finish_open() and fuse_release_fill() Move setting ff->fh, ff->nodeid and file->private_data outside fuse_finish_open(). Add ->open_flags member to struct fuse_file. This simplifies the argument passing to fuse_finish_open() and fuse_release_fill(), and paves the way for creating an open helper that doesn't need an inode pointer. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:37 +02:00
Miklos Szeredi	2106cb1893	fuse: don't use inode in helpers called by fuse_direct_io() Use ff->fc and ff->nodeid instead of passing down the inode. This prepares this function for use by CUSE, where the inode is not owned by a fuse filesystem. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:37 +02:00
Miklos Szeredi	da5e471457	fuse: add members to struct fuse_file Add new members ->fc and ->nodeid to struct fuse_file. This will aid in converting functions for use by CUSE, where the inode is not owned by a fuse filesystem. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:36 +02:00
Miklos Szeredi	d09cb9d7f6	fuse: prepare fuse_direct_io() for CUSE Move code operating on the inode out from fuse_direct_io(). This prepares this function for use by CUSE, where the inode is not owned by a fuse filesystem. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:36 +02:00
Miklos Szeredi	2d698b0703	fuse: clean up fuse_write_fill() Move out code from fuse_write_fill() which is not common to all callers. Remove two function arguments which become unnecessary. Also remove unnecessary memset(), the request is already initialized to zero. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:36 +02:00
Miklos Szeredi	b0be46ebf7	fuse: use struct path in release structure Use struct path instead of separate dentry and vfsmount in req->misc.release. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:36 +02:00
Miklos Szeredi	fd9db72977	fuse: destroy bdi on error Destroy bdi on error in fuse_fill_super(). This was an omission from commit `26c3679101` "fuse: destroy bdi on umount", which moved the bdi_destroy() call from fuse_conn_put() to fuse_put_super(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org	2009-04-28 16:56:35 +02:00
Tejun Heo	6b2db28a7a	fuse: misc cleanups * fuse_file_alloc() was structured in weird way. The success path was split between else block and code following the block. Restructure the code such that it's easier to read and modify. * Unindent success path of fuse_release_common() to ease future changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-28 16:56:35 +02:00
Miklos Szeredi	3121bfe763	fuse: fix "direct_io" private mmap MAP_PRIVATE mmap could return stale data from the cache for "direct_io" files. Fix this by flushing the cache on mmap. Found with a slightly modified fsx-linux. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-09 17:37:53 +02:00
Miklos Szeredi	ce60a2f157	fuse: fix argument type in fuse_get_user_pages() Fix the following warning: fs/fuse/file.c: In function 'fuse_direct_io': fs/fuse/file.c:1002: warning: passing argument 3 of 'fuse_get_user_pages' from incompatible pointer type This was introduced by commit `f4975c67` "fuse: allow kernel to access "direct_io" files". Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-09 17:37:52 +02:00
Miklos Szeredi	fc280c9692	fuse: allow private mappings of "direct_io" files Allow MAP_PRIVATE mmaps of "direct_io" files. This is necessary for execute support. MAP_SHARED mappings require some sort of coherency between the underlying file and the mapping. With "direct_io" it is difficult to provide this, so for the moment just disallow shared (read-write and read-only) mappings altogether. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-02 14:25:35 +02:00
Miklos Szeredi	f4975c67dd	fuse: allow kernel to access "direct_io" files Allow the kernel read and write on "direct_io" files. This is necessary for nfs export and execute support. The implementation is simple: if an access from the kernel is detected, don't perform get_user_pages(), just use the kernel address provided by the requester to copy from/to the userspace filesystem. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-04-02 14:25:34 +02:00
Nick Piggin	c2ec175c39	mm: page_mkwrite change prototype to match fault Change the page_mkwrite prototype to take a struct vm_fault, and return VM_FAULT_xxx flags. There should be no functional change. This makes it possible to return much more detailed error information to the VM (and also can provide more information eg. virtual_address to the driver, which might be important in some special cases). This is required for a subsequent fix. And will also make it easier to merge page_mkwrite() with fault() in future. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Chris Mason <chris.mason@oracle.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <joel.becker@oracle.com> Cc: Artem Bityutskiy <dedekind@infradead.org> Cc: Felix Blyakher <felixb@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-04-01 08:59:14 -07:00
Dan Carpenter	5291658d87	fuse: fix fuse_file_lseek returning with lock held This bug was found with smatch (http://repo.or.cz/w/smatch.git/). If we return directly the inode->i_mutex lock doesn't get released. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org	2009-03-30 17:26:24 +02:00
Al Viro	4269590a72	constify dentry_operations: FUSE Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-03-27 14:44:01 -04:00
Linus Torvalds	a1c70a756f	Merge branch 'Kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/misc * 'Kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/misc: (36 commits) fs/Kconfig: move 9p out fs/Kconfig: move afs out fs/Kconfig: move coda out fs/Kconfig: move the rest of ncpfs out fs/Kconfig: move smbfs out fs/Kconfig: move sunrpc out fs/Kconfig: move nfsd out fs/Kconfig: move nfs out fs/Kconfig: move ufs out fs/Kconfig: move sysv out fs/Kconfig: move romfs out fs/Kconfig: move qnx4 out fs/Kconfig: move hpfs out fs/Kconfig: move omfs out fs/Kconfig: move minix out fs/Kconfig: move vxfs out fs/Kconfig: move squashfs out fs/Kconfig: move cramfs out fs/Kconfig: move efs out fs/Kconfig: move bfs out ...	2009-01-26 10:08:50 -08:00
Miklos Szeredi	f6d47a1761	fuse: fix poll notify Move fuse_copy_finish() to before calling fuse_notify_poll_wakeup(). This is not a big issue because fuse_notify_poll_wakeup() should be atomic, but it's cleaner this way, and later uses of notification will need to be able to finish the copying before performing some actions. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-01-26 15:00:59 +01:00
Miklos Szeredi	26c3679101	fuse: destroy bdi on umount If a fuse filesystem is unmounted but the device file descriptor remains open and a new mount reuses the old device number, then the mount fails with EEXIST and the following warning is printed in the kernel log: WARNING: at fs/sysfs/dir.c:462 sysfs_add_one+0x35/0x3d() sysfs: duplicate filename '0:15' can not be created The cause is that the bdi belonging to the fuse filesystem was destoryed only after the device file was released. Fix this by calling bdi_destroy() from fuse_put_super() instead. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org	2009-01-26 15:00:59 +01:00
Miklos Szeredi	c2b8f00690	fuse: fuse_fill_super error handling cleanup Clean up error handling for the whole of fuse_fill_super() function. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-01-26 15:00:58 +01:00
Miklos Szeredi	3ddf1e7f57	fuse: fix missing fput on error Fix the leaking file reference if allocation or initialization of fuse_conn failed. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org	2009-01-26 15:00:58 +01:00
Dan Carpenter	bb875b38dc	fuse: fix NULL deref in fuse_file_alloc() ff is set to NULL and then dereferenced on line 65. Compile tested only. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org	2009-01-26 15:00:58 +01:00
Alexey Dobriyan	3ef7784e47	fs/Kconfig: move fuse out Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2009-01-22 13:15:55 +03:00
Linus Torvalds	5fec8bdbf9	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: clean up annotations of fc->lock fuse: fix sparse warning in ioctl fuse: update interface version fuse: add fuse_conn->release() fuse: separate out fuse_conn_init() from new_conn() fuse: add fuse_ prefix to several functions fuse: implement poll support fuse: implement unsolicited notification fuse: add file kernel handle fuse: implement ioctl support fuse: don't let fuse_req->end() put the base reference fuse: move FUSE_MINOR to miscdevice.h fuse: style fixes	2009-01-06 17:01:20 -08:00
Nick Piggin	54566b2c15	fs: symlink write_begin allocation context fix With the write_begin/write_end aops, page_symlink was broken because it could no longer pass a GFP_NOFS type mask into the point where the allocations happened. They are done in write_begin, which would always assume that the filesystem can be entered from reclaim. This bug could cause filesystem deadlocks. The funny thing with having a gfp_t mask there is that it doesn't really allow the caller to arbitrarily tinker with the context in which it can be called. It couldn't ever be GFP_ATOMIC, for example, because it needs to take the page lock. The only thing any callers care about is __GFP_FS anyway, so turn that into a single flag. Add a new flag for write_begin, AOP_FLAG_NOFS. Filesystems can now act on this flag in their write_begin function. Change __grab_cache_page to accept a nofs argument as well, to honour that flag (while we're there, change the name to grab_cache_page_write_begin which is more instructive and does away with random leading underscores). This is really a more flexible way to go in the end anyway -- if a filesystem happens to want any extra allocations aside from the pagecache ones in ints write_begin function, it may now use GFP_KERNEL (rather than GFP_NOFS) for common case allocations (eg. ocfs2_alloc_write_ctxt, for a random example). [kosaki.motohiro@jp.fujitsu.com: fix ubifs] [kosaki.motohiro@jp.fujitsu.com: fix fuse] Signed-off-by: Nick Piggin <npiggin@suse.de> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: <stable@kernel.org> [2.6.28.x] Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ Cleaned up the calling convention: just pass in the AOP flags untouched to the grab_cache_page_write_begin() function. That just simplifies everybody, and may even allow future expansion of the logic. - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-04 13:33:20 -08:00
Harvey Harrison	5d9ec854bf	fuse: clean up annotations of fc->lock Makes the existing annotations match the more common one per line style and adds a few missing annotations. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-12-02 14:49:42 +01:00
Miklos Szeredi	c9f0523d88	fuse: fix sparse warning in ioctl Fix sparse warning: CHECK fs/fuse/file.c fs/fuse/file.c:1615:17: warning: incorrect type in assignment (different address spaces) fs/fuse/file.c:1615:17: expected void [noderef] <asn:1>iov_base fs/fuse/file.c:1615:17: got void <noident> This was introduced by "fuse: implement ioctl support". Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-12-02 14:49:42 +01:00
Tejun Heo	43901aabd7	fuse: add fuse_conn->release() Add fuse_conn->release() so that fuse_conn can be embedded in other structures. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-11-26 12:03:56 +01:00
Tejun Heo	0d179aa592	fuse: separate out fuse_conn_init() from new_conn() Separate out fuse_conn_init() from new_conn() and while at it initialize fuse_conn->entry during conn initialization. This will be used by CUSE. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-11-26 12:03:55 +01:00
Tejun Heo	b93f858ab2	fuse: add fuse_ prefix to several functions Add fuse_ prefix to request_send*() and get_root_inode() as some of those functions will be exported for CUSE. With or without CUSE export, having the function names scoped is a good idea for debuggability. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-11-26 12:03:55 +01:00
Tejun Heo	95668a69a4	fuse: implement poll support Implement poll support. Polled files are indexed using kh in a RB tree rooted at fuse_conn->polled_files. Client should send FUSE_NOTIFY_POLL notification once after processing FUSE_POLL which has FUSE_POLL_SCHEDULE_NOTIFY set. Sending notification unconditionally after the latest poll or everytime file content might have changed is inefficient but won't cause malfunction. fuse_file_poll() can sleep and requires patches from the following thread which allows f_op->poll() to sleep. http://thread.gmane.org/gmane.linux.kernel/726176 Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-11-26 12:03:55 +01:00
Tejun Heo	8599396b50	fuse: implement unsolicited notification Clients always used to write only in response to read requests. To implement poll efficiently, clients should be able to issue unsolicited notifications. This patch implements basic notification support. Zero fuse_out_header.unique is now accepted and considered unsolicited notification and the error field contains notification code. This patch doesn't implement any actual notification. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-11-26 12:03:55 +01:00
Tejun Heo	acf99433d9	fuse: add file kernel handle The file handle, fuse_file->fh, is opaque value supplied by userland FUSE server and uniqueness is not guaranteed. Add file kernel handle, fuse_file->kh, which is allocated by the kernel on file allocation and guaranteed to be unique. This will be used by poll to match notification to the respective file but can be used for other purposes where unique file handle is necessary. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-11-26 12:03:55 +01:00
Tejun Heo	59efec7b90	fuse: implement ioctl support Generic ioctl support is tricky to implement because only the ioctl implementation itself knows which memory regions need to be read and/or written. To support this, fuse client can request retry of ioctl specifying memory regions to read and write. Deep copying (nested pointers) can be implemented by retrying multiple times resolving one depth of dereference at a time. For security and cleanliness considerations, ioctl implementation has restricted mode where the kernel determines data transfer directions and sizes using the _IOC_*() macros on the ioctl command. In this mode, retry is not allowed. For all FUSE servers, restricted mode is enforced. Unrestricted ioctl will be used by CUSE. Plese read the comment on top of fs/fuse/file.c::fuse_file_do_ioctl() for more information. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-11-26 12:03:55 +01:00
Tejun Heo	e9bb09dd6c	fuse: don't let fuse_req->end() put the base reference fuse_req->end() was supposed to be put the base reference but there's no reason why it should. It only makes things more complex. Move it out of ->end() and make it the responsibility of request_end(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-11-26 12:03:54 +01:00
Miklos Szeredi	1729a16c2c	fuse: style fixes Fix coding style errors reported by checkpatch and others. Uptdate copyright date to 2008. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-11-26 12:03:54 +01:00
David Howells	c69e8d9c01	CRED: Use RCU to access another task's creds and to release a task's own creds Use RCU to access another task's creds and to release a task's own creds. This means that it will be possible for the credentials of a task to be replaced without another task (a) requiring a full lock to read them, and (b) seeing deallocated memory. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:19 +11:00
David Howells	b6dff3ec5e	CRED: Separate task security context from task_struct Separate the task security context from task_struct. At this point, the security data is temporarily embedded in the task_struct with two pointers pointing to it. Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in entry.S via asm-offsets. With comment fixes Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:16 +11:00
David Howells	2186a71cbc	CRED: Wrap task credential accesses in the FUSE filesystem Wrap access to task credentials so that they can be separated more easily from the task_struct during the introduction of COW creds. Change most current->(\|e\|s\|fs)[ug]id to current_(\|e\|s\|fs)[ug]id(). Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more sense to use RCU directly rather than a convenient wrapper; these will be addressed by later patches. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: fuse-devel@lists.sourceforge.net Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:38:53 +11:00
Al Viro	233e70f422	saner FASYNC handling on file close As it is, all instances of ->release() for files that have ->fasync() need to remember to evict file from fasync lists; forgetting that creates a hole and we actually have a bunch that does forget. So let's keep our lives simple - let __fput() check FASYNC in file->f_flags and call ->fasync() there if it's been set. And lose that crap in ->release() instances - leaving it there is still valid, but we don't have to bother anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-01 09:49:46 -07:00
Christoph Hellwig	440037287c	[PATCH] switch all filesystems over to d_obtain_alias Switch all users of d_alloc_anon to d_obtain_alias. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-10-23 05:13:01 -04:00
Tejun Heo	a7c1b990f7	fuse: implement nonseekable open Let the client request nonseekable open using FOPEN_NONSEEKABLE and call nonseekable_open() on the file if requested. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-10-16 16:08:57 +02:00
Tejun Heo	29d434b39c	fuse: add include protectors Add include protectors to include/linux/fuse.h and fs/fuse/fuse_i.h. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-10-16 16:08:57 +02:00
Julia Lawall	17e18ab6ff	fuse: add missing fuse_request_free The error handling code for the second call to fuse_request_alloc should include freeing the result of the first one. This bug was found by the Coccinelle project: http://www.emn.fr/x-info/coccinelle/ Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-10-16 16:08:56 +02:00
Miklos Szeredi	769415c611	fuse: fix SEEK_END incorrectness Update file size before using it in lseek(..., SEEK_END). Reported-by: Amnon Shiloh <u3557@miso.sublimeip.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-10-16 16:08:56 +02:00
Steven Whitehouse	a447c09324	vfs: Use const for kernel parser table This is a much better version of a previous patch to make the parser tables constant. Rather than changing the typedef, we put the "const" in all the various places where its required, allowing the __initconst exception for nfsroot which was the cause of the previous trouble. This was posted for review some time ago and I believe its been in -mm since then. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Alexander Viro <aviro@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-13 10:10:37 -07:00
Al Viro	a110343f0d	[PATCH] fix MAY_CHDIR/MAY_ACCESS/LOOKUP_ACCESS mess * MAY_CHDIR is redundant - it's an equivalent of MAY_ACCESS * MAY_ACCESS on fuse should affect only the last step of pathname resolution * fchdir() and chroot() should pass MAY_ACCESS, for the same reason why chdir() needs that. * now that we pass MAY_ACCESS explicitly in all cases, LOOKUP_ACCESS can be removed; it has no business being in nameidata. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:21 -04:00
Miklos Szeredi	2f1936b877	[patch 3/5] vfs: change remove_suid() to file_remove_suid() All calls to remove_suid() are made with a file pointer, because (similarly to file_update_time) it is called when the file is written. Clean up callers by passing in a file instead of a dentry. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2008-07-26 20:53:16 -04:00
Al Viro	e6305c43ed	[PATCH] sanitize ->permission() prototype * kill nameidata * argument; map the 3 bits in ->flags anybody cares about to new MAY_... ones and pass with the mask. * kill redundant gfs2_iop_permission() * sanitize ecryptfs_permission() * fix remaining places where ->permission() instances might barf on new MAY_... found in mask. The obvious next target in that direction is permission(9) folded fix for nfs_permission() breakage from Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:14 -04:00
Alexey Dobriyan	51cc50685a	SL*B: drop kmem cache argument from constructor Kmem cache passed to constructor is only needed for constructors that are themselves multiplexeres. Nobody uses this "feature", nor does anybody uses passed kmem cache in non-trivial way, so pass only pointer to object. Non-trivial places are: arch/powerpc/mm/init_64.c arch/powerpc/mm/hugetlbpage.c This is flag day, yes. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Christoph Lameter <cl@linux-foundation.org> Cc: Jon Tollefson <kniht@linux.vnet.ibm.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Matt Mackall <mpm@selenic.com> [akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c] [akpm@linux-foundation.org: fix mm/slab.c] [akpm@linux-foundation.org: fix ubifs] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:07 -07:00
Miklos Szeredi	48e90761b5	fuse: lockd support If fuse filesystem doesn't define it's own lock operations, then allow the lock manager to work with fuse. Adding lockd support for remote locking is also possible, but more rarely used, so leave it till later. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Matthew Wilcox <matthew@wil.cx> Cc: David Teigland <teigland@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:48 -07:00
Miklos Szeredi	33670fa296	fuse: nfs export special lookups Implement the get_parent export operation by sending a LOOKUP request with ".." as the name. Implement looking up an inode by node ID after it has been evicted from the cache. This is done by seding a LOOKUP request with "." as the name (for all file types, not just directories). The filesystem can set the FUSE_EXPORT_SUPPORT flag in the INIT reply, to indicate that it supports these special lookups. Thanks to John Muir for the original implementation of this feature. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Matthew Wilcox <matthew@wil.cx> Cc: David Teigland <teigland@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:48 -07:00
Miklos Szeredi	c180eebe13	fuse: add fuse_lookup_name() helper Add a new helper function which sends a LOOKUP request with the supplied name. This will be used by the next patch to send special LOOKUP requests with "." and ".." as the name. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:48 -07:00
Miklos Szeredi	dbd561d236	fuse: add export operations Implement export_operations, to allow fuse filesystems to be exported to NFS. This feature has been in the out-of-tree fuse module, and is widely used and tested. It has not been originally merged into mainline, because doing the NFS export in userspace was thought to be a cleaner and more efficient way of doing it, than through the kernel. While that is true, it would also have involved a lot of duplicated effort at reimplementing NFS exporting (all the different versions of the protocol). This effort was unfortunately not undertaken by anyone, so we are left with doing it the easy but less efficient way. If this feature goes in, the out-of-tree fuse module can go away, which would have several advantages: - not having to maintain two versions - less confusion for users - no bugs due to kernel API changes Comment from hch: - Use the same fh_type values as XFS, since we use the same fh encoding. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:48 -07:00
Miklos Szeredi	0de6256daa	fuse: prepare lookup for nfs export Use d_splice_alias() instead of d_add() in fuse lookup code, to allow NFS exporting. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 10:53:48 -07:00
Miklos Szeredi	f948d56435	fuse: fix thinko in max I/O size calucation Use max not min to enforce a lower limit on the max I/O size. This bug was introduced by "fuse: fix max i/o size calculation" (commit `e5d9a0df07`). Thanks to Brian Wang for noticing. Reported-by: Brian Wang <ywang221@hotmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Acked-by: Szabolcs Szakacsits <szaka@ntfs-3g.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-06-17 18:08:10 -07:00
Miklos Szeredi	03fb0bce01	fuse: fix bdi naming conflict Fuse allocates a separate bdi for each filesystem, and registers them in sysfs with "MAJOR:MINOR" of sb->s_dev (st_dev). This works fine for anon devices normally used by fuse, but can conflict with an already registered BDI for "fuseblk" filesystems, where sb->s_dev represents a real block device. In particularl this happens if a non-partitioned device is being mounted. Fix by registering with a different name for "fuseblk" filesystems. Thanks to Ioan Ionita for the bug report. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Reported-by: Ioan Ionita <opslynx@gmail.com> Tested-by: Ioan Ionita <opslynx@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-24 09:56:07 -07:00
Miklos Szeredi	78bb6cb9a8	fuse: add flag to turn on big writes Prior to 2.6.26 fuse only supported single page write requests. In theory all fuse filesystem should be able support bigger than 4k writes, as there's nothing in the API to prevent it. Unfortunately there's a known case in NTFS-3G where big writes cause filesystem corruption. There could also be other filesystems, where the lack of testing with big write requests would result in bugs. To prevent such problems on a kernel upgrade, disable big writes by default, but let filesystems set a flag to turn it on. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Szabolcs Szakacsits <szaka@ntfs-3g.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-13 08:02:26 -07:00
Harvey Harrison	bd7309677c	fuse: use clamp() rather than nested min/max clamp() exists for this use. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-01 08:04:02 -07:00
Miklos Szeredi	4dbf930ed6	fuse: fix sparse warnings fs/fuse/dev.c:306:2: warning: context imbalance in 'wait_answer_interruptible' - unexpected unlock fs/fuse/dev.c:361:2: warning: context imbalance in 'request_wait_answer' - unexpected unlock fs/fuse/dev.c:1002:4: warning: context imbalance in 'end_io_requests' - unexpected unlock Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:51 -07:00
Miklos Szeredi	5559b8f4d1	fuse: fix race in llseek Fuse doesn't use i_mutex to protect setting i_size, and so generic_file_llseek() can be racy: it doesn't use i_size_read(). So do a fuse specific llseek method, which does use i_size_read(). [akpm@linux-foundation.org: make `retval' loff_t] Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:51 -07:00
Miklos Szeredi	b48badf013	fuse: fix node ID type Node ID is 64bit but it is passed as unsigned long to some functions. This breakage wasn't noticed, because libfuse uses unsigned long too. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:51 -07:00
Miklos Szeredi	e5d9a0df07	fuse: fix max i/o size calculation Fix a bug that Werner Baumann reported: fuse can send a bigger write request than the maximum specified. This only affected direct_io operation. In addition set a sane minimum for the max_read and max_write tunables, so I/O always makes some progress. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:51 -07:00
Miklos Szeredi	5c5c5e51b2	fuse: update file size on short read If the READ request returned a short count, then either - cached size is incorrect - filesystem is buggy, as short reads are only allowed on EOF So assume that the size is wrong and refresh it, so that cached read() doesn't zero fill the missing chunk. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:50 -07:00
Nick Piggin	ea9b9907b8	fuse: implement perform_write Introduce fuse_perform_write. With fusexmp (a passthrough filesystem), large (1MB) writes into a backing tmpfs filesystem are sped up by almost 4 times (256MB/s vs 71MB/s). [mszeredi@suse.cz]: - split into smaller functions - testing - duplicate generic_file_aio_write(), so that there's no need to add a new ->perform_write() a_op. Comment from hch. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:50 -07:00
Miklos Szeredi	854512ec35	fuse: clean up setting i_size in write Extract common code for setting i_size in write functions into a common helper. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:50 -07:00
Miklos Szeredi	3be5a52b30	fuse: support writable mmap Quoting Linus (3 years ago, FUSE inclusion discussions): "User-space filesystems are hard to get right. I'd claim that they are almost impossible, unless you limit them somehow (shared writable mappings are the nastiest part - if you don't have those, you can reasonably limit your problems by limiting the number of dirty pages you accept through normal "write()" calls)." Instead of attempting the impossible, I've just waited for the dirty page accounting infrastructure to materialize (thanks to Peter Zijlstra and others). This nicely solved the biggest problem: limiting the number of pages used for write caching. Some small details remained, however, which this largish patch attempts to address. It provides a page writeback implementation for fuse, which is completely safe against VM related deadlocks. Performance may not be very good for certain usage patterns, but generally it should be acceptable. It has been tested extensively with fsx-linux and bash-shared-mapping. Fuse page writeback design -------------------------- fuse_writepage() allocates a new temporary page with GFP_NOFS\|__GFP_HIGHMEM. It copies the contents of the original page, and queues a WRITE request to the userspace filesystem using this temp page. The writeback is finished instantly from the MM's point of view: the page is removed from the radix trees, and the PageDirty and PageWriteback flags are cleared. For the duration of the actual write, the NR_WRITEBACK_TEMP counter is incremented. The per-bdi writeback count is not decremented until the actual write completes. On dirtying the page, fuse waits for a previous write to finish before proceeding. This makes sure, there can only be one temporary page used at a time for one cached page. This approach is wasteful in both memory and CPU bandwidth, so why is this complication needed? The basic problem is that there can be no guarantee about the time in which the userspace filesystem will complete a write. It may be buggy or even malicious, and fail to complete WRITE requests. We don't want unrelated parts of the system to grind to a halt in such cases. Also a filesystem may need additional resources (particularly memory) to complete a WRITE request. There's a great danger of a deadlock if that allocation may wait for the writepage to finish. Currently there are several cases where the kernel can block on page writeback: - allocation order is larger than PAGE_ALLOC_COSTLY_ORDER - page migration - throttle_vm_writeout (through NR_WRITEBACK) - sync(2) Of course in some cases (fsync, msync) we explicitly want to allow blocking. So for these cases new code has to be added to fuse, since the VM is not tracking writeback pages for us any more. As an extra safetly measure, the maximum dirty ratio allocated to a single fuse filesystem is set to 1% by default. This way one (or several) buggy or malicious fuse filesystems cannot slow down the rest of the system by hogging dirty memory. With appropriate privileges, this limit can be raised through '/sys/class/bdi/<bdi>/max_ratio'. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:50 -07:00
Miklos Szeredi	b6f2fcbcfc	mm: bdi: expose the BDI object in sysfs for FUSE Register FUSE's backing_dev_info under sysfs with the name "fuse-MAJOR:MINOR" Make the fuse control filesystem use s_dev instead of a fuse specific ID. This makes it easier to match directories under /sys/fs/fuse/connections/ with directories under /sys/class/bdi, and with actual mounts. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-30 08:29:49 -07:00
Al Viro	42faad9965	[PATCH] restore sane ->umount_begin() API Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-04-25 09:23:25 -04:00
Miklos Szeredi	1a823ac9ff	fuse: fix permission checking I added a nasty local variable shadowing bug to fuse in 2.6.24, with the result, that the 'default_permissions' mount option is basically ignored. How did this happen? - old err declaration in inner scope - new err getting declared in outer scope - 'return err' from inner scope getting removed - old declaration not being noticed -Wshadow would have saved us, but it doesn't seem practical for the kernel :( More testing would have also saved us :(( Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-23 17:12:13 -08:00
Miklos Szeredi	d1875dbaa5	mount options: fix fuse Add blksize= option to /proc/mounts for fuseblk filesystems. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-08 09:22:40 -08:00
David Howells	fa300b1914	iget: stop FUSE from using iget() and read_inode() Stop the FUSE filesystem from using read_inode(), which it doesn't use anyway. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:28 -08:00
David Howells	e231c2ee64	Convert ERR_PTR(PTR_ERR(p)) instances to ERR_CAST(p) Convert instances of ERR_PTR(PTR_ERR(p)) to ERR_CAST(p) using: perl -spi -e 's/ERR_PTR[(]PTR_ERR[(](.)[)][)]/ERR_CAST(\1)/' `grep -rl 'ERR_PTR[(]PTR_ERR' fs crypto net security` Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:26 -08:00
Miklos Szeredi	d12def1bcb	fuse: limit queued background requests Libfuse basically creates a new thread for each new request. This is fine for synchronous requests, which are naturally limited. However background requests (especially writepage) can cause a thread creation storm. To avoid this, limit the number of background requests available to userspace. This is done by introducing another queue for background requests, and a counter for the number of "active" requests, which are currently available for userspace. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-06 10:41:13 -08:00
Miklos Szeredi	b57d426445	fuse: save space in struct fuse_req Move the fields 'dentry' and 'vfsmount' into the request specific union, since these are only used for the RELEASE request. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-06 10:41:13 -08:00
Miklos Szeredi	0952b2a4a8	fuse: fix attribute caching after create Invalidate attributes on create, since st_ctime is updated. Reported by Szabolcs Szakacsits. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-06 10:41:13 -08:00
Greg Kroah-Hartman	197b12d679	Kobject: convert fs/* from kobject_unregister() to kobject_put() There is no need for kobject_unregister() anymore, thanks to Kay's kobject cleanup changes, so replace all instances of it with kobject_put(). Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:40 -08:00
Greg Kroah-Hartman	00d2666623	kobject: convert main fs kobject to use kobject_create This also renames fs_subsys to fs_kobj to catch all current users with a build error instead of a build warning which can easily be missed. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:13 -08:00
Greg Kroah-Hartman	5c89e17e9c	kobject: convert fuse to use kobject_create We don't need a kset here, a simple kobject will do just fine, so dynamically create the kobject and use it. Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:11 -08:00
Greg Kroah-Hartman	3514faca19	kobject: remove struct kobj_type from struct kset We don't need a "default" ktype for a kset. We should set this explicitly every time for each kset. This change is needed so that we can make ksets dynamic, and cleans up one of the odd, undocumented assumption that the kset/kobject/ktype model has. This patch is based on a lot of help from Kay Sievers. Nasty bug in the block code was found by Dave Young <hidave.darkstar@gmail.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:10 -08:00
Miklos Szeredi	08b633070a	fuse: fix attribute caching after rename Invalidate attributes on rename, since some filesystems may update st_ctime. Reported by Szabolcs Szakacsits Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
John Muir	fbee36b92a	fuse: fix uninitialized field in fuse_inode I found problems accessing (executing) previously existing files, until I did chmod on them (or setattr). If the fi->attr_version is not initialized, then it could be larger than fc->attr_version until a setattr is executed, and as a result the inode attributes would never be set. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
Miklos Szeredi	d0186b25e6	fuse: fix FUSE_FILE_OPS sending FUSE_FILE_OPS is meant to signal that the kernel will send the open file to to the userspace filesystem for operations on open files, so that sillyrenaming unlinked files becomes unnecessary. However this needs VFS changes, which won't make it into 2.6.24. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
Miklos Szeredi	a6643094e7	fuse: pass open flags to read and write Some open flags (O_APPEND, O_DIRECT) can be changed with fcntl(F_SETFL, ...) after open, but fuse currently only sends the flags to userspace in open. To make it possible to correcly handle changing flags, send the current value to userspace in each read and write. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
Miklos Szeredi	7dca9fd39f	fuse: cleanup: add fuse_get_attr_version() Extract repeated code into helper function, as suggested by Akpm. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
Miklos Szeredi	bcb4be809d	fuse: fix reading past EOF Currently reading a fuse file will stop at cached i_size and return EOF, even though the file might have grown since the attributes were last updated. So detect if trying to read past EOF, and refresh the attributes before continuing with the read. Thanks to mpb for the report. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
Adrian Bunk	8744969a81	fuse_file_alloc(): fix NULL dereferences Fix obvious NULL dereferences spotted by the Coverity checker. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:42 -08:00
Miklos Szeredi	0e9663ee45	fuse: add blksize field to fuse_attr There are cases when the filesystem will be passed the buffer from a single read or write call, namely: 1) in 'direct-io' mode (not O_DIRECT), read/write requests don't go through the page cache, but go directly to the userspace fs 2) currently buffered writes are done with single page requests, but if Nick's ->perform_write() patch goes it, it will be possible to do larger write requests. But only if the original write() was also bigger than a page. In these cases the filesystem might want to give a hint to the app about the optimal I/O size. Allow the userspace filesystem to supply a blksize value to be returned by stat() and friends. If the field is zero, it defaults to the old PAGE_CACHE_SIZE value. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Miklos Szeredi	f33321141b	fuse: add support for mandatory locking For mandatory locking the userspace filesystem needs to know the lock ownership for read, write and truncate operations. This patch adds the necessary fields to the protocol. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Miklos Szeredi	b25e82e567	fuse: add helper for asynchronous writes This patch adds a new helper function fuse_write_fill() which makes it possible to send WRITE requests asynchronously. A new flag for WRITE requests is also added which indicates that this a write from the page cache, and not a "normal" file write. This patch is in preparation for writable mmap support. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Miklos Szeredi	93a8c3cd9e	fuse: add list of writable files to fuse_inode Each WRITE request must carry a valid file descriptor. When a page is written back from a memory mapping, the file through which the page was dirtied is not available, so a new mechananism is needed to find a suitable file in ->writepage(s). A list of fuse_files is added to fuse_inode. The file is removed from the list in fuse_release(). This patch is in preparation for writable mmap support. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Miklos Szeredi	a9ff4f8705	fuse: support BSD locking semantics It is trivial to add support for flock(2) semantics to the existing protocol, by setting the lock owner field to the file pointer, and passing a new FUSE_LK_FLOCK flag with the locking request. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Miklos Szeredi	6ff958edbf	fuse: add atomic open+truncate support This patch allows fuse filesystems to implement open(..., O_TRUNC) as a single request, instead of separate truncate and open requests. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Miklos Szeredi	17637cbaba	fuse: improve utimes support Add two new flags for setattr: FATTR_ATIME_NOW and FATTR_MTIME_NOW. These mean, that atime or mtime should be changed to the current time. Also it is now possible to update atime or mtime individually, not just together. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:30 -07:00
Miklos Szeredi	49d4914fd7	fuse: clean up open file passing in setattr Clean up supplying open file to the setattr operation. In addition to being a cleanup it prepares for the changes in the way the open file is passed to the setattr method. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:30 -07:00
Miklos Szeredi	c79e322f63	fuse: add file handle to getattr operation Add necessary protocol changes for supplying a file handle with the getattr operation. Step the API version to 7.9. This patch doesn't actually supply the file handle, because that needs some kind of VFS support, which we haven't yet been able to agree upon. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:30 -07:00
Miklos Szeredi	1fb69e7817	fuse: fix race between getattr and write Getattr and lookup operations can be running in parallel to attribute changing operations, such as write and setattr. This means, that if for example getattr was slower than a write, the cached size attribute could be set to a stale value. To prevent this race, introduce a per-filesystem attribute version counter. This counter is incremented whenever cached attributes are modified, and the incremented value stored in the inode. Before storing new attributes in the cache, getattr and lookup check, using the version number, whether the attributes have been modified during the request's lifetime. If so, the returned attributes are not cached, because they might be stale. Thanks to Jakub Bogusz for the bug report and test program. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Jakub Bogusz <jakub.bogusz@gemius.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:30 -07:00
Miklos Szeredi	e57ac68378	fuse: fix allowing operations The following operation didn't check if sending the request was allowed: setattr listxattr statfs Some other operations don't explicitly do the check, but VFS calls ->permission() which checks this. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:29 -07:00
Miklos Szeredi	e8e961574b	fuse: clean up execute permission checking Define a new function fuse_refresh_attributes() that conditionally refreshes the attributes based on the validity timeout. In fuse_permission() only refresh the attributes for checking the execute bits if necessary. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:04 -07:00
Miklos Szeredi	c9c9d7df5f	fuse: no ENOENT from fuse device read Don't return -ENOENT for a read() on the fuse device when the request was aborted. Instead return -ENODEV, meaning the filesystem has been force-umounted or aborted. Previously ENOENT meant that the request was interrupted, but now the 'aborted' flag is not set in case of interrupts. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:04 -07:00
Miklos Szeredi	a131de0a48	fuse: no abort on interrupt Don't set 'aborted' flag on a request if it's interrupted. We have to wait for the answer anyway, and this would only a very little time while copying the reply. This means, that write() on the fuse device will not return -ENOENT during normal operation, only if the filesystem is aborted by a forced umount or through the fusectl interface. This could simplify userspace code somewhat when backward compatibility with earlier kernel versions is not required. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:04 -07:00
Miklos Szeredi	819c4b3b40	fuse: cleanup in release Move dput/mntput pair from request_end() to fuse_release_end(), because there's no other place they are used. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:04 -07:00
Miklos Szeredi	ebc14c4dbe	fuse: fix permission checking on sticky directories The VFS checks sticky bits on the parent directory even if the filesystem defines it's own ->permission(). In some situations (sshfs, mountlo, etc) the user does have permission to delete a file even if the attribute based checking would not allow it. So work around this by storing the permission bits separately and returning them in stat(), but cutting the permission bits off from inode->i_mode. This is slightly hackish, but it's probably not worth it to add new infrastructure in VFS and a slight performance penalty for all filesystems, just for the sake of fuse. [Jan Engelhardt] cosmetic fixes Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:04 -07:00
Miklos Szeredi	244f6385c2	fuse: refresh stale attributes in fuse_permission() fuse_permission() didn't refresh inode attributes before using them, even if the validity has already expired. Thanks to Junjiro Okajima for spotting this. Also remove some old code to unconditionally refresh the attributes on the root inode. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:04 -07:00
Miklos Szeredi	074406fa63	fuse: set i_nlink to sane value after mount Aufs seems to depend on a positive i_nlink value. So fill in a dummy but sane value for the root inode at mount time. The inode attributes are refreshed with the correct values at the first opportunity. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:04 -07:00
Miklos Szeredi	b10099792b	fuse: fix page invalidation Other than truncate, there are two cases, when fuse tries to get rid of cached pages: a) in open, if KEEP_CACHE flag is not set b) in getattr, if file size changed spontaneously Until now invalidate_mapping_pages() were used, which didn't get rid of mapped pages. This is wrong, and becomes more wrong as dirty pages are introduced. So instead properly invalidate all pages with invalidate_inode_pages2(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:03 -07:00
Miklos Szeredi	e00d2c2d4a	fuse: truncate on spontaneous size change Memory mappings were only truncated on an explicit truncate, but not when the file size was changed externally. Fix this by moving the truncation code from fuse_setattr to fuse_change_attributes. Yes, there are races between write and and external truncation, but we can't really do anything about them. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:03 -07:00
Miklos Szeredi	c756e0a4d7	fuse: add reference counting to fuse_file Make lifetime of 'struct fuse_file' independent from 'struct file' by adding a reference counter and destructor. This will enable asynchronous page writeback, where it cannot be guaranteed, that the file is not released while a request with this file handle is being served. The actual RELEASE request is only sent when there are no more references to the fuse_file. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:03 -07:00
Miklos Szeredi	de5e3dec42	fuse: fix reserved request wake up Use wake_up_all instead of wake_up in put_reserved_req(), otherwise it is possible that the right task is not woken up. Also create a separate reserved_req_waitq in addition to the blocked_waitq, since they fulfill totally separate functions. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:03 -07:00
Miklos Szeredi	f92b99b9dc	fuse: update backing_dev_info congestion state Set the read and write congestion state if the request queue is close to blocking, and clear it when it's not. This prevents unnecessary blocking in readahead and (when writable mmaps are allowed) writeback. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:03 -07:00
Christoph Lameter	4ba9b9d0ba	Slab API: remove useless ctor parameter and reorder parameters Slab constructors currently have a flags parameter that is never used. And the order of the arguments is opposite to other slab functions. The object pointer is placed before the kmem_cache pointer. Convert ctor(void object, struct kmem_cache s, unsigned long flags) to ctor(struct kmem_cache s, void object) throughout the kernel [akpm@linux-foundation.org: coupla fixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:45 -07:00
Peter Zijlstra	e0bf68ddec	mm: bdi init hooks provide BDI constructor/destructor hooks [akpm@linux-foundation.org: compile fix] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:45 -07:00
Nick Piggin	5e6f58a1d7	fuse: convert to new aops [mszeredi] - don't send zero length write requests - it is not legal for the filesystem to return with zero written bytes Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-16 09:42:57 -07:00
Paul Mundt	20c2df83d2	mm: Remove slab destructors from kmem_cache_create(). Slab destructors were no longer supported after Christoph's `c59def9f22` change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2007-07-20 10:11:58 +09:00
Jens Axboe	5ffc4ef45b	sendfile: remove .sendfile from filesystems that use generic_file_sendfile() They can use generic_file_splice_read() instead. Since sys_sendfile() now prefers that, there should be no change in behaviour. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:04:13 +02:00
Alexey Dobriyan	edad01e2a1	fuse: ->fs_flags fixlet fs/fuse/inode.c:658:3: error: Initializer entry defined twice fs/fuse/inode.c:661:3: also defined here Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-06-16 13:16:15 -07:00
Miklos Szeredi	ead5f0b5fa	fuse: delete inode on drop When inode is dropped (no more references) delete it from cache. There's not much point in keeping it cached, when a new lookup will refresh the attributes anyway. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-23 20:14:13 -07:00
Miklos Szeredi	889f784831	fuse: generic_write_checks() for direct_io This fixes O_APPEND in direct IO mode. Also checks writes against file size limits, notably rlimits. Reported by Greg Bruno. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-23 20:14:13 -07:00
Miklos Szeredi	b9ba347f27	fuse: fix mknod of regular file The wrong lookup flag was tested in ->create() causing havoc (error or Oops) when a regular file was created with mknod() in a fuse filesystem. Thanks to J. Cameijo Cerdeira for the report. Kernels 2.6.18 onward are affected. Please apply to -stable as well. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-23 20:14:11 -07:00
Alexey Dobriyan	e8edc6e03a	Detach sched.h from mm.h First thing mm.h does is including sched.h solely for can_do_mlock() inline function which has "current" dereference inside. By dealing with can_do_mlock() mm.h can be detached from sched.h which is good. See below, why. This patch a) removes unconditional inclusion of sched.h from mm.h b) makes can_do_mlock() normal function in mm/mlock.c c) exports can_do_mlock() to not break compilation d) adds sched.h inclusions back to files that were getting it indirectly. e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were getting them indirectly Net result is: a) mm.h users would get less code to open, read, preprocess, parse, ... if they don't need sched.h b) sched.h stops being dependency for significant number of files: on x86_64 allmodconfig touching sched.h results in recompile of 4083 files, after patch it's only 3744 (-8.3%). Cross-compile tested on all arm defconfigs, all mips defconfigs, all powerpc defconfigs, alpha alpha-up arm i386 i386-up i386-defconfig i386-allnoconfig ia64 ia64-up m68k mips parisc parisc-up powerpc powerpc-up s390 s390-up sparc sparc-up sparc64 sparc64-up um-x86_64 x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig as well as my two usual configs. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-21 09:18:19 -07:00
Christoph Lameter	a35afb830f	Remove SLAB_CTOR_CONSTRUCTOR SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: David Howells <dhowells@redhat.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Steven French <sfrench@us.ibm.com> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@ucw.cz> Cc: David Chinner <dgc@sgi.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-17 05:23:04 -07:00
Miklos Szeredi	79c0b2df79	add filesystem subtype support There's a slight problem with filesystem type representation in fuse based filesystems. From the kernel's view, there are just two filesystem types: fuse and fuseblk. From the user's view there are lots of different filesystem types. The user is not even much concerned if the filesystem is fuse based or not. So there's a conflict of interest in how this should be represented in fstab, mtab and /proc/mounts. The current scheme is to encode the real filesystem type in the mount source. So an sshfs mount looks like this: sshfs#user@server:/ /mnt/server fuse rw,nosuid,nodev,... This url-ish syntax works OK for sshfs and similar filesystems. However for block device based filesystems (ntfs-3g, zfs) it doesn't work, since the kernel expects the mount source to be a real device name. A possibly better scheme would be to encode the real type in the type field as "type.subtype". So fuse mounts would look like this: /dev/hda1 /mnt/windows fuseblk.ntfs-3g rw,... user@server:/ /mnt/server fuse.sshfs rw,nosuid,nodev,... This patch adds the necessary code to the kernel so that this can be correctly displayed in /proc/mounts. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:01 -07:00
Linus Torvalds	2d56d3c43c	Merge branch 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux * 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux: gfs2: nfs lock support for gfs2 lockd: add code to handle deferred lock requests lockd: always preallocate block in nlmsvc_lock() lockd: handle test_lock deferrals lockd: pass cookie in nlmsvc_testlock lockd: handle fl_grant callbacks lockd: save lock state on deferral locks: add fl_grant callback for asynchronous lock return nfsd4: Convert NFSv4 to new lock interface locks: add lock cancel command locks: allow {vfs,posix}_lock_file to return conflicting lock locks: factor out generic/filesystem switch from setlock code locks: factor out generic/filesystem switch from test_lock locks: give posix_test_lock same interface as ->lock locks: make ->lock release private data before returning in GETLK case locks: create posix-to-flock helper functions locks: trivial removal of unnecessary parentheses	2007-05-07 12:34:24 -07:00
Christoph Lameter	50953fe9e0	slab allocators: Remove SLAB_DEBUG_INITIAL flag I have never seen a use of SLAB_DEBUG_INITIAL. It is only supported by SLAB. I think its purpose was to have a callback after an object has been freed to verify that the state is the constructor state again? The callback is performed before each freeing of an object. I would think that it is much easier to check the object state manually before the free. That also places the check near the code object manipulation of the object. Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was compiled with SLAB debugging on. If there would be code in a constructor handling SLAB_DEBUG_INITIAL then it would have to be conditional on SLAB_DEBUG otherwise it would just be dead code. But there is no such code in the kernel. I think SLUB_DEBUG_INITIAL is too problematic to make real use of, difficult to understand and there are easier ways to accomplish the same effect (i.e. add debug code before kfree). There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be clear in fs inode caches. Remove the pointless checks (they would even be pointless without removeal of SLAB_DEBUG_INITIAL) from the fs constructors. This is the last slab flag that SLUB did not support. Remove the check for unimplemented flags from SLUB. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-07 12:12:57 -07:00
Marc Eshel	9d6a8c5c21	locks: give posix_test_lock same interface as ->lock posix_test_lock() and ->lock() do the same job but have gratuitously different interfaces. Modify posix_test_lock() so the two agree, simplifying some code in the process. Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>	2007-05-06 17:39:00 -04:00
Greg Kroah-Hartman	823bccfc40	remove "struct subsystem" as it is no longer needed We need to work on cleaning up the relationship between kobjects, ksets and ktypes. The removal of 'struct subsystem' is the first step of this, especially as it is not really needed at all. Thanks to Kay for fixing the bugs in this patch. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-05-02 18:57:59 -07:00
Timo Savola	a5bfffac64	[PATCH] fuse: validate rootmode mount option If rootmode isn't valid, we hit the BUG() in fuse_init_inode. Now EINVAL is returned. Signed-off-by: Timo Savola <tsavola@movial.fi> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-04-08 19:47:55 -07:00
Josef 'Jeff' Sipek	ee9b6d61a2	[PATCH] Mark struct super_operations const This patch is inspired by Arjan's "Patch series to mark struct file_operations and struct inode_operations const". Compile tested with gcc & sparse. Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:47 -08:00
Arjan van de Ven	754661f143	[PATCH] mark struct inode_operations const 1 Many struct inode_operations in the kernel can be "const". Marking them const moves these to the .rodata section, which avoids false sharing with potential dirty data. In addition it'll catch accidental writes at compile time to these shared resources. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:46 -08:00
Andrew Morton	fc0ecff698	[PATCH] remove invalidate_inode_pages() Convert all calls to invalidate_inode_pages() into open-coded calls to invalidate_mapping_pages(). Leave the invalidate_inode_pages() wrapper in place for now, marked as deprecated. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:31 -08:00
Miklos Szeredi	ff79544754	[PATCH] fuse: fix bug in control filesystem mount The BUG in fuse_ctl_add_dentry() could be triggered if the control filesystem was unmounted and mounted again while one or more fuse filesystems were present. The fix is to reset the dentry counter in fuse_ctl_kill_sb(). Bug reported by Florent Mertens. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-30 08:26:45 -08:00
Miklos Szeredi	9280f6822c	[PATCH] fuse: remove clear_page_dirty() call The use by FUSE was just a remnant of an optimization from the time when writable mappings were supported. Now FUSE never actually allows the creation of dirty pages, so this invocation of clear_page_dirty() is effectively a no-op. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-21 09:25:08 -08:00
Josef Sipek	7706a9d618	[PATCH] struct path: convert fuse Signed-off-by: Josef Sipek <jsipek@fsl.cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:45 -08:00
Miklos Szeredi	875d95ec9e	[PATCH] fuse: fix compile without CONFIG_BLOCK Randy Dunlap wote: > Should FUSE depend on BLOCK? Without that and with BLOCK=n, I get: > > inode.c:(.text+0x3acc5): undefined reference to `sb_set_blocksize' > inode.c:(.text+0x3a393): undefined reference to `get_sb_bdev' > fs/built-in.o:(.data+0xd718): undefined reference to `kill_block_super Most fuse filesystems work fine without block device support, so I think a better solution is to disable the 'fuseblk' filesystem type if BLOCK=n. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:32 -08:00
Miklos Szeredi	0ec7ca41f6	[PATCH] fuse: add DESTROY operation Add a DESTROY operation for block device based filesystems. With the help of this operation, such a filesystem can flush dirty data to the device synchronously before the umount returns. This is needed in situations where the filesystem is assumed to be clean immediately after unmount (e.g. ejecting removable media). Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:32 -08:00
Miklos Szeredi	b2d2272fae	[PATCH] fuse: add bmap support Add support for the BMAP operation for block device based filesystems. This is needed to support swap-files and lilo. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:32 -08:00
Miklos Szeredi	d809161402	[PATCH] fuse: add blksize option Add 'blksize' option for block device based filesystems. During initialization this is used to set the block size on the device and the super block. The default block size is 512bytes. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:31 -08:00
Miklos Szeredi	d6392f873f	[PATCH] fuse: add support for block device based filesystems I never intended this, but people started using fuse to implement block device based "real" filesystems (ntfs-3g, zfs). The following four patches add better support for these kinds of filesystems. Unlike "normal" fuse filesystems, using this feature should require superuser privileges (enforced by the fusermount utility). Thanks to Szabolcs Szakacsits for the input and testing. This patch adds a 'fuseblk' filesystem type, which is only different from the 'fuse' filesystem type in how the 'dev_name' mount argument is interpreted. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:31 -08:00
Miklos Szeredi	bdcf250804	[PATCH] fuse: minor cleanup in fuse_dentry_revalidate Remove unneeded code from fuse_dentry_revalidate(). This made some sense while the validity time could wrap around, but now it's a very obvious no-op. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:31 -08:00
Christoph Lameter	e18b890bb0	[PATCH] slab: remove kmem_cache_t Replace all uses of kmem_cache_t with struct kmem_cache. The patch was generated using the following script: #!/bin/sh # # Replace one string by another in all the kernel sources. # set -e for file in `find * -name ".c" -o -name ".h"\|xargs grep -l $1`; do quilt add $file sed -e "1,\$s/$1/$2/g" $file >/tmp/$$ mv /tmp/$$ $file quilt refresh done The script was run like this sh replace kmem_cache_t "struct kmem_cache" Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:25 -08:00
Christoph Lameter	e94b176609	[PATCH] slab: remove SLAB_KERNEL SLAB_KERNEL is an alias of GFP_KERNEL. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:24 -08:00
Miklos Szeredi	2d51013ed2	[PATCH] fuse: fix Oops in lookup Fix bug in certain error paths of lookup routines. The request object was reused for sending FORGET, which is illegal. This bug could cause an Oops in 2.6.18. In earlier versions it might silently corrupt memory, but this is very unlikely. These error paths are never triggered by libfuse, so this wasn't noticed even with the 2.6.18 kernel, only with a filesystem using the raw kernel interface. Thanks to Russ Cox for the bug report and test filesystem. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-11-25 13:28:33 -08:00
OGAWA Hirofumi	2e990021bf	[PATCH] fuse: ->readpages() cleanup This just ignore the remaining pages. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Steven French <sfrench@us.ibm.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-11-03 12:27:57 -08:00
Miklos Szeredi	e956edd052	[PATCH] fuse: fix dereferencing dentry parent There's no locking for ->d_revalidate, so fuse_dentry_revalidate() should use dget_parent() instead of simply dereferencing ->d_parent. Due to topology changes in the directory tree the parent could become negative or be destroyed while being used. There hasn't been any reports about this yet. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-17 08:18:45 -07:00
Miklos Szeredi	d2a85164aa	[PATCH] fuse: fix handling of moved directory Fuse considered it an error (EIO) if lookup returned a directory inode, to which a dentry already refered. This is because directory aliases are not allowed. But in a network filesystem this could happen legitimately, if a directory is moved on a remote client. This patch attempts to relax the restriction by trying to first evict the offending alias from the cache. If this fails, it still returns an error (EBUSY). A rarer situation is if an mkdir races with an indenpendent lookup, which finds the newly created directory already moved. In this situation the mkdir should return success, but that would be incorrect, since the dentry cannot be instantiated, so return EBUSY. Previously checking for a directory alias and instantiation of the dentry weren't done atomically in lookup/mkdir, hence two such calls racing with each other could create aliased directories. To prevent this introduce a new per-connection mutex: fuse_conn->inst_mutex, which is taken for instantiations with a directory inode. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-17 08:18:45 -07:00
Miklos Szeredi	265126ba9e	[PATCH] fuse: fix spurious BUG Fix a spurious BUG in an unlikely race, where at least three parallel lookups return the same inode, but with different file type. This has not yet been observed in real life. Allowing unlimited retries could delay fuse_iget() indefinitely, but this is really for the broken userspace filesystem to worry about. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-17 08:18:45 -07:00
Miklos Szeredi	8da5ff23ce	[PATCH] fuse: locking fix for nlookup An inode could be returned by independent parallel lookups, in this case an update of the lookup counter could be lost resulting in a memory leak in userspace. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-17 08:18:45 -07:00
Miklos Szeredi	9ffbb91623	[PATCH] fuse: fix hang on SMP Fuse didn't always call i_size_write() with i_mutex held which caused rare hangs on SMP/32bit. This bug has been present since fuse-2.2, well before being merged into mainline. The simplest solution is to protect i_size_write() with the per-connection spinlock. Using i_mutex for this purpose would require some restructuring of the code and I'm not even sure it's always safe to acquire i_mutex in all places i_size needs to be set. Since most of vmtruncate is already duplicated for other reasons, duplicate the remaining part as well, making all i_size_write() calls internal to fuse. Using i_size_write() was unnecessary in fuse_init_inode(), since this function is only called on a newly created locked inode. Reported by a few people over the years, but special thanks to Dana Henriksen who was persistent enough in helping me debug it. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-17 08:18:45 -07:00
Dave Hansen	ce71ec3684	[PATCH] r/o bind mounts: monitor zeroing of i_nlink Some filesystems, instead of simply decrementing i_nlink, simply zero it during an unlink operation. We need to catch these in addition to the decrement operations. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-01 00:39:30 -07:00
Dave Hansen	d8c76e6f45	[PATCH] r/o bind mount prepwork: inc_nlink() helper This is mostly included for parity with dec_nlink(), where we will have some more hooks. This one should stay pretty darn straightforward for now. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-01 00:39:30 -07:00
Badari Pulavarty	543ade1fc9	[PATCH] Streamline generic_file_* interfaces and filemap cleanups This patch cleans up generic_file__read/write() interfaces. Christoph Hellwig gave me the idea for this clean ups. In a nutshell, all filesystems should set .aio_read/.aio_write methods and use do_sync_read/ do_sync_write() as their .read/.write methods. This allows us to cleanup all variants of generic_file_ routines. Final available interfaces: generic_file_aio_read() - read handler generic_file_aio_write() - write handler generic_file_aio_write_nolock() - no lock write handler __generic_file_aio_write_nolock() - internal worker routine Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-01 00:39:28 -07:00
Badari Pulavarty	ee0b3e671b	[PATCH] Remove readv/writev methods and use aio_read/aio_write instead This patch removes readv() and writev() methods and replaces them with aio_read()/aio_write() methods. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-01 00:39:28 -07:00
Miklos Szeredi	650a898342	[PATCH] vfs: define new lookup flag for chdir In the "operation does permission checking" model used by fuse, chdir permission is not checked, since there's no chdir method. For this case set a lookup flag, which will be passed to ->permission(), so fuse can distinguish it from permission checks for other operations. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-29 09:18:08 -07:00
Miklos Szeredi	5b35e8e58a	[PATCH] fuse: use dentry in statfs Some filesystems may want to report different values depending on the path within the filesystem, i.e. one mount is actually several filesystems. This can be the case for a network filesystem exported by an unprivileged server (e.g. sshfs). This is now possible, thanks to David Howells "VFS: Permit filesystem to perform statfs with a known root dentry" patch. This change is backward compatible, so no need to change interface version. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-29 09:18:08 -07:00
Josh Triplett	105f4d7a81	[PATCH] fuse: add lock annotations to request_end and fuse_read_interrupt request_end and fuse_read_interrupt release fc->lock. Add lock annotations to these two functions so that sparse can check callers for lock pairing, and so that sparse will not complain about these functions since they intentionally use locks in this manner. Signed-off-by: Josh Triplett <josh@freedesktop.org> Acked-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-29 09:18:08 -07:00
Theodore Ts'o	ba52de123d	[PATCH] inode-diet: Eliminate i_blksize from the inode structure This eliminates the i_blksize field from struct inode. Filesystems that want to provide a per-inode st_blksize can do so by providing their own getattr routine instead of using the generic_fillattr() function. Note that some filesystems were providing pretty much random (and incorrect) values for i_blksize. [bunk@stusta.de: cleanup] [akpm@osdl.org: generic_fillattr() fix] Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-27 08:26:18 -07:00
Theodore Ts'o	8e18e2941c	[PATCH] inode_diet: Replace inode.u.generic_ip with inode.i_private The following patches reduce the size of the VFS inode structure by 28 bytes on a UP x86. (It would be more on an x86_64 system). This is a 10% reduction in the inode size on a UP kernel that is configured in a production mode (i.e., with no spinlock or other debugging functions enabled; if you want to save memory taken up by in-core inodes, the first thing you should do is disable the debugging options; they are responsible for a huge amount of bloat in the VFS inode structure). This patch: The filesystem or device-specific pointer in the inode is inside a union, which is pretty pointless given that all 30+ users of this field have been using the void pointer. Get rid of the union and rename it to i_private, with a comment to explain who is allowed to use the void pointer. This is just a cleanup, but it allows us to reuse the union 'u' for something something where the union will actually be used. [judith@osdl.org: powerpc build fix] Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Judith Lebzelter <judith@osdl.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-27 08:26:17 -07:00
Alexander Zarochentsev	1d7ea7324a	[PATCH] fuse: fix error case in fuse_readpages Don't let fuse_readpages leave the @pages list not empty when exiting on error. [akpm@osdl.org: kernel-doc fixes] Signed-off-by: Alexander Zarochentsev <zam@namesys.com> Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-08-14 12:54:29 -07:00
Miklos Szeredi	873302c71c	[PATCH] fuse: fix typo Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-31 13:28:43 -07:00
Miklos Szeredi	0a0898cf41	[PATCH] fuse: use jiffies_64 It is entirely possible (though rare) that jiffies half-wraps around, while a dentry/inode remains in the cache. This could mean that the dentry/inode is not invalidated for another half wraparound-time. To get around this problem, use 64-bit jiffies. The only problem with this is that dentry->d_time is 32 bits on 32-bit archs. So use d_fsdata as the high 32 bits. This is an ugly hack, but far simpler, than having to allocate private data just for this purpose. Since 64-bit jiffies can be assumed never to wrap around, simple comparison can be used, and a zero time value can represent "invalid". Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-31 13:28:43 -07:00
Miklos Szeredi	685d16ddb0	[PATCH] fuse: fix zero timeout An attribute and entry timeout of zero should mean, that the entity is invalidated immediately after the operation. Previously invalidation only happened at the next clock tick. Reported and tested by Craig Davies. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-31 13:28:43 -07:00
Christoph Hellwig	f5e54d6e53	[PATCH] mark address_space_operations const Same as with already do with the file operations: keep them in .rodata and prevents people from doing runtime patching. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Steven French <sfrench@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-28 14:59:04 -07:00
Linus Torvalds	1d77062b14	Merge git://git.linux-nfs.org/pub/linux/nfs-2.6 * git://git.linux-nfs.org/pub/linux/nfs-2.6: (51 commits) nfs: remove nfs_put_link() nfs-build-fix-99 git-nfs-build-fixes Merge branch 'odirect' NFS: alloc nfs_read/write_data as direct I/O is scheduled NFS: Eliminate nfs_get_user_pages() NFS: refactor nfs_direct_free_user_pages NFS: remove user_addr, user_count, and pos from nfs_direct_req NFS: "open code" the NFS direct write rescheduler NFS: Separate functions for counting outstanding NFS direct I/Os NLM: Fix reclaim races NLM: sem to mutex conversion locks.c: add the fl_owner to nlm_compare_locks NFS: Display the chosen RPCSEC_GSS security flavour in /proc/mounts NFS: Split fs/nfs/inode.c NFS: Fix typo in nfs_do_clone_mount() NFS: Fix compile errors introduced by referrals patches NFSv4: Ensure that referral mounts bind to a reserved port NFSv4: A root pathname is sent as a zero component4 NFSv4: Follow a referral ...	2006-06-25 10:54:14 -07:00
Miklos Szeredi	9c8ef5614d	[PATCH] fuse: scramble lock owner ID VFS uses current->files pointer as lock owner ID, and it wouldn't be prudent to expose this value to userspace. So scramble it with XTEA using a per connection random key, known only to the kernel. Only one direction needs to be implemented, since the ID is never sent in the reverse direction. The XTEA algorithm is implemented inline since it's simple enough to do so, and this adds less complexity than if the crypto API were used. Thanks to Jesper Juhl for the idea. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-25 10:01:20 -07:00
Miklos Szeredi	a4d27e75ff	[PATCH] fuse: add request interruption Add synchronous request interruption. This is needed for file locking operations which have to be interruptible. However filesystem may implement interruptibility of other operations (e.g. like NFS 'intr' mount option). Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-25 10:01:19 -07:00
Miklos Szeredi	f9a2842e56	[PATCH] fuse: rename the interrupted flag Rename the 'interrupted' flag to 'aborted', since it indicates exactly that, and next patch will introduce an 'interrupted' flag for a Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-25 10:01:19 -07:00
Miklos Szeredi	33649c91a3	[PATCH] fuse: ensure FLUSH reaches userspace All POSIX locks owned by the current task are removed on close(). If the FLUSH request resulting initiated by close() fails to reach userspace, there might be locks remaining, which cannot be removed. The only reason it could fail, is if allocating the request fails. In this case use the request reserved for RELEASE, or if that is currently used by another FLUSH, wait for it to become available. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-25 10:01:19 -07:00
Miklos Szeredi	7142125937	[PATCH] fuse: add POSIX file locking support This patch adds POSIX file locking support to the fuse interface. This implementation doesn't keep any locking state in kernel. Unlocking on close() is handled by the FLUSH message, which now contains the lock owner id. Mandatory locking is not supported. The filesystem may enfoce mandatory locking in userspace if needed. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-25 10:01:19 -07:00
Miklos Szeredi	bafa96541b	[PATCH] fuse: add control filesystem Add a control filesystem to fuse, replacing the attributes currently exported through sysfs. An empty directory '/sys/fs/fuse/connections' is still created in sysfs, and mounting the control filesystem here provides backward compatibility. Advantages of the control filesystem over the previous solution: - allows the object directory and the attributes to be owned by the filesystem owner, hence letting unpriviled users abort the filesystem connection - does not suffer from module unload race [akpm@osdl.org: fix this fs for recent dhowells depredations] [akpm@osdl.org: fix 64-bit printk warnings] Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-25 10:01:19 -07:00
Miklos Szeredi	51eb01e735	[PATCH] fuse: no backgrounding on interrupt Don't put requests into the background when a fatal interrupt occurs while the request is in userspace. This removes a major wart from the implementation. Backgrounding of requests was introduced to allow breaking of deadlocks. However now the same can be achieved by aborting the filesystem through the 'abort' sysfs attribute. This is a change in the interface, but should not cause problems, since these kinds of deadlocks never happen during normal operation. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-25 10:01:19 -07:00
Trond Myklebust	816724e65c	Merge branch 'master' of /home/trondmy/kernel/linux-2.6/ Conflicts: fs/nfs/inode.c fs/super.c Fix conflicts between patch 'NFS: Split fs/nfs/inode.c' and patch 'VFS: Permit filesystem to override root dentry on mount'	2006-06-24 13:07:53 -04:00
Miklos Szeredi	75e1fcc0b1	[PATCH] vfs: add lock owner argument to flush operation Pass the POSIX lock owner ID to the flush operation. This is useful for filesystems which don't want to store any locking state in inode->i_flock but want to handle locking/unlocking POSIX locks internally. FUSE is one such filesystem but I think it possible that some network filesystems would need this also. Also add a flag to indicate that a POSIX locking request was generated by close(), so filesystems using the above feature won't send an extra locking request in this case. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-23 07:43:02 -07:00
David Howells	726c334223	[PATCH] VFS: Permit filesystem to perform statfs with a known root dentry Give the statfs superblock operation a dentry pointer rather than a superblock pointer. This complements the get_sb() patch. That reduced the significance of sb->s_root, allowing NFS to place a fake root there. However, NFS does require a dentry to use as a target for the statfs operation. This permits the root in the vfsmount to be used instead. linux/mount.h has been added where necessary to make allyesconfig build successfully. Interest has also been expressed for use with the FUSE and XFS filesystems. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Nathan Scott <nathans@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-23 07:42:45 -07:00
David Howells	454e2398be	[PATCH] VFS: Permit filesystem to override root dentry on mount Extend the get_sb() filesystem operation to take an extra argument that permits the VFS to pass in the target vfsmount that defines the mountpoint. The filesystem is then required to manually set the superblock and root dentry pointers. For most filesystems, this should be done with simple_set_mnt() which will set the superblock pointer and then set the root dentry to the superblock's s_root (as per the old default behaviour). The get_sb() op now returns an integer as there's now no need to return the superblock pointer. This patch permits a superblock to be implicitly shared amongst several mount points, such as can be done with NFS to avoid potential inode aliasing. In such a case, simple_set_mnt() would not be called, and instead the mnt_root and mnt_sb would be set directly. The patch also makes the following changes: () the get_sb_() convenience functions in the core kernel now take a vfsmount pointer argument and return an integer, so most filesystems have to change very little. () If one of the convenience function is not used, then get_sb() should normally call simple_set_mnt() to instantiate the vfsmount. This will always return 0, and so can be tail-called from get_sb(). () generic_shutdown_super() now calls shrink_dcache_sb() to clean up the dcache upon superblock destruction rather than shrink_dcache_anon(). This is required because the superblock may now have multiple trees that aren't actually bound to s_root, but that still need to be cleaned up. The currently called functions assume that the whole tree is rooted at s_root, and that anonymous dentries are not the roots of trees which results in dentries being left unculled. However, with the way NFS superblock sharing are currently set to be implemented, these assumptions are violated: the root of the filesystem is simply a dummy dentry and inode (the real inode for '/' may well be inaccessible), and all the vfsmounts are rooted on anonymous[] dentries with child trees. [] Anonymous until discovered from another tree. () The documentation has been adjusted, including the additional bit of changing ext2_ into foo_* in the documentation. [akpm@osdl.org: convert ipath_fs, do other stuff] Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Nathan Scott <nathans@sgi.com> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-23 07:42:45 -07:00
Trond Myklebust	8b512d9a88	VFS: Remove dependency of ->umount_begin() call on MNT_FORCE Allow filesystems to decide to perform pre-umount processing whether or not MNT_FORCE is set. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:18 -04:00
Miklos Szeredi	8aa09a50b5	[fuse] fix race between checking and setting file->private_data BKL does not protect against races if the task may sleep between checking and setting a value. So move checking of file->private_data near to setting it in fuse_fill_super(). Found by Al Viro. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>	2006-04-26 10:49:16 +02:00
Miklos Szeredi	6dbbcb1205	[fuse] fix deadlock between fuse_put_super() and request_end(), try #2 A deadlock was possible, when the last reference to the superblock was held due to a background request containing a file reference. Releasing the file would release the vfsmount which in turn would release the superblock. Since sbput_sem is held during the fput() and fuse_put_super() tries to acquire this same semaphore, a deadlock results. The solution is to move the fput() outside the region protected by sbput_sem. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>	2006-04-26 10:49:06 +02:00
Miklos Szeredi	5a5fb1ea74	Revert "[fuse] fix deadlock between fuse_put_super() and request_end()" This reverts `73ce8355c2` commit. It was wrong, because it didn't take into account the requirement, that iput() for background requests must be performed synchronously with ->put_super(), otherwise active inodes may remain after unmount. The right solution is to keep the sbput_sem and perform iput() within the locked region, but move fput() outside sbput_sem. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>	2006-04-26 10:48:55 +02:00
Miklos Szeredi	56cf34ff07	[fuse] Direct I/O should not use fuse_reset_request It's cleaner to allocate a new request, otherwise the uid/gid/pid fields of the request won't be filled in. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>	2006-04-11 21:16:51 +02:00
Miklos Szeredi	4858cae4f0	[fuse] Don't init request twice Request is already initialized in fuse_request_alloc() so no need to do it again in fuse_get_req(). Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>	2006-04-11 21:16:38 +02:00
Miklos Szeredi	9bc5dddad1	[fuse] Fix accounting the number of waiting requests Properly accounting the number of waiting requests was forgotten in "clean up request accounting" patch. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>	2006-04-11 21:16:09 +02:00
Miklos Szeredi	73ce8355c2	[fuse] fix deadlock between fuse_put_super() and request_end() A deadlock was possible, when the last reference to the superblock was held due to a background request containing a file reference. Releasing the file would release the vfsmount which in turn would release the superblock. Since sbput_sem is held during the fput() and fuse_put_super() tries to acquire this same semaphore, a deadlock results. The chosen soltuion is to get rid of sbput_sem, and instead use the spinlock to ensure the referenced inodes/file are released only once. Since the actual release may sleep, defer these outside the locked region, but using local variables instead of the structure members. This is a much more rubust solution. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>	2006-04-11 21:14:26 +02:00
Miklos Szeredi	08a53cdce6	[PATCH] fuse: account background requests The previous patch removed limiting the number of outstanding requests. This patch adds a much simpler limiting, that is also compatible with file locking operations. A task may have at most one synchronous request allocated. So these requests need not be otherwise limited. However the number of background requests (release, forget, asynchronous reads, interrupted requests) can grow indefinitely. This can be used by a malicous user to cause FUSE to allocate arbitrary amounts of unswappable kernel memory, denying service. For this reason add a limit for the number of background requests, and block allocations of new requests until the number goes bellow the limit. Also use this mechanism to block all requests until the INIT reply is received. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:49 -07:00
Miklos Szeredi	ce1d5a491f	[PATCH] fuse: clean up request accounting FUSE allocated most requests from a fixed size pool filled at mount time. However in some cases (release/forget) non-pool requests were used. File locking operations aren't well served by the request pool, since they may block indefinetly thus exhausting the pool. This patch removes the request pool and always allocates requests on demand. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:49 -07:00
Miklos Szeredi	a87046d822	[PATCH] fuse: consolidate device errors Return consistent error values for the case when the opened device file has no mount associated yet. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:48 -07:00
Miklos Szeredi	d713311464	[PATCH] fuse: use a per-mount spinlock Remove the global spinlock in favor of a per-mount one. This patch is basically find & replace. The difficult part has already been done by the previous patch. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:48 -07:00
Miklos Szeredi	0720b31597	[PATCH] fuse: simplify locking This is in preparation for removing the global spinlock in favor of a per-mount one. The only critical part is the interaction between fuse_dev_release() and fuse_fill_super(): fuse_dev_release() must see the assignment to file->private_data, otherwise it will leak the reference to fuse_conn. This is ensured by the fput() operation, which will synchronize the assignment with other CPU's that may do a final fput() soon after this. Also redundant locking is removed from fuse_fill_super(), where exclusion is already ensured by the BKL held for this function by the VFS. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:48 -07:00
Jeff Dike	e5ac1d1e70	[PATCH] fuse: add O_NONBLOCK support to FUSE device I don't like duplicating the connected and list_empty tests in fuse_dev_readv, but this seemed cleaner than adding the f_flags test to request_wait. Signed-off-by: Jeff Dike <jdike@addtoit.com> Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:48 -07:00
Jeff Dike	385a17bfc3	[PATCH] fuse: add O_ASYNC support to FUSE device This adds asynchronous notification to FUSE - a FUSE server can request O_ASYNC on a /dev/fuse file descriptor and receive SIGIO when there is input available. One subtlety - fuse_dev_fasync, which is called when O_ASYNC is requested, does no locking, unlink the other methods. I think it's unnecessary, as the fuse_conn.fasync list is manipulated only by fasync_helper and kill_fasync, which provide their own locking. It would also be wrong to use the fuse_lock, as it's a spin lock and fasync_helper can sleep. My one concern with this is the fuse_conn going away underneath fuse_dev_fasync - sys_fcntl takes a reference on the file struct, so this seems not to be a problem. Signed-off-by: Jeff Dike <jdike@addtoit.com> Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:48 -07:00
Miklos Szeredi	7025d9ad10	[PATCH] fuse: fix fuse_dev_poll() return value fuse_dev_poll() returned an error value instead of a poll mask. Luckily (or unluckily) -ENODEV does contain the POLLERR bit. There's also a race if filesystem is unmounted between fuse_get_conn() and spin_lock(), in which case this event will be missed by poll(). Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:47 -07:00
Miklos Szeredi	d3406ffa4a	[PATCH] fuse: fix oops in fuse_send_readpages() During heavy parallel filesystem activity it was possible to Oops the kernel. The reason is that read_cache_pages() could skip pages which have already been inserted into the cache by another task. Occasionally this may result in zero pages actually being sent, while fuse_send_readpages() relies on at least one page being in the request. So check this corner case and just free the request instead of trying to send it. Reported and tested by Konstantin Isakov. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:47 -07:00
Arjan van de Ven	4b6f5d20b0	[PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-28 09:16:06 -08:00
Miklos Szeredi	50322fe7d4	[PATCH] fuse: fix bug in negative lookup If negative entries (nodeid == 0) were sent in reply to LOOKUP requests, two bugs could be triggered: - looking up a negative entry would return -EIO, - revaildate on an entry which turned negative would send a FORGET request with zero nodeid, which would cause an abort() in the library. The above would only happen if the 'negative_timeout=N' option was used, otherwise lookups reply -ENOENT, which worked correctly. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-02-28 20:53:43 -08:00
Miklos Szeredi	77e7f250f8	[PATCH] fuse: fix bug in aborted fuse_release_end() There's a rather theoretical case of the BUG triggering in fuse_reset_request(): - iget() fails because of OOM after a successful CREATE_OPEN request - during IO on the resulting RELEASE request the connection is aborted Fix and add warning to fuse_reset_request(). Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-02-17 13:59:27 -08:00
Miklos Szeredi	7128ec2a74	[PATCH] fuse: fix request_end() vs fuse_reset_request() race The last fix for this function in fact opened up a much more often triggering race. It was uncommented tricky code, that was buggy. Add comment, make it less tricky and fix bug. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-02-05 11:06:51 -08:00
Miklos Szeredi	9cd6845511	[PATCH] fuse: fix async read for legacy filesystems While asynchronous reads mean a performance improvement in most cases, if the filesystem assumed that reads are synchronous, then async reads may degrade performance (filesystem may receive reads out of order, which can confuse it's own readahead logic). With sshfs a 1.5 to 4 times slowdown can be measured. There's also a need for userspace filesystems to know whether asynchronous reads are supported by the kernel or not. To achive these, negotiate in the INIT request whether async reads will be used and the maximum readahead value. Update interface version to 7.6 If userspace uses a version earlier than 7.6, then disable async reads, and set maximum readahead value to the maximum read size, as done in previous versions. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-02-01 08:53:09 -08:00
Miklos Szeredi	095da6cbb6	[PATCH] fuse: fix bitfield race Fix race in setting bitfields of fuse_conn. Spotted by Andrew Morton. The two fields ->connected and ->mounted were always changed with the fuse_lock held. But other bitfields in the same structure were changed without the lock. In theory this could lead to losing the assignment of even the ones under lock. The chosen solution is to change these two fields to be a full unsigned type. The other bitfields aren't "important" enough to warrant the extra complexity of full locking or changing them to bitops. For all bitfields document why they are safe wrt. concurrent assignments. Also make the initialization of the 'num_waiting' atomic counter explicit. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-16 23:15:31 -08:00
Miklos Szeredi	c1aa96a52e	[PATCH] fuse: use asynchronous READ requests for readpages This patch changes fuse_readpages() to send READ requests asynchronously. This makes it possible for userspace filesystems to utilize the kernel readahead logic instead of having to implement their own (resulting in double caching). Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-16 23:15:31 -08:00
Miklos Szeredi	361b1eb55e	[PATCH] fuse: READ request initialization Add a separate function for filling in the READ request. This will make it possible to send asynchronous READ requests as well as synchronous ones. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-16 23:15:31 -08:00
Miklos Szeredi	9b9a04693f	[PATCH] fuse: move INIT handling to inode.c Now the INIT requests can be completely handled in inode.c and the fuse_send_init() function need not be global any more. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-16 23:15:31 -08:00
Miklos Szeredi	64c6d8ed4c	[PATCH] fuse: add asynchronous request support Add possibility for requests to run asynchronously and call an 'end' callback when finished. With this, the special handling of the INIT and RELEASE requests can be cleaned up too. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-16 23:15:31 -08:00
Miklos Szeredi	69a53bf267	[PATCH] fuse: add connection aborting Add ability to abort a filesystem connection. With the introduction of asynchronous reads, the ability to interrupt any request is not enough to dissolve deadlocks, since now waiting for the request completion (page unlocked) is independent of the actual request, so in a deadlock all threads will be uninterruptible. The solution is to make it possible to abort all requests, even those currently undergoing I/O to/from userspace. The natural interface for this is 'mount -f mountpoint', but that only works as long as the filesystem is attached. So also add an 'abort' attribute to the sysfs view of the connection. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-16 23:15:30 -08:00
Miklos Szeredi	0cd5b88553	[PATCH] fuse: add number of waiting requests attribute This patch adds the 'waiting' attribute which indicates how many filesystem requests are currently waiting to be completed. A non-zero value without any filesystem activity indicates a hung or deadlocked filesystem. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-16 23:15:30 -08:00
Miklos Szeredi	f543f253f3	[PATCH] fuse: make fuse connection a kobject Kobjectify fuse_conn, and make it visible under /sys/fs/fuse/connections. Lacking any natural naming, connections are numbered. This patch doesn't add any attributes, just the infrastructure. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-16 23:15:30 -08:00
Miklos Szeredi	9ba7cbba10	[PATCH] fuse: extend semantics of connected flag The ->connected flag for a fuse_conn object previously only indicated whether the device file for this connection is currently open or not. Change it's meaning so that it indicates whether the connection is active or not: now either umount or device release will clear the flag. The separate ->mounted flag is still needed for handling background requests. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-16 23:15:30 -08:00
Miklos Szeredi	d77a1d5b61	[PATCH] fuse: introduce list for requests under I/O Create a new list for requests in the process of being transfered to/from userspace. This will be needed to be able to abort all requests even those currently under I/O Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-16 23:15:30 -08:00
Miklos Szeredi	83cfd49351	[PATCH] fuse: introduce unified request state The state of request was made up of 2 bitfields (->sent and ->finished) and of the fact that the request was on a list or not. Unify this into a single state field. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-16 23:15:30 -08:00
Miklos Szeredi	6383bdaa2e	[PATCH] fuse: miscellaneous cleanup - remove some unneeded assignments - use kzalloc instead of kmalloc + memset - simplify setting sb->s_fs_info - in fuse_send_init() use fuse_get_request() instead of do_get_request() helper Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-16 23:15:30 -08:00
Miklos Szeredi	8bfc016d2e	[PATCH] fuse: uninline some functions Inline keyword is unnecessary in most cases. Clean them up. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-16 23:15:29 -08:00
Miklos Szeredi	b3bebd94bb	[PATCH] fuse: handle error INIT reply Handle the case when the INIT request is answered with an error. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-16 23:15:29 -08:00
Miklos Szeredi	f43b155a5a	[PATCH] fuse: fix request_end() This function used the request object after decrementing its reference count and releasing the lock. This could in theory lead to all sorts of problems. Fix and simplify at the same time. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-16 23:15:29 -08:00
Miklos Szeredi	222f1d6918	[PATCH] fuse: fuse_copy_finish() order fix fuse_copy_finish() must be called before request_end(), since the later might sleep, and no sleeping is allowed between fuse_copy_one() and fuse_copy_finish() because of kmap_atomic()/kunmap_atomic() used in them. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-16 23:15:29 -08:00
Jes Sorensen	1b1dcc1b57	[PATCH] mutex subsystem, semaphore to mutex: VFS, ->i_sem This patch converts the inode semaphore to a mutex. I have tested it on XFS and compiled as much as one can consider on an ia64. Anyway your luck with it might be different. Modified-by: Ingo Molnar <mingo@elte.hu> (finished the conversion) Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2006-01-09 15:59:24 -08:00
Miklos Szeredi	39ee059aff	[PATCH] fuse: check file type in lookup Previously invalid types were quietly changed to regular files, but at revalidation the inode was changed to bad. This was rather inconsistent behavior. Now check if the type is valid on initial lookup, and return -EIO if not. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:56 -08:00
Miklos Szeredi	6ad84acab9	[PATCH] fuse: ensure progress in read and write In direct_io mode, send at least one page per reqest. Previously it was possible that reqests with zero data were sent, and hence the read/write didn't make any progress, resulting in an infinite (though interruptible) loop. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:56 -08:00
Miklos Szeredi	3ec870d524	[PATCH] fuse: make maximum write data configurable Make the maximum size of write data configurable by the filesystem. The previous fixed 4096 limit only worked on architectures where the page size is less or equal to this. This change make writing work on other architectures too, and also lets the filesystem receive bigger write requests in direct_io mode. Normal writes which go through the page cache are still limited to a page sized chunk per request. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:56 -08:00
Miklos Szeredi	1d3d752b47	[PATCH] fuse: clean up request size limit checking Change the way a too large request is handled. Until now in this case the device read returned -EINVAL and the operation returned -EIO. Make it more flexibible by not returning -EINVAL from the read, but restarting it instead. Also remove the fixed limit on setxattr data and let the filesystem provide as large a read buffer as it needs to handle the extended attribute data. The symbolic link length is already checked by VFS to be less than PATH_MAX, so the extra check against FUSE_SYMLINK_MAX is not needed. The check in fuse_create_open() against FUSE_NAME_MAX is not needed, since the dentry has already been looked up, and hence the name already checked. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:56 -08:00
Miklos Szeredi	248d86e87d	[PATCH] fuse: fail file operations on bad inode Make file operations on a bad inode fail. This just makes things a bit more consistent. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:55 -08:00
Miklos Szeredi	6f9f11806a	[PATCH] fuse: add code documentation Document some not-so-trivial functions. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:55 -08:00
Miklos Szeredi	8cbdf1e6f6	[PATCH] fuse: support caching negative dentries Add support for caching negative dentries. Up till now, ->d_revalidate() always forced a new lookup on these. Now let the lookup method return a zero node ID (not used for anything else) meaning a negative entry, but with a positive cache timeout. The old way of signaling negative entry (replying ENOENT) still works. Userspace should check the ABI minor version to see whether sending a zero ID is allowed by the kernel or not. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:55 -08:00
Miklos Szeredi	de5f120255	[PATCH] fuse: add frsize to statfs reply Add 'frsize' member to the statfs reply. I'm not sure if sending f_fsid will ever be needed, but just in case leave some space at the end of the structure, so less compatibility mess would be required. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:55 -08:00
Miklos Szeredi	45714d6561	[PATCH] fuse: bump interface version Change interface version to 7.4. Following changes will need backward compatibility support, so store the minor version returned by userspace. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:55 -08:00
Miklos Szeredi	4633a22e7a	[PATCH] fuse: clean up page offset calculation Use page_offset() instead of doing page offset calculation by hand. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-06 08:33:55 -08:00

... 6 7 8 9 10 ...

777 Commits