Commit Graph

857138 Commits

Author SHA1 Message Date
Vasily Averin d5880c7a86 fuse: fix missing unlock_page in fuse_writepage()
unlock_page() was missing in case of an already in-flight write against the
same page.

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Fixes: ff17be0864 ("fuse: writepage: skip already in flight")
Cc: <stable@vger.kernel.org> # v3.13
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-24 15:28:01 +02:00
Michael S. Tsirkin 501ae8ecae fuse: reserve byteswapped init opcodes
virtio fs tunnels fuse over a virtio channel.  One issue is two sides might
be speaking different endian-ness. To detects this, host side looks at the
opcode value in the FUSE_INIT command.  Works fine at the moment but might
fail if a future version of fuse will use such an opcode for
initialization.  Let's reserve this opcode so we remember and don't do
this.

Same for CUSE_INIT.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-12 14:59:41 +02:00
Vivek Goyal 15c8e72e88 fuse: allow skipping control interface and forced unmount
virtio-fs does not support aborting requests which are being
processed. That is requests which have been sent to fuse daemon on host.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-12 14:59:41 +02:00
Miklos Szeredi 783863d647 fuse: dissociate DESTROY from fuseblk
Allow virtio-fs to also send DESTROY request.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-12 14:59:41 +02:00
Miklos Szeredi 8fab010644 fuse: delete dentry if timeout is zero
Don't hold onto dentry in lru list if need to re-lookup it anyway at next
access.  Only do this if explicitly enabled, otherwise it could result in
performance regression.

More advanced version of this patch would periodically flush out dentries
from the lru which have gone stale.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-12 14:59:41 +02:00
Vivek Goyal 0cd1eb9a41 fuse: separate fuse device allocation and installation in fuse_conn
As of now fuse_dev_alloc() both allocates a fuse device and installs it in
fuse_conn list.  fuse_dev_alloc() can fail if fuse_device allocation fails.

virtio-fs needs to initialize multiple fuse devices (one per virtio queue).
It initializes one fuse device as part of call to fuse_fill_super_common()
and rest of the devices are allocated and installed after that.

But, we can't afford to fail after calling fuse_fill_super_common() as we
don't have a way to undo all the actions done by fuse_fill_super_common().
So to avoid failures after the call to fuse_fill_super_common(),
pre-allocate all fuse devices early and install them into fuse connection
later.

This patch provides two separate helpers for fuse device allocation and
fuse device installation in fuse_conn.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-12 14:59:41 +02:00
Stefan Hajnoczi ae3aad77f4 fuse: add fuse_iqueue_ops callbacks
The /dev/fuse device uses fiq->waitq and fasync to signal that requests are
available.  These mechanisms do not apply to virtio-fs.  This patch
introduces callbacks so alternative behavior can be used.

Note that queue_interrupt() changes along these lines:

  spin_lock(&fiq->waitq.lock);
  wake_up_locked(&fiq->waitq);
+ kill_fasync(&fiq->fasync, SIGIO, POLL_IN);
  spin_unlock(&fiq->waitq.lock);
- kill_fasync(&fiq->fasync, SIGIO, POLL_IN);

Since queue_request() and queue_forget() also call kill_fasync() inside
the spinlock this should be safe.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-12 14:59:41 +02:00
Stefan Hajnoczi 0cc2656cdb fuse: extract fuse_fill_super_common()
fuse_fill_super() includes code to process the fd= option and link the
struct fuse_dev to the fd's struct file.  In virtio-fs there is no file
descriptor because /dev/fuse is not used.

This patch extracts fuse_fill_super_common() so that both classic fuse and
virtio-fs can share the code to initialize a mount.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-12 14:59:40 +02:00
Vivek Goyal 4388c5aac4 fuse: export fuse_dequeue_forget() function
File systems like virtio-fs need to do not have to play directly with
forget list data structures. There is a helper function use that instead.

Rename dequeue_forget() to fuse_dequeue_forget() and export it so that
stacked filesystems can use it.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-12 14:59:40 +02:00
Stefan Hajnoczi 79d96efffd fuse: export fuse_get_unique()
virtio-fs will need unique IDs for FORGET requests from outside
fs/fuse/dev.c.  Make the symbol visible.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-12 14:59:40 +02:00
Vivek Goyal 95a84cdb11 fuse: export fuse_send_init_request()
This will be used by virtio-fs to send init request to fuse server after
initialization of virt queues.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-12 14:59:40 +02:00
Stefan Hajnoczi 14d46d7abc fuse: export fuse_len_args()
virtio-fs will need to query the length of fuse_arg lists.  Make the symbol
visible.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-12 14:59:40 +02:00
Stefan Hajnoczi 04ec5af077 fuse: export fuse_end_request()
virtio-fs will need to complete requests from outside fs/fuse/dev.c.  Make
the symbol visible.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-12 14:59:40 +02:00
Miklos Szeredi f22f812d5c fuse: fix request limit
The size of struct fuse_req was reduced from 392B to 144B on a non-debug
config, thus the sanitize_global_limit() helper was setting a larger
default limit.  This doesn't really reflect reduction in the memory used by
requests, since the fields removed from fuse_req were added to fuse_args
derived structs; e.g. sizeof(struct fuse_writepages_args) is 248B, thus
resulting in slightly more memory being used for writepage requests
overalll (due to using 256B slabs).

Make the calculatation ignore the size of fuse_req and use the old 392B
value.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-12 14:59:40 +02:00
Miklos Szeredi 05ea48cc2b fuse: stop copying pages to fuse_req
The page array pointers are also duplicated across fuse_args_pages and
fuse_req.  Get rid of the fuse_req ones.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:50 +02:00
Miklos Szeredi d49937749f fuse: stop copying args to fuse_req
No need to duplicate the argument arrays in fuse_req, so just dereference
req->args instead of copying to the fuse_req internal ones.

This allows further cleanup of the fuse_req structure.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:50 +02:00
Miklos Szeredi 145b673bd2 fuse: clean up fuse_req
Get rid of request specific fields in fuse_req that are not used anymore.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:50 +02:00
Miklos Szeredi 7213394c4e fuse: simplify request allocation
Page arrays are not allocated together with the request anymore.  Get rid
of the dead code

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:50 +02:00
Miklos Szeredi 66abc3599c fuse: unexport request ops
All requests are now sent with one of the fuse_simple_... helpers.  Get rid
of the old api from the fuse internal header.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:50 +02:00
Miklos Szeredi 75b399dda5 fuse: convert retrieve to simple api
Rename fuse_request_send_notify_reply() to fuse_simple_notify_reply() and
convert to passing fuse_args instead of fuse_req.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:50 +02:00
Miklos Szeredi 4cb548666e fuse: convert release to simple api
Since we cannot reserve the request structure up-front, make sure that the
request allocation doesn't fail using __GFP_NOFAIL.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:50 +02:00
Miklos Szeredi b50ef7c52a cuse: convert init to simple api
This is a straightforward conversion.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:49 +02:00
Miklos Szeredi 615047eff1 fuse: convert init to simple api
Bypass the fc->initialized check by setting the force flag.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:49 +02:00
Miklos Szeredi 33826ebbbe fuse: convert writepages to simple api
Derive fuse_writepage_args from fuse_io_args.

Sending the request is tricky since it was done with fi->lock held, hence
we must either use atomic allocation or release the lock.  Both are
possible so try atomic first and if it fails, release the lock and do the
regular allocation with GFP_NOFS and __GFP_NOFAIL.  Both flags are
necessary for correct operation.

Move the page realloc function from dev.c to file.c and convert to using
fuse_writepage_args.

The last caller of fuse_write_fill() is gone, so get rid of it.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:49 +02:00
Miklos Szeredi 43f5098eb8 fuse: convert readdir to simple api
The old fuse_read_fill() helper can be deleted, now that the last user is
gone.

The fuse_io_args struct is moved to fuse_i.h so it can be shared between
readdir/read code.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:49 +02:00
Miklos Szeredi 134831e36b fuse: convert readpages to simple api
Need to extend fuse_io_args with 'attr_ver' and 'ff' members, that take the
functionality of the same named members in fuse_req.

fuse_short_read() can now take struct fuse_args_pages.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:49 +02:00
Miklos Szeredi 45ac96ed7c fuse: convert direct_io to simple api
Change of semantics in fuse_async_req_send/fuse_send_(read|write): these
can now return error, in which case the 'end' callback isn't called, so the
fuse_io_args object needs to be freed.

Added verification that the return value is sane (less than or equal to the
requested read/write size).

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:49 +02:00
Miklos Szeredi 1259728731 fuse: add simple background helper
Create a helper named fuse_simple_background() that is similar to
fuse_simple_request().  Unlike the latter, it returns immediately and calls
the supplied 'end' callback when the reply is received.

The supplied 'args' pointer is stored in 'fuse_req' which allows the
callback to interpret the output arguments decoded from the reply.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:49 +02:00
Miklos Szeredi 338f2e3f33 fuse: convert sync write to simple api
Extract a fuse_write_flags() helper that converts ki_flags relevant write
to open flags.

The other parts of fuse_send_write() aren't used in the
fuse_perform_write() case.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:49 +02:00
Miklos Szeredi 00793ca5d4 fuse: covert readpage to simple api
Derive fuse_io_args from struct fuse_args_pages.  This will be used for
both synchronous and asynchronous read/write requests.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:49 +02:00
Miklos Szeredi a0d45d84f4 fuse: fuse_short_read(): don't take fuse_req as argument
This will allow the use of this function when converting to the simple api
(which doesn't use fuse_req).

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:49 +02:00
Miklos Szeredi 093f38a2c1 fuse: convert ioctl to simple api
fuse_simple_request() is converted to return length of last (instead of
single) out arg, since FUSE_IOCTL_OUT has two out args, the second of which
is variable length.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:49 +02:00
Miklos Szeredi 4c4f03f78c fuse: move page alloc
fuse_req_pages_alloc() is moved to file.c, since its internal use by the
device code will eventually be removed.

Rename to fuse_pages_alloc() to signify that it's not only usable for
fuse_req page array.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:49 +02:00
Miklos Szeredi 4c29afece8 fuse: convert readlink to simple api
Also turn BUG_ON into gracefully recovered WARN_ON.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:49 +02:00
Miklos Szeredi 68583165f9 fuse: add pages to fuse_args
Derive fuse_args_pages from fuse_args. This is used to handle requests
which use pages for input or output.  The related flags are added to
fuse_args.

New FR_ALLOC_PAGES flags is added to indicate whether the page arrays in
fuse_req need to be freed by fuse_put_request() or not.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:49 +02:00
Miklos Szeredi 1ccd1ea249 fuse: convert destroy to simple api
We can use the "force" flag to make sure the DESTROY request is always sent
to userspace.  So no need to keep it allocated during the lifetime of the
filesystem.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:49 +02:00
Miklos Szeredi e413754b26 fuse: add nocreds to fuse_args
In some cases it makes no sense to set pid/uid/gid fields in the request
header.  Allow fuse_simple_background() to omit these.  This is only
required in the "force" case, so for now just WARN if set otherwise.

Fold fuse_get_req_nofail_nopages() into its only caller.  Comment is
obsolete anyway.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:49 +02:00
Miklos Szeredi 3545fe2112 fuse: convert fuse_force_forget() to simple api
Move this function to the readdir.c where its only caller resides.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:48 +02:00
Miklos Szeredi 454a7613f5 fuse: add noreply to fuse_args
This will be used by fuse_force_forget().

We can expand fuse_request_send() into fuse_simple_request().  The
FR_WAITING bit has already been set, no need to check.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:48 +02:00
Miklos Szeredi c500ebaa90 fuse: convert flush to simple api
Add 'force' to fuse_args and use fuse_get_req_nofail_nopages() to allocate
the request in that case.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:48 +02:00
Miklos Szeredi 40ac7ab2d0 fuse: simplify 'nofail' request
Instead of complex games with a reserved request, just use __GFP_NOFAIL.

Both calers (flush, readdir) guarantee that connection was already
initialized, so no need to wait for fc->initialized.

Also remove unneeded clearing of FR_BACKGROUND flag.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:48 +02:00
Miklos Szeredi 1f4e9d03d1 fuse: rearrange and resize fuse_args fields
This makes the structure better packed.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:48 +02:00
Miklos Szeredi d5b4854357 fuse: flatten 'struct fuse_args'
...to make future expansion simpler.  The hiearachical structure is a
historical thing that does not serve any practical purpose.

The generated code is excatly the same before and after the patch.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:48 +02:00
Eric Biggers 76e43c8cca fuse: fix deadlock with aio poll and fuse_iqueue::waitq.lock
When IOCB_CMD_POLL is used on the FUSE device, aio_poll() disables IRQs
and takes kioctx::ctx_lock, then fuse_iqueue::waitq.lock.

This may have to wait for fuse_iqueue::waitq.lock to be released by one
of many places that take it with IRQs enabled.  Since the IRQ handler
may take kioctx::ctx_lock, lockdep reports that a deadlock is possible.

Fix it by protecting the state of struct fuse_iqueue with a separate
spinlock, and only accessing fuse_iqueue::waitq using the versions of
the waitqueue functions which do IRQ-safe locking internally.

Reproducer:

	#include <fcntl.h>
	#include <stdio.h>
	#include <sys/mount.h>
	#include <sys/stat.h>
	#include <sys/syscall.h>
	#include <unistd.h>
	#include <linux/aio_abi.h>

	int main()
	{
		char opts[128];
		int fd = open("/dev/fuse", O_RDWR);
		aio_context_t ctx = 0;
		struct iocb cb = { .aio_lio_opcode = IOCB_CMD_POLL, .aio_fildes = fd };
		struct iocb *cbp = &cb;

		sprintf(opts, "fd=%d,rootmode=040000,user_id=0,group_id=0", fd);
		mkdir("mnt", 0700);
		mount("foo",  "mnt", "fuse", 0, opts);
		syscall(__NR_io_setup, 1, &ctx);
		syscall(__NR_io_submit, ctx, 1, &cbp);
	}

Beginning of lockdep output:

	=====================================================
	WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
	5.3.0-rc5 #9 Not tainted
	-----------------------------------------------------
	syz_fuse/135 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
	000000003590ceda (&fiq->waitq){+.+.}, at: spin_lock include/linux/spinlock.h:338 [inline]
	000000003590ceda (&fiq->waitq){+.+.}, at: aio_poll fs/aio.c:1751 [inline]
	000000003590ceda (&fiq->waitq){+.+.}, at: __io_submit_one.constprop.0+0x203/0x5b0 fs/aio.c:1825

	and this task is already holding:
	0000000075037284 (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq include/linux/spinlock.h:363 [inline]
	0000000075037284 (&(&ctx->ctx_lock)->rlock){..-.}, at: aio_poll fs/aio.c:1749 [inline]
	0000000075037284 (&(&ctx->ctx_lock)->rlock){..-.}, at: __io_submit_one.constprop.0+0x1f4/0x5b0 fs/aio.c:1825
	which would create a new lock dependency:
	 (&(&ctx->ctx_lock)->rlock){..-.} -> (&fiq->waitq){+.+.}

	but this new dependency connects a SOFTIRQ-irq-safe lock:
	 (&(&ctx->ctx_lock)->rlock){..-.}

	[...]

Reported-by: syzbot+af05535bb79520f95431@syzkaller.appspotmail.com
Reported-by: syzbot+d86c4426a01f60feddc7@syzkaller.appspotmail.com
Fixes: bfe4037e72 ("aio: implement IOCB_CMD_POLL")
Cc: <stable@vger.kernel.org> # v4.19+
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-10 16:29:29 +02:00
David Howells c7eb686963 vfs: subtype handling moved to fuse
The unused vfs code can be removed.  Don't pass empty subtype (same as if
->parse callback isn't called).

The bits that are left involve determining whether it's permitted to split the
filesystem type string passed in to mount(2).  Consequently, this means that we
cannot get rid of the FS_HAS_SUBTYPE flag unless we define that a type string
with a dot in it always indicates a subtype specification.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-06 21:28:49 +02:00
David Howells c30da2e981 fuse: convert to use the new mount API
Convert the fuse filesystem to the new internal mount API as the old
one will be obsoleted and removed.  This allows greater flexibility in
communication of mount parameters between userspace, the VFS and the
filesystem.

See Documentation/filesystems/mount_api.txt for more information.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-06 21:27:09 +02:00
Miklos Szeredi bf9261b818 Merge branch 'work.mount-base' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs into HEAD
Mount API convertion of fuse needs get_tree_bdev().
2019-09-06 21:22:58 +02:00
David Howells 0f07100410 mtd: Provide fs_context-aware mount_mtd() replacement
Provide a function, get_tree_mtd(), to replace mount_mtd(), using an
fs_context struct to hold the parameters.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: David Woodhouse <dwmw2@infradead.org>
cc: Brian Norris <computersforpeace@gmail.com>
cc: Boris Brezillon <bbrezillon@kernel.org>
cc: Marek Vasut <marek.vasut@gmail.com>
cc: Richard Weinberger <richard@nod.at>
cc: linux-mtd@lists.infradead.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-09-05 14:34:23 -04:00
David Howells fe62c3a4e1 vfs: Create fs_context-aware mount_bdev() replacement
Create a function, get_tree_bdev(), that is fs_context-aware and a
->get_tree() counterpart of mount_bdev().

It caches the block device pointer in the fs_context struct so that this
information can be passed into sget_fc()'s test and set functions.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: linux-block@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-09-05 14:34:22 -04:00
Al Viro 533770cc0a new helper: get_tree_keyed()
For vfs_get_keyed_super users.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-09-05 14:34:22 -04:00