OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Jann Horn	78fee0b684	orangefs: fix namespace handling In orangefs_inode_getxattr(), an fsuid is written to dmesg. The kuid is converted to a userspace uid via from_kuid(current_user_ns(), [...]), but since dmesg is global, init_user_ns should be used here instead. In copy_attributes_from_inode(), op_alloc() and fill_default_sys_attrs(), upcall structures are populated with uids/gids that have been mapped into the caller's namespace. However, those upcall structures are read by another process (the userspace filesystem driver), and that process might be running in another namespace. This effectively lets any user spoof its uid and gid as seen by the userspace filesystem driver. To fix the second issue, I just construct the opcall structures with init_user_ns uids/gids and require the filesystem server to run in the init namespace. Since orangefs is full of global state anyway (as the error message in DUMP_DEVICE_ERROR explains, there can only be one userspace orangefs filesystem driver at once), that shouldn't be a problem. [ Why does orangefs even exist in the kernel if everything does upcalls into userspace? What does orangefs do that couldn't be done with the FUSE interface? If there is no good answer to those questions, I'd prefer to see orangefs kicked out of the kernel. Can that be done for something that shipped in a release? According to commit `f7ab093f74` ("Orangefs: kernel client part 1"), they even already have a FUSE daemon, and the only rational reason (apart from "but most of our users report preferring to use our kernel module instead") given for not wanting to use FUSE is one "in-the-works" feature that could probably be integated into FUSE instead. ] This patch has been compile-tested. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-07-05 15:47:43 -04:00
Al Viro	45996492e5	orangefs: fix orangefs_superblock locking * switch orangefs_remount() to taking ORANGEFS_SB(sb) instead of sb * remove from the list _before_ orangefs_unmount() - request_mutex in the latter will make sure that nothing observed in the loop in ORANGEFS_DEV_REMOUNT_ALL handling will get freed until the end of loop * on removal, keep the forward pointer and zero the back one. That way we can drop and regain the spinlock in the loop body (again, ORANGEFS_DEV_REMOUNT_ALL one) and still be able to get to the rest of the list. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-03-26 07:22:00 -04:00
Mike Marshall	53f57fef43	Orangefs: Extra sanity insurance on buffer before using string functions on it. Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-03-14 15:48:28 -04:00
Martin Brandenburg	acfcbaf192	orangefs: make fs_mount_pending static Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-03-09 13:26:39 -05:00
Mike Marshall	9d9e7ba9ee	Orangefs: improve gossip statements Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-03-03 13:46:48 -05:00
Mike Marshall	9f08cfe944	Orangefs: update orangefs.txt Al Viro has cleaned up the way ops are processed and waited for, now orangefs.txt has an overview of how it works. Several recent related commits have added to the comments in the code as well. Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-26 14:39:08 -05:00
Mike Marshall	ca9f518ead	Orangefs: code sanitation. Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-26 10:21:12 -05:00
Mike Marshall	adcf34a289	Orangefs: code sanitation Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-24 16:54:27 -05:00
Al Viro	05a50a5be8	orangefs: have ..._clean_interrupted_...() wait for copy to/from daemon * turn all those list_del(&op->list) into list_del_init() * don't pick ops that are already given up in control device ->read()/->write_iter(). * have orangefs_clean_interrupted_operation() notice if op is currently being copied to/from daemon (by said ->read()/->write_iter()) and wait for that to finish. * when we are done copying to/from daemon and find that it had been given up while we were doing that, wake the waiting ..._clean_interrupted_... As the result, we are guaranteed that orangefs_clean_interrupted_operation(op) doesn't return until nobody else can see op. Moreover, we don't need to play with op refcounts anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-19 13:45:56 -05:00
Al Viro	5964c1b839	orangefs: set correct ->downcall.status on failing to copy reply from daemon ... and clean the end of control device ->write_iter() while we are at it Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-19 13:45:55 -05:00
Al Viro	897c5df6cf	orangefs: get rid of op->done shouldn't be needed now Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-19 13:45:55 -05:00
Al Viro	ea2c9c9f65	orangefs: bufmap rewrite new waiting-for-slot logics: * make request for slot wait for bufmap to be set up if it comes before it's installed OR while it's running down * make closing control device wait for all slots to be freed * waiting itself rewritten to (open-coded) analogues of wait_event_... primitives - we would need wait_event_locked() and, pardon an obscenely long name, wait_event_interruptible_exclusive_timeout_locked(). * we never wait for more than slot_timeout_secs in total and, if during the wait the daemon goes away, we only allow ORANGEFS_BUFMAP_WAIT_TIMEOUT_SECS for it to come back. * (cosmetical) bitmap is used instead of an array of zeroes and ones * old (and only reached if we are about to corrupt memory) waiting for daemon restart in service_operation() removed. [Martin's fixes folded] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-19 13:45:54 -05:00
Al Viro	78699e29fd	orangefs: delay freeing slot until cancel completes Make cancels reuse the aborted read/write op, to make sure they do not fail on lack of memory. Don't issue a cancel unless the daemon has seen our read/write, has not replied and isn't being shut down. If cancel is issued, don't wait for it to complete; stash the slot in there and just have it freed when cancel is finally replied to or purged (and delay dropping the reference until then, obviously). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-19 13:45:53 -05:00
Mike Marshall	5090c9670d	Orangefs: improve gossip statement There were two just alike, making it hard maybe to tell which one you were looking at in syslog... so I changed it a little by adding some extra interesting tidbits to it... Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-04 13:29:27 -05:00
Al Viro	2a9e5c2260	orangefs: don't reinvent completion.h... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-01-23 15:20:11 -05:00
Al Viro	4f55e39732	if ORANGEFS_VFS_OP_FILE_IO request had been given up, don't bother waiting ... we are not going to get woken up anyway, so it's just going to time out and whine. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-01-23 15:20:11 -05:00
Al Viro	727cbfea62	orangefs: get rid of MSECS_TO_JIFFIES All timeouts are in _seconds_, so all calls are of form MSECS_TO_JIFFIES(n * 1000), which is a convoluted way to spell n * HZ. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-01-23 15:20:11 -05:00
Al Viro	ed42fe0593	orangefs: hopefully saner op refcounting and locking * create with refcount 1 * make op_release() decrement and free if zero (i.e. old put_op() has become that). * mark when submitter has given up waiting; from that point nobody else can move between the lists, change state, etc. * have daemon read/write_iter grab a reference when picking op and always give it up in the end * don't put into hash until we know it's been successfully passed to daemon * move op->lock _lower_ than htab_in_progress_lock (and make sure to take it in purge_inprogress_ops()) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-01-23 13:03:12 -05:00
Al Viro	fee25ce125	orangefs: make sure that reopening pvfs2-req won't overlap with the end of close Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-01-23 12:55:24 -05:00
Al Viro	831d094979	orangefs: move wakeups into set_op_state_{serviced,purged}() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-01-23 12:42:43 -05:00
Al Viro	90e54e36c9	orangefs: ->poll() doesn't need spinlock not just for list_empty()... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-01-23 12:42:43 -05:00
Al Viro	8016387ce7	orangefs: kill ioctl32 rudiments Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-01-23 12:42:43 -05:00
Al Viro	83595db052	orangefs: ->poll() is only called between successful ->open() and ->release() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-01-23 12:42:43 -05:00
Al Viro	fb6d2526e9	orangefs: generic_file_open() is pointless for character devices Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-01-23 12:42:43 -05:00
Mike Marshall	cf0c27715b	Orangefs: make gossip statement more palatable to xtensa Thanks to Intel's kbuild test robot Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-01-19 12:04:40 -05:00
Mike Marshall	b3ae4755f5	Orangefs: implement .write_iter Until now, orangefs_devreq_write_iter has just been a wrapper for the old-fashioned orangefs_devreq_writev... linux would call .write_iter with "struct kiocb iocb" and "struct iov_iter iter" and .write_iter would just: return pvfs2_devreq_writev(iocb->ki_filp, iter->iov, iter->nr_segs, &iocb->ki_pos); Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-01-13 11:18:12 -05:00
Martin Brandenburg	7d2214858f	orangefs: Fix some more global namespace pollution. This only changes the names of things, so there is no functional change. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-01-04 16:21:46 -05:00
Martin Brandenburg	a762ae6dc5	orangefs: Remove ``aligned'' upcall and downcall length macros. There was previously MAX_ALIGNED_DEV_REQ_(UP\|DOWN)SIZE macros which evaluated to MAX_DEV_REQ_(UP\|DOWN)SIZE+8. As it is unclear what this is for, other than creating a situation where we accept more data than we can parse, it is removed. Signed-off-by: Mike Marshall <hubcap@omnibond.com> Signed-off-by: Martin Brandenburg <martin@omnibond.com>	2015-12-17 14:33:38 -05:00
Martin Brandenburg	90d26aa808	Orangefs: do not finalize bufmap if it was never initialized. Found by the infant Orangefs fuzzer... Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2015-12-15 15:37:53 -05:00
Mike Marshall	ce6c414e17	Orangefs: Don't wait the old-fashioned way. Get rid of add_wait_queue, set_current_state, etc, and use the wait_event() model. Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2015-12-14 14:54:46 -05:00
Mike Marshall	97f100277c	Orangefs: de-uglify orangefs_devreq_writev, and devorangefs-req.c in general AV dislikes many parts of orangefs_devreq_writev. Besides making orangefs_devreq_writev more easily readable and better commented, this patch makes an effort to address some of the problems: > The 5th is quietly ignored unless trailer_size is positive and > status is zero. If trailer_size > 0 && status == 0, you verify that > the length of the 5th segment is no more than trailer_size and copy > it to vmalloc'ed buffer. Without bothering to zero the rest of that > buffer out. It was just wrong to allow a 5th segment that is not exactly equal to trailer_size. Now that that's fixed, there's nothing to zero out in the vmalloced buffer - it is exactly the right size to hold the 5th segment. > Another API bogosity: when the 5th segment is present, successful writev() > returns the sum of sizes of the first 4. Added size of 5th segment to writev return... > if concatenation of the first 4 segments is longer than > 16 + sizeof(struct pvfs2_downcall_s) by no more than sizeof(long) => whine > and proceed with garbage. If 4th segment isn't exactly sizeof(struct pvfs2_downcall_s), whine and fail. > if the 32bit value 4 bytes into op->downcall is zero and 64bit > value following it is non-zero, the latter is interpreted as the size of > trailer data. The latter is what userspace claimed was the length of the trailer data. The kernel module now compares it to the trailer iovec's iov_len as a sanity check. > if there's no trailer, the 5th segment (if present) is completely ignored. Whine and fail if there should be no trailer, yet a 5th segment is present. > if vmalloc fails, act as if status (32bit at offset 5 into > op->downcall) had been -ENOMEM and don't look at the 5th segment at all. whine and fail with -ENOMEM. Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2015-12-14 13:32:05 -05:00
Mike Marshall	575e946125	Orangefs: change pvfs2 filenames to orangefs Also changed references within source files that referred to header files whose names had changed. Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2015-12-04 12:56:14 -05:00

32 Commits