NFSv4 delegations are stored in a global list, but they are nfs4_client
dependent, and nfs4_client is already network namespace aware.
State shutdown and the laundromat are done per network namespace as well,
so delegation unhashing has to be done in network namespace context too.
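A rough sketch of what per-net unhashing looks like (the helper names and the
delegation_client() accessor are illustrative, not the exact patch; only
del_recall_lru and recall_lock come from the existing code):

    static void unhash_delegations_for_net(struct net *net,
                                           struct list_head *reaplist)
    {
            struct nfs4_delegation *dp, *next;

            spin_lock(&recall_lock);
            list_for_each_entry_safe(dp, next, &del_recall_lru, dl_recall_lru) {
                    /* skip delegations owned by clients from other namespaces */
                    if (!net_eq(delegation_client(dp)->net, net))
                            continue;
                    list_move(&dp->dl_recall_lru, reaplist);
            }
            spin_unlock(&recall_lock);
    }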
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This lock protects the client lru list and session hash table, which are
allocated per network namespace already.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Protecting __nfs4_state_shutdown() with nfs4_lock_state() looks redundant.
This function is called by the last NFSd thread on its exit, and the state lock
actually protects two functions (del_recall_lru is protected by recall_lock):
1) nfsd4_client_tracking_exit
2) __nfs4_state_shutdown_net
"nfsd4_client_tracking_exit" doesn't require state lock protection, because it's
state can be modified only by tracker callbacks.
Here a re they:
1) create: is called only from nfsd4_proc_compound.
2) remove: is called from either nfsd4_proc_compound or nfs4_laundromat.
3) check: is called only from nfsd4_proc_compound.
4) grace_done: called only from nfs4_laundromat.
nfsd4_proc_compound is called only by an NFSd kthread, which is exiting right
now.
nfs4_laundromat is called by laundry_wq. But laundromat_work was canceled
already.
"__nfs4_state_shutdown_net" also doesn't require state lock protection,
because all NFSd kthreads are dead, and no race can happen with NFSd start,
because "nfsd_up" flag is still set.
Moreover, all Nfsd shutdown is protected with global nfsd_mutex.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
That function is only called under nfsd_mutex: we know that because the
only caller is nfsd_svc, via
nfsd_svc
  nfsd_startup
    nfs4_state_start
      nfsd4_client_tracking_init
        client_tracking_ops->init == nfsd4_load_reboot_recovery_data
The shared state accessed here includes:
- user_recovery_dirname: used here, modified only by
nfs4_reset_recoverydir, which can be verified to only be
called under nfsd_mutex.
- filesystem state, protected by i_mutex (handwaving slightly
here)
- rec_file, reclaim_str_hashtbl, reclaim_str_hashtbl_size: other
than here, used only from code called from nfsd or laundromat
threads, both of which should be started only after this runs
(see nfsd_svc) and stopped before this could run again (see
nfsd_shutdown, called from nfsd_last_thread).
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The spec requires badname, not inval, in these cases.
Some callers want us to return enoent, but I can see no justification
for that.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Linus has pointed out that indiscriminate use of BUG()s can make it
harder to diagnose bugs because they can bring a machine down, often
before we manage to get any useful debugging information to the logs.
(Consider, for example, a BUG() that fires in a workqueue, or while
holding a spinlock).
Most of these BUG()s won't do much more than kill an nfsd thread, but it
would still probably be safer to get out the warning without dying.
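As a rough illustration of the pattern (the invariant and the error code below
are placeholders, not a specific hunk from this patch):

    /* before: BUG_ON(!some_invariant_holds); */
    if (WARN_ON_ONCE(!some_invariant_holds))
            return nfserr_serverfault;      /* fail this request, keep the box up */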
There's still more of this to do in nfsd/.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Our server rejects compounds containing more than one write operation.
It's unclear whether this is really permitted by the spec; with 4.0,
it's possibly OK, with 4.1 (which has clearer limits on compound
parameters), it's probably not OK. No client that we're aware of has
ever done this, but in theory it could be useful.
The source of the limitation: we need an array of iovecs to pass to the
write operation. In the worst case that array of iovecs could have
hundreds of elements (the maximum rwsize divided by the page size), so
it's too big to put on the stack, or in each compound op. So we instead
keep a single such array in the compound argument.
We fill in that array at the time we decode the xdr operation.
But we decode every op in the compound before executing any of them. So
once we've used that array we can't decode another write.
If we instead delay filling in that array till the time we actually
perform the write, we can reuse it.
Another option might be to switch to decoding compound ops one at a
time. I considered doing that, but it has a number of other side
effects, and I'd rather fix just this one problem for now.
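Roughly, with illustrative names, decode now just records the pages and a
helper builds the iovec array when the write actually executes, so the single
per-compound array can serve every write op in the compound:

    /* Build the shared per-compound iovec array just before doing the write;
     * once the write completes, the array is free for the next write op. */
    static int fill_in_write_vector(struct kvec *vec, void *head_base,
                                    u32 head_len, struct page **pages, u32 total)
    {
            int i = 1;

            vec[0].iov_base = head_base;
            vec[0].iov_len = min_t(u32, head_len, total);
            total -= vec[0].iov_len;
            while (total) {
                    vec[i].iov_base = page_address(pages[i - 1]);
                    vec[i].iov_len = min_t(u32, total, PAGE_SIZE);
                    total -= vec[i].iov_len;
                    i++;
            }
            return i;       /* number of iovec entries actually used */
    }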
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The comment here is totally bogus:
- OP_WRITE + 1 is RELEASE_LOCKOWNER. Maybe there was some older
version of the spec in which that served as a sort of
OP_ILLEGAL? No idea, but it's clearly wrong now.
- In any case, I can't see that the spec says anything about
what to do if the client sends us less ops than promised.
It's clearly nutty client behavior, and we should do
whatever's easiest: returning an xdr error (even though it
won't be consistent with the error on the last op returned)
seems fine to me.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Very embarrassing: 1091006c5e "nfsd: turn
on reply cache for NFSv4" missed a line, effectively leaving the reply
cache off in the v4 case. I thought I'd tested that, but I guess not.
This time, wrote a pynfs test to confirm it works.
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This patch moves laundromat_work into the nfsd per-net context, thus allowing
multiple laundromats to run.
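Roughly, the delayed work moves into the per-net structure so each namespace
reschedules its own laundromat (names here are illustrative):

    struct nfsd_net {
            /* ... */
            struct delayed_work laundromat_work;
    };

    static void laundromat_main(struct work_struct *work)
    {
            struct nfsd_net *nn = container_of(work, struct nfsd_net,
                                               laundromat_work.work);
            time_t t = nfs4_laundromat(nn);         /* assumed per-net variant */

            queue_delayed_work(laundry_wq, &nn->laundromat_work, t * HZ);
    }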
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Passing the net context looks like overkill.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This patch replaces init_net with SVC_NET() where possible, and also passes
the proper context to nested functions where required.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This list holds the nfs4 clients' (open) stateowner queue for last close
replay. nfs4 clients are network namespace aware, so let's make this list per
network namespace too.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This list holds the nfs4 clients' queue for lease renewal. nfs4 clients are
network namespace aware, so let's make this list per network namespace too.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This hash holds established session state and is closely associated with the
nfs4_client info, which is network namespace aware. So let's allocate it per
network namespace too.
Note: this hash could be allocated in per-net operations, but it looks better
to allocate it on nfsd state start and thus not waste resources if the server
is not running.
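A sketch of the allocate-at-state-start approach (the size constant and field
names are illustrative):

    static int nfs4_state_start_net(struct net *net)
    {
            struct nfsd_net *nn = net_generic(net, nfsd_net_id);
            int i;

            nn->sessionid_hashtbl = kmalloc(sizeof(struct list_head) *
                                            SESSION_HASH_SIZE, GFP_KERNEL);
            if (!nn->sessionid_hashtbl)
                    return -ENOMEM;
            for (i = 0; i < SESSION_HASH_SIZE; i++)
                    INIT_LIST_HEAD(&nn->sessionid_hashtbl[i]);
            /* the matching kfree() happens on nfsd state shutdown for this net */
            return 0;
    }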
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This hash holds file lock owners and is closely associated with the
nfs4_client info, which is network namespace aware. So let's allocate it per
network namespace too.
Note: this hash could be allocated in per-net operations, but it looks better
to allocate it on nfsd state start and thus not waste resources if the server
is not running.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This hash holds open owner state and is closely associated with the
nfs4_client info, which is network namespace aware. So let's allocate it per
network namespace too.
Note: this hash could be allocated in per-net operations, but it looks better
to allocate it on nfsd state start and thus not waste resources if the server
is not running.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This hash holds nfs4_client info, which is network namespace aware.
So let's allocate it per network namespace.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This hash holds nfs4_client info, which is network namespace aware.
So let's allocate it per network namespace.
Note: this hash could be allocated in per-net operations, but it looks better
to allocate it on nfsd state start and thus not waste resources if the server
is not running.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This tree holds nfs4_client info, which is network namespace aware.
So let's make it per network namespace.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This hash holds nfs4_client info, which is network namespace aware.
So let's allocate it per network namespace.
Note: this hash could be allocated in per-net operations, but it looks better
to allocate it on nfsd state start and thus not waste resources if the server
is not running.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This hash holds nfs4_client info, which is network namespace aware.
So let's allocate it per network namespace.
Note: this hash is used only by the legacy tracker, so let's allocate it in
the tracker init.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Commit bbf43dc888 "sunrpc/cache.h: replace
simple_strtoul" introduced new range-checking which could cause get_int
to fail on unsigned integers too large to be represented as an int.
We could parse them as unsigned instead--but it turns out svcgssd is
actually passing down "-1" in some cases. Which is perhaps stupid, but
there's nothing we can do about it now.
So just revert back to the previous "sloppy" behavior that accepts
either representation.
Cc: stable@vger.kernel.org
Reported-by: Sven Geggus <lists@fuchsschwanzdomain.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The current code holds on to this list until nfsd is shut down, but it's
never touched once the grace period ends. Release that memory back into
the wild when the grace period ends.
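Roughly (bucket and field names are illustrative), that just means walking the
reclaim hash when the grace period ends and freeing every record:

    static void release_reclaim_records(void)
    {
            struct nfs4_client_reclaim *crp, *next;
            int i;

            for (i = 0; i < CLIENT_HASH_SIZE; i++) {
                    list_for_each_entry_safe(crp, next,
                                             &reclaim_str_hashtbl[i], cr_strhash) {
                            list_del(&crp->cr_strhash);
                            kfree(crp);
                    }
            }
    }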
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Remove the cl_recdir field from the nfs4_client struct. Instead, just
compute it on the fly when and if it's needed, which is now only when
the legacy client tracking code is in effect.
The error handling in the legacy client tracker is also changed to
handle the case where md5 is unavailable. In that case, we'll warn
the admin with a KERN_ERR message and disable the client tracking.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The current code requires that we md5 hash the name in order to store
the client in the confirmed and unconfirmed trees. Change it instead
to store the clients in a pair of rbtrees, and simply compare the
cl_names directly instead of hashing them. This also necessitates that
we add a new flag to the clp->cl_flags field to indicate which tree
the client is currently in.
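The comparison that orders the trees can be as simple as the sketch below
(helper name is illustrative): order by length first, then memcmp the opaque
client names, so no hashing is needed.

    static int compare_blob(const struct xdr_netobj *o1,
                            const struct xdr_netobj *o2)
    {
            if (o1->len < o2->len)
                    return -1;
            if (o1->len > o2->len)
                    return 1;
            return memcmp(o1->data, o2->data, o1->len);
    }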
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
When nfsd starts, the legacy reboot recovery code creates a tracking
struct for each directory in the v4recoverydir. When the grace period
ends, it basically does a "readdir" on the directory again, and matches
each dentry in there to an existing client id to see if it should be
removed or not. If the matching client doesn't exist, or hasn't
reclaimed its state, then it will remove that dentry.
This is pretty inefficient since it involves doing a lot of hash-bucket
searching. It also means that we have to keep relying on being able to
search for an nfs4_client by its md5-hashed cl_recdir name.
Instead, add a pointer to the nfs4_client that indicates the association
between the nfs4_client_reclaim and nfs4_client. When a reclaim operation
comes in, we set the pointer to make that association. On gracedone, the
legacy client tracker will keep the recdir around iff:
1/ there is a reclaim record for the directory
...and...
2/ there's an association between the reclaim record and a client record
-- that is, a create or check operation was performed on the client that
matches that directory.
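The gracedone decision for each directory then reduces to something like this
sketch (the lookup and removal helpers are hypothetical):

    static int purge_one_recdir(const char *name)
    {
            struct nfs4_client_reclaim *crp;

            crp = nfsd4_find_reclaim_client_by_name(name); /* hypothetical lookup */
            if (crp && crp->cr_clp)
                    return 0;       /* keep it: a reclaim came in and matched a client */
            return remove_recdir(name);     /* hypothetical: unlink the stale dir */
    }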
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Later callers will need to make changes to the record.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
We'll need to be able to call this from nfs4recover.c eventually.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Currently, it takes a client pointer, but later we're going to need to
search for these records without knowing whether a matching client even
exists.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Let's shoot for removing the nfsdcld upcall in 3.10. Most likely,
no one is actually using it, so I don't expect this warning to
fire often (except maybe on misconfigured systems).
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The usermodehelper upcall program can then decide to use this info as
a (one-way) transition mechanism to the new scheme. When a "check"
upcall occurs and the client doesn't exist in the database, we can
look to see whether the directory exists. If it does, then we'd add
the client to the database, remove the legacy recdir, and return
success to the kernel to allow the recovery to proceed.
For gracedone, we simply pass the v4recovery "topdir" so that the
upcall can clean it out prior to returning to the kernel.
A module parm is also added to disable the legacy conversion if
the admin chooses.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
First, try to use the new usermodehelper upcall. It should succeed or
fail quickly, so there's little cost to doing so.
If it fails, and the legacy tracking dir exists, use that. If it
doesn't exist then fall back to using nfsdcld.
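That selection order, sketched with illustrative ops and helper names:

    static struct nfsd4_client_tracking_ops *choose_tracking_ops(struct net *net)
    {
            if (nfsd4_umh_tracking_ops.init(net) == 0)  /* 1) try the UMH upcall */
                    return &nfsd4_umh_tracking_ops;
            if (legacy_recoverydir_exists(net))         /* 2) hypothetical check */
                    return &nfsd4_legacy_tracking_ops;
            return &nfsd4_cld_tracking_ops;             /* 3) fall back to nfsdcld */
    }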
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Add a new client tracker upcall type that uses call_usermodehelper to
call out to a program. This seems to be the preferred method of
calling out to usermode these days for seldom-called upcalls. It's
simple and doesn't require a running daemon, so it should "just work"
as long as the binary is installed.
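The core of the upcall is just call_usermodehelper; a rough sketch (the
program path and argument layout are illustrative):

    static int nfsd4_umh_cltrack_upcall(char *cmd, char *arg)
    {
            char *argv[] = { "/sbin/nfsdcltrack", cmd, arg, NULL };
            char *envp[] = { NULL };
            int ret;

            ret = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_PROC);
            /* non-zero means the helper failed to run or exited non-zero */
            return ret ? -ENXIO : 0;
    }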
The client tracking exit operation is also changed to check for a
NULL pointer before running. The UMH upcall doesn't need to do anything
at module teardown time.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
If the credential save fails, then we'll leak our mnt_want_write_file
reference.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
For now this only adds support for AUTH_NULL. (Previously we assumed
AUTH_UNIX.) We'll also need AUTH_GSS, which is trickier.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
We're currently ignoring the callback security parameters specified in
create_session, and just assuming the client wants auth_sys, because
that's all the current linux client happens to care about. But this
could cause us callbacks to fail to a client that wanted something
different.
For now, all we're doing is no longer ignoring the uid and gid passed in
the auth_sys case. Further patches will add support for auth_null and
gss (and possibly use more of the auth_sys information; the spec wants
us to use exactly the credential we're passed, though it's hard to
imagine why a client would care).
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
NFSv4 shares the same struct file across multiple writes. (And we'd
like NFSv2 and NFSv3 to do that as well some day.)
So setting O_SYNC on the struct file as a way to request a synchronous
write doesn't work.
Instead, do a vfs_fsync_range() in that case.
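In the write path that means, roughly (variable names are taken from context
and are illustrative):

    /* the client asked for a stable write: sync just the range we wrote,
     * rather than relying on O_SYNC on the (now shared) struct file */
    if (stable && !host_err)
            host_err = vfs_fsync_range(file, offset, offset + cnt, 0);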
Reported-by: Peter Staubach <pstaubach@exagrid.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>