OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
J. Bruce Fields	30c0e1ef0a	nfsd4: bad BUG() in preprocess_stateid_op It's OK for this function to return without setting filp--we do it in the special-stateid case. And there's a legitimate case where we can hit this, since we do permit reads on write-only stateid's. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-26 13:20:51 -04:00
Linus Torvalds	763008c435	Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFS: Fix an Oops in the NFSv4 atomic open code NFS: Fix the selection of security flavours in Kconfig NFS: fix the return value of nfs_file_fsync() rpcrdma: Fix SQ size calculation when memreg is FRMR xprtrdma: Do not truncate iova_start values in frmr registrations. nfs: Remove redundant NULL check upon kfree() nfs: Add "lookupcache" to displayed mount options NFS: allow close-to-open cache semantics to apply to root of NFS filesystem SUNRPC: fix NFS client over TCP hangs due to packet loss (Bug 16494)	2010-08-18 15:45:23 -07:00
Trond Myklebust	df486a2590	NFS: Fix the selection of security flavours in Kconfig Randy Dunlap reports: ERROR: "svc_gss_principal" [fs/nfs/nfs.ko] undefined! because in fs/nfs/Kconfig, NFS_V4 selects RPCSEC_GSS_KRB5 and/or in fs/nfsd/Kconfig, NFSD_V4 selects RPCSEC_GSS_KRB5. RPCSEC_GSS_KRB5 does 5 selects, but none of these is enforced/followed by the fs/nfs[d]/Kconfig configs: select SUNRPC_GSS select CRYPTO select CRYPTO_MD5 select CRYPTO_DES select CRYPTO_CBC Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: J. Bruce Fields <bfields@fieldses.org> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-08-17 17:42:45 -04:00
Linus Torvalds	8c8946f509	Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notify * 'for-linus' of git://git.infradead.org/users/eparis/notify: (132 commits) fanotify: use both marks when possible fsnotify: pass both the vfsmount mark and inode mark fsnotify: walk the inode and vfsmount lists simultaneously fsnotify: rework ignored mark flushing fsnotify: remove global fsnotify groups lists fsnotify: remove group->mask fsnotify: remove the global masks fsnotify: cleanup should_send_event fanotify: use the mark in handler functions audit: use the mark in handler functions dnotify: use the mark in handler functions inotify: use the mark in handler functions fsnotify: send fsnotify_mark to groups in event handling functions fsnotify: Exchange list heads instead of moving elements fsnotify: srcu to protect read side of inode and vfsmount locks fsnotify: use an explicit flag to indicate fsnotify_destroy_mark has been called fsnotify: use _rcu functions for mark list traversal fsnotify: place marks on object in order of group memory address vfs/fsnotify: fsnotify_close can delay the final work in fput fsnotify: store struct file not struct path ... Fix up trivial delete/modify conflict in fs/notify/inotify/inotify.c.	2010-08-10 11:39:13 -07:00
Linus Torvalds	5f248c9c25	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (96 commits) no need for list_for_each_entry_safe()/resetting with superblock list Fix sget() race with failing mount vfs: don't hold s_umount over close_bdev_exclusive() call sysv: do not mark superblock dirty on remount sysv: do not mark superblock dirty on mount btrfs: remove junk sb_dirt change BFS: clean up the superblock usage AFFS: wait for sb synchronization when needed AFFS: clean up dirty flag usage cifs: truncate fallout mbcache: fix shrinker function return value mbcache: Remove unused features add f_flags to struct statfs(64) pass a struct path to vfs_statfs update VFS documentation for method changes. All filesystems that need invalidate_inode_buffers() are doing that explicitly convert remaining ->clear_inode() to ->evict_inode() Make ->drop_inode() just return whether inode needs to be dropped fs/inode.c:clear_inode() is gone fs/inode.c:evict() doesn't care about delete vs. non-delete paths now ... Fix up trivial conflicts in fs/nilfs2/super.c	2010-08-10 11:26:52 -07:00
Christoph Hellwig	ebabe9a900	pass a struct path to vfs_statfs We'll need the path to implement the flags field for statvfs support. We do have it available in all callers except: - ecryptfs_statfs. This one doesn't actually need vfs_statfs but just needs to call the lower filesystem statfs method. - sys_ustat. Add a non-exported statfs_by_dentry helper for it which won't be able to fill out the flags field later on. In addition rename the helpers for statfs vs fstatfs to do_*statfs instead of the misleading vfs prefix. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-08-09 16:48:42 -04:00
Linus Torvalds	0d9f9e122c	Merge branch 'for-2.6.36' of git://linux-nfs.org/~bfields/linux * 'for-2.6.36' of git://linux-nfs.org/~bfields/linux: (34 commits) nfsd4: fix file open accounting for RDWR opens nfsd: don't allow setting maxblksize after svc created nfsd: initialize nfsd versions before creating svc net: sunrpc: removed duplicated #include nfsd41: Fix a crash when a callback is retried nfsd: fix startup/shutdown order bug nfsd: minor nfsd read api cleanup gcc-4.6: nfsd: fix initialized but not read warnings nfsd4: share file descriptors between stateid's nfsd4: fix openmode checking on IO using lock stateid nfsd4: miscellaneous process_open2 cleanup nfsd4: don't pretend to support write delegations nfsd: bypass readahead cache when have struct file nfsd: minor nfsd_svc() cleanup nfsd: move more into nfsd_startup() nfsd: just keep single lockd reference for nfsd nfsd: clean up nfsd_create_serv error handling nfsd: fix error handling in __write_ports_addxprt nfsd: fix error handling when starting nfsd with rpcbind down nfsd4: fix v4 state shutdown error paths ...	2010-08-07 14:24:41 -07:00
J. Bruce Fields	998db52c03	nfsd4: fix file open accounting for RDWR opens Commit `f9d7562fdb` "nfsd4: share file descriptors between stateid's" didn't correctly account for O_RDWR opens. Symptoms include leaked files, resulting in failures to unmount and/or warnings about orphaned inodes on reboot. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-07 09:50:05 -04:00
J. Bruce Fields	7fa53cc872	nfsd: don't allow setting maxblksize after svc created It's harmless to set this after the server is created, but also ineffective, since the value is only used at the time of svc_create_pooled(). So fail the attempt, in keeping with the pattern set by write_versions, write_{lease,grace}time and write_recoverydir. (This could break userspace that tried to write to nfsd/max_block_size between setting up sockets and starting the server. However, such code wouldn't have worked anyway, and I don't know of any examples--rpc.nfsd in nfs-utils, probably the only user of the interface, doesn't do that.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-06 18:00:33 -04:00
J. Bruce Fields	e844a7b980	nfsd: initialize nfsd versions before creating svc Commit `59db4a0c10` "nfsd: move more into nfsd_startup()" inadvertently moved nfsd_versions after nfsd_create_svc(). On older distributions using an rpc.nfsd that does not explicitly set the list of nfsd versions, this results in svc_create_pooled() being called with an empty versions array. The resulting incomplete initialization leads to a NULL dereference in svc_process_common() the first time a client accesses the server. Move nfsd_reset_versions() back before the svc_create_pooled(); this time, put it closer to the svc_create_pooled() call, to make this mistake more difficult in the future. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-06 17:05:40 -04:00
Boaz Harrosh	c18c821fd4	nfsd41: Fix a crash when a callback is retried If a callback is retried at nfsd4_cb_recall_done() due to some error, the returned rpc reply crashes here: @@ -514,6 +514,7 @@ decode_cb_sequence(struct xdr_stream xdr, struct nfsd4_cb_sequence res, u32 dummy; __be32 *p; + BUG_ON(!res); if (res->cbs_minorversion == 0) return 0; [BUG_ON added for demonstration] This is because the nfsd4_cb_done_sequence() has NULLed out the task->tk_msg.rpc_resp pointer. Also eventually the rpc would use the new slot without making sure it is free by calling nfsd41_cb_setup_sequence(). This problem was introduced by a 4.1 protocol addition patch: [`0421b5c5`] nfsd41: Backchannel: Implement cb_recall over NFSv4.1 Which was overlooking the possibility of an RPC callback retries. For not-4.1 case redoing the _prepare is harmless. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-06 17:05:39 -04:00
J. Bruce Fields	774f8bbd9e	nfsd: fix startup/shutdown order bug We must create the server before we can call init_socks or check the number of threads. Symptoms were a NULL pointer dereference in nfsd_svc(). Problem identified by Jeff Layton. Also fix a minor cleanup-on-error case in nfsd_startup(). Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-06 17:05:30 -04:00
J. Bruce Fields	039a87ca53	nfsd: minor nfsd read api cleanup Christoph points out that the NFSv2/v3 callers know which case they want here, so we may as well just call the file=NULL case directly instead of making this conditional. Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-30 12:54:54 -04:00
Andi Kleen	6904996101	gcc-4.6: nfsd: fix initialized but not read warnings Fixes at least one real minor bug: the nfs4 recovery dir sysctl would not return its status properly. Also I finished Al's `1e41568d73` ("Take ima_path_check() in nfsd past dentry_open() in nfsd_open()") commit, it moved the IMA code, but left the old path initializer in there. The rest is just dead code removed I think, although I was not fully sure about the "is_borc" stuff. Some more review would be still good. Found by gcc 4.6's new warnings. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-29 19:32:17 -04:00
J. Bruce Fields	f9d7562fdb	nfsd4: share file descriptors between stateid's The vfs doesn't really allow us to "upgrade" a file descriptor from read-only to read-write, and our attempt to do so in nfs4_upgrade_open is ugly and incomplete. Move to a different scheme where we keep multiple opens, shared between open stateid's, in the nfs4_file struct. Each file will be opened at most 3 times (for read, write, and read-write), and those opens will be shared between all clients and openers. On upgrade we will do another open if necessary instead of attempting to upgrade an existing open. We keep count of the number of readers and writers so we know when to close the shared files. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-07-29 18:19:23 -04:00
J. Bruce Fields	0292191417	nfsd4: fix openmode checking on IO using lock stateid It is legal to perform a write using the lock stateid that was originally associated with a read lock, or with a file that was originally opened for read, but has since been upgraded. So, when checking the openmode, check the mode associated with the open stateid from which the lock was derived. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-29 16:37:12 -04:00
J. Bruce Fields	21fb4016bd	nfsd4: miscellaneous process_open2 cleanup Move more work into helper functions. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-29 16:34:29 -04:00
J. Bruce Fields	c3e4808086	nfsd4: don't pretend to support write delegations The delegation code mostly pretends to support either read or write delegations. However, correct support for write delegations would require, for example, breaking of delegations (and/or implementation of cb_getattr) on stat. Currently all that stops us from handing out delegations is a subtle reference-counting issue. Avoid confusion by adding an earlier check that explicitly refuses write delegations. For now, though, I'm not going so far as to rip out existing half-support for write delegations, in case we get around to using that soon. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-29 16:05:51 -04:00
Eric Paris	2a12a9d781	fsnotify: pass a file instead of an inode to open, read, and write fanotify, the upcoming notification system actually needs a struct path so it can do opens in the context of listeners, and it needs a file so it can get f_flags from the original process. Close was the only operation that already was passing a struct file to the notification hook. This patch passes a file for access, modify, and open as well as they are easily available to these hooks. Signed-off-by: Eric Paris <eparis@redhat.com>	2010-07-28 09:58:32 -04:00
J. Bruce Fields	fa0a21269f	nfsd: bypass readahead cache when have struct file The readahead cache compensates for the fact that the NFS server currently does an open and close on every IO operation in the NFSv2 and NFSv3 case. In the NFSv4 case we have long-lived struct files associated with client opens, so there's no need for this. In fact, concurrent IO's trying to modify the same file->f_ra may cause problems. So, don't bother with the readahead cache in that case. Note eventually we'll likely do this in the v2/v3 case as well by keeping a cache of struct files instead of struct file_ra_state's. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-27 18:15:54 -04:00
J. Bruce Fields	af4718f3f9	nfsd: minor nfsd_svc() cleanup More idiomatic to put the error case in the if clause. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:51:27 -04:00
J. Bruce Fields	59db4a0c10	nfsd: move more into nfsd_startup() This is just cleanup--it's harmless to call nfsd_racache_init, nfsd_init_socks, and nfsd_reset_versions more than once. But there's no point to it. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:51:26 -04:00
Jeff Layton	ac77efbe2b	nfsd: just keep single lockd reference for nfsd Right now, nfsd keeps a lockd reference for each socket that it has open. This is unnecessary and complicates the error handling on startup and shutdown. Change it to just do a lockd_up when starting the first nfsd thread and just do a single lockd_down when taking down the last nfsd thread. Because of the strange way the sv_count is handled this requires an extra flag to tell whether the nfsd_serv holds a reference for lockd or not. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:51:26 -04:00
Jeff Layton	628b368728	nfsd: clean up nfsd_create_serv error handling There doesn't seem to be any need to reset the nfssvc_boot time if the nfsd startup failed. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:51:25 -04:00
Jeff Layton	0cd14a061e	nfsd: fix error handling in __write_ports_addxprt __write_ports_addxprt calls nfsd_create_serv. That increases the refcount of nfsd_serv (which is tracked in sv_nrthreads). The service only decrements the thread count on error, not on success like __write_ports_addfd does, so using this interface leaves the nfsd thread count high. Fix this by having this function call svc_destroy() on error to release the reference (and possibly to tear down the service) and simply decrement the refcount without tearing down the service on success. This makes the sv_threads handling work basically the same in both __write_ports_addxprt and __write_ports_addfd. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:51:24 -04:00
Jeff Layton	78a8d7c8ca	nfsd: fix error handling when starting nfsd with rpcbind down The refcounting for nfsd is a little goofy. What happens is that we create the nfsd RPC service, attach sockets to it but don't actually start the threads until someone writes to the "threads" procfile. To do this, __write_ports_addfd will create the nfsd service and then will decrement the refcount when exiting but won't actually destroy the service. This is fine when there aren't errors, but when there are this can cause later attempts to start nfsd to fail. nfsd_serv will be set, and that causes __write_versions to return EBUSY. Fix this by calling svc_destroy on nfsd_serv when this function is going to return error. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:51:23 -04:00
Jeff Layton	4ad9a344be	nfsd4: fix v4 state shutdown error paths If someone tries to shut down the laundry_wq while it isn't up it'll cause an oops. This can happen because write_ports can create a nfsd_svc before we really start the nfs server, and we may fail before the server is ever started. Also make sure state is shutdown on error paths in nfsd_svc(). Use a common global nfsd_up flag instead of nfs4_init, and create common helper functions for nfsd start/shutdown, as there will be other work that we want done only when the number of nfsd threads transitions between zero and nonzero. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:51:22 -04:00
J. Bruce Fields	55b13354d7	nfsd: remove unused assignment from nfsd_link Trivial cleanup, since "dest" is never used. Reported-by: Anshul Madan <Anshul.Madan@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:50:39 -04:00
Chuck Lever	43a9aa64a2	NFSD: Fill in WCC data for REMOVE, RMDIR, MKNOD, and MKDIR Some well-known NFSv3 clients drop their directory entry caches when they receive replies with no WCC data. Without this data, they employ extra READ, LOOKUP, and GETATTR requests to ensure their directory entry caches are up to date, causing performance to suffer needlessly. In order to return WCC data, our server has to have both the pre-op and the post-op attribute data on hand when a reply is XDR encoded. The pre-op data is filled in when the incoming fh is locked, and the post-op data is filled in when the fh is unlocked. Unfortunately, for REMOVE, RMDIR, MKNOD, and MKDIR, the directory fh is not unlocked until well after the reply has been XDR encoded. This means that encode_wcc_data() does not have wcc_data for the parent directory, so none is returned to the client after these operations complete. By unlocking the parent directory fh immediately after the internal operations for each NFS procedure is complete, the post-op data is filled in before XDR encoding starts, so it can be returned to the client properly. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-07-07 17:12:32 -04:00
J. Bruce Fields	6a85d6c769	nfsd4: comment nitpick Reported-by: "Madan, Anshul" <Anshul.Madan@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-07-06 12:40:22 -04:00
J. Bruce Fields	cba9ba4b90	nfsd4: fix delegation recall race use-after-free When the rarely-used callback-connection-changing setclientid occurs simultaneously with a delegation recall, we rerun the recall by requeueing it on a workqueue. But we also need to take a reference on the delegation in that case, since the delegation held by the rpc itself will be released by the rpc_release callback. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-24 12:24:55 -04:00
J. Bruce Fields	ac94bf5825	nfsd4: fix deleg leak on callback error Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-24 12:24:53 -04:00
J. Bruce Fields	ec8acac84a	nfsd4: remove some debugging code This is overkill. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-22 22:29:03 -04:00
Benny Halevy	9303bbd3de	nfsd: nfs4callback encode_stateid helper function To be used also for the pnfs cb_layoutrecall callback Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd4: fix cb_recall encoding] "nfsd: nfs4callback encode_stateid helper function" forgot to reserve more space after return from the new helper. Reported-by: Michael Groshans <groshans@citi.umich.edu> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-22 17:19:51 -04:00
J. Bruce Fields	4731030d58	nfsd4: translate memory errors to delay, not serverfault If the server is out of memory it is better for clients to back off and retry than to just error out. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-22 17:19:36 -04:00
J. Bruce Fields	76407f76e0	nfsd4: fix session reference count leak Note the session has to be put() here regardless of what happens to the client. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-22 17:19:28 -04:00
Linus Torvalds	b95a568093	Merge branch 'for-2.6.35' of git://linux-nfs.org/~bfields/linux * 'for-2.6.35' of git://linux-nfs.org/~bfields/linux: nfsd4: shut down callback queue outside state lock nfsd: nfsd_setattr needs to call commit_metadata	2010-06-09 12:43:04 -07:00
J. Bruce Fields	44b56603c4	Merge branch 'for-2.6.34-incoming' into for-2.6.35-incoming	2010-06-08 20:05:18 -04:00
J. Bruce Fields	c3935e3049	nfsd4: shut down callback queue outside state lock This reportedly causes a lockdep warning on nfsd shutdown. That looks like a false positive to me, but there's no reason why this needs the state lock anyway. Reported-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-08 19:33:52 -04:00
Christoph Hellwig	b160fdabe9	nfsd: nfsd_setattr needs to call commit_metadata The conversion of write_inode_now calls to commit_metadata in commit `f501912a35` missed out the call in nfsd_setattr. But without this conversion we can't guarantee that a SETATTR request has actually been committed to disk with XFS, which causes a regression from 2.6.32 (only for NFSv2, but anyway). Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-01 19:17:50 -04:00
J. Bruce Fields	68a4b48ce6	nfsd4: don't bother storing callback reply tag We don't use this, and probably never will. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-31 12:43:59 -04:00
J. Bruce Fields	24a0111e40	nfsd4: fix use of op_share_access NFSv4.1 adds additional flags to the share_access argument of the open call. These flags need to be masked out in some of the existing code, but current code does that inconsistently. Tested-by: Michael Groshans <groshans@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-31 12:43:55 -04:00
J. Bruce Fields	172c85dd57	nfsd4: treat more recall errors as failures If a recall fails for some unexpected reason, instead of ignoring it and treating it like a success, it's safer to treat it as a failure, preventing further delegation grants and returning CB_PATH_DOWN. Also put switches in a (to me) more logical order, with normal case first. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-31 12:43:53 -04:00
J. Bruce Fields	378b7d37f9	nfsd4: remove extra put() on callback errors Since rpc_call_async() guarantees that the release method will be called even on failure, this put is wrong. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-31 12:43:51 -04:00
Alexey Dobriyan	4be929be34	kernel-wide: replace USHORT_MAX, SHORT_MAX and SHORT_MIN with USHRT_MAX, SHRT_MAX and SHRT_MIN - C99 knows about USHRT_MAX/SHRT_MAX/SHRT_MIN, not USHORT_MAX/SHORT_MAX/SHORT_MIN. - Make SHRT_MIN of type s16, not int, for consistency. [akpm@linux-foundation.org: fix drivers/dma/timb_dma.c] [akpm@linux-foundation.org: fix security/keys/keyring.c] Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-25 08:07:02 -07:00
Christoph Hellwig	8018ab0574	sanitize vfs_fsync calling conventions Now that the last user passing a NULL file pointer is gone we can remove the redundant dentry argument and associated hacks inside vfs_fsync_range. The next step will be removing the dentry argument from ->fsync, but given the luck with the last round of method prototype changes I'd rather defer this until after the main merge window. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-05-21 18:31:21 -04:00
Christoph Hellwig	e970a573ce	nfsd: open a file descriptor for fsync in nfs4 recovery Instead of just looking up a path use do_filp_open to get us a file structure for the nfs4 recovery directory. This allows us to get rid of the last non-standard vfs_fsync caller with a NULL file pointer. [AV: should be using fput(), not filp_close()] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-05-21 18:31:21 -04:00
J. Bruce Fields	e4e83ea47b	Revert "nfsd4: distinguish expired from stale stateids" This reverts commit `78155ed75f`. We're depending here on the boot time that we use to generate the stateid being monotonic, but get_seconds() is not necessarily. We still depend at least on boot_time being different every time, but that is a safer bet. We have a few reports of errors that might be explained by this problem, though we haven't been able to confirm any of them. But the minor gain of distinguishing expired from stale errors seems not worth the risk. Conflicts: fs/nfsd/nfs4state.c Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-18 19:03:50 -04:00
Pavel Emelyanov	47cee541a4	nfsd: safer initialization order in find_file() The alloc_init_file() first adds a file to the hash and then initializes its fi_inode, fi_id and fi_had_conflict. The uninitialized fi_inode could thus be erroneously checked by the find_file(), so move the hash insertion lower. The client_mutex should prevent this race in practice; however, we eventually hope to make less use of the client_mutex, so the ordering here is an accident waiting to happen. I didn't find whether the same can be true for two other fields, but the common sense tells me it's better to initialize an object before putting it into a global hash table :) Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-18 12:05:20 -04:00
J. Bruce Fields	b7299f4439	nfs4: minor callback code simplification, comment Note the position in the version array doesn't have to match the actual rpc version number--to me it seems clearer to maintain the distinction. Also document choice of rpc callback version number, as discussed in e.g. http://www.ietf.org/mail-archive/web/nfsv4/current/msg07985.html and followups. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-18 11:51:38 -04:00
Pavel Emelyanov	15ddb4aec5	NFSD: don't report compiled-out versions as present The /proc/fs/nfsd/versions file calls nfsd_vers() to check whether the particular nfsd version is present/available. The problem is that once I turn off e.g. NFSD-V4 this call returns -1 which is true from the callers POV which is wrong. The proposal is to report false in that case. The bug has existed since `6658d3a7bb` "[PATCH] knfsd: remove nfsd_versbits as intermediate storage for desired versions". Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: stable@kernel.org Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-14 18:46:14 -04:00
J. Bruce Fields	4dc6ec00f6	nfsd4: implement reclaim_complete This is a mandatory operation. Also, here (not in open) is where we should be committing the reboot recovery information. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-13 12:03:11 -04:00
Benny Halevy	ab707e1565	nfsd4: nfsd4_destroy_session must set callback client under the state lock nfsd4_set_callback_client must be called under the state lock to atomically set or unset the callback client and shutting down the previous one. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-13 11:59:11 -04:00
Benny Halevy	d76829889a	nfsd4: keep a reference count on client while in use Get a refcount on the client on SEQUENCE, Release the refcount and renew the client when all respective compounds completed. Do not expire the client by the laundromat while in use. If the client was expired via another path, free it when the compounds complete and the refcount reaches 0. Note that unhash_client_locked must call list_del_init on cl_lru as it may be called twice for the same client (once from nfs4_laundromat and then from expire_client) Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-13 11:58:54 -04:00
Benny Halevy	07cd4909a6	nfsd4: mark_client_expired Mark the client as expired under the client_lock so it won't be renewed when an nfsv4.1 session is done, after it was explicitly expired during processing of the compound. Do not renew a client marked as expired (in particular, it is not on the lru list anymore) Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-13 11:47:22 -04:00
Benny Halevy	46583e2597	nfsd4: introduce nfs4_client.cl_refcount Currently just initialize the cl_refcount to 1 and decrement in expire_client(), conditionally freeing the client when the refcount reaches 0. To be used later by nfsv4.1 compounds to keep the client from timing out while in use. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-13 11:47:03 -04:00
Benny Halevy	84d38ac9ab	nfsd4: refactor expire_client Separate out unhashing of the client and session. To be used later by the laundromat. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-11 21:02:02 -04:00
Benny Halevy	36acb66bda	nfsd4: extend the client_lock to cover cl_lru To be used later on to hold a reference count on the client while in use by a nfsv4.1 compound. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-11 21:02:02 -04:00
Benny Halevy	328efbab0f	nfsd4: use list_move in move_to_confirmed rather than list_del_init, list_add Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-11 21:02:01 -04:00
Benny Halevy	be1fdf6c43	nfsd4: fold release_session into expire_client and grab the client lock once for all the client's sessions. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-11 21:02:01 -04:00
Benny Halevy	9089f1b478	nfsd4: rename sessionid_lock to client_lock In preparation to share the lock's scope to both client and session hash tables. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-11 21:02:01 -04:00
J. Bruce Fields	5d4cec2f2f	nfsd4: fix bare destroy_session null dereference It's legal to send a DESTROY_SESSION outside any session (as the only operation in a compound), in which case cstate->session will be NULL; check for that case. While we're at it, move these checks into a separate helper function. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-07 19:08:47 -04:00
J. Bruce Fields	5306293c9c	Merge commit 'v2.6.34-rc6' Conflicts: fs/nfsd/nfs4callback.c	2010-05-04 11:29:05 -04:00
Benny Halevy	dbd65a7e44	nfsd4: use local variable in nfs4svc_encode_compoundres 'cs' is already computed, re-use it. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-04 10:10:36 -04:00
J. Bruce Fields	26c0c75e69	nfsd4: fix unlikely race in session replay case In the replay case, the renew_client(session->se_client); happens after we've dropped the sessionid_lock, and without holding a reference on the session; so there's nothing preventing the session being freed before we get here. Thanks to Benny Halevy for catching a bug in an earlier version of this patch. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Acked-by: Benny Halevy <bhalevy@panasas.com>	2010-05-03 08:32:31 -04:00
Neil Brown	2bc3c1179c	nfsd4: bug in read_buf When read_buf is called to move over to the next page in the pagelist of an NFSv4 request, it sets argp->end to essentially a random number, certainly not an address within the page which argp->p now points to. So subsequent calls to READ_BUF will think there is much more than a page of spare space (the cast to u32 ensures an unsigned comparison) so we can expect to fall off the end of the second page. We never encountered this in testing because typically the only operations which use more than two pages are write-like operations, which have their own decoding logic. Something like a getattr after a write may cross a page boundary, but it would be very unusual for it to cross another boundary after that. Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-26 15:39:08 -04:00
Dan Carpenter	d03859a4ac	nfsd: potential ERR_PTR dereference on exp_export() error paths. We "goto finish" from several places where "exp" is an ERR_PTR. Also I changed the check for "fsid_key" so that it was consistent with the check I added. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-22 12:03:02 -04:00
J. Bruce Fields	5771635592	nfsd4: complete enforcement of 4.1 op ordering Enforce the rules about compound op ordering. Motivated by implementing RECLAIM_COMPLETE, for which the client is implicit in the current session, so it is important to ensure a successful SEQUENCE precedes the RECLAIM_COMPLETE. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-22 11:35:14 -04:00
J. Bruce Fields	4b21d0defc	nfsd4: allow 4.0 clients to change callback path The rfc allows a client to change the callback parameters, but we didn't previously implement it. Teach the callbacks to rerun themselves (by placing themselves on a workqueue) when they recognize that their rpc task has been killed and that the callback connection has changed. Then we can change the callback connection by setting up a new rpc client, modifying the nfs4 client to point at it, waiting for any work in progress to complete, and then shutting down the old client. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-22 11:34:02 -04:00
J. Bruce Fields	2bf23875f5	nfsd4: rearrange cb data structures Mainly I just want to separate the arguments used for setting up the tcp client from the rest. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-22 11:34:02 -04:00
J. Bruce Fields	b12a05cbdf	nfsd4: cl_count is unused Now that the shutdown sequence guarantees callbacks are shut down before the client is destroyed, we no longer have a use for cl_count. We'll probably reinstate a reference count on the client some day, but it will be held by users other than callbacks. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-22 11:34:02 -04:00
J. Bruce Fields	b5a1a81e5c	nfsd4: don't sleep in lease-break callback The NFSv4 server's fl_break callback can sleep (dropping the BKL), in order to allocate a new rpc task to send a recall to the client. As far as I can tell this doesn't cause any races in the current code, but the analysis is difficult. Also, the sleep here may complicate the move away from the BKL. So, just schedule some work to do the job for us instead. The work will later also prove useful for restarting a call after the callback information is changed. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-22 11:34:01 -04:00
J. Bruce Fields	3c4ab2aaa9	nfsd4: indentation cleanup Looks like a put-and-paste mistake. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-19 15:12:51 -04:00
J. Bruce Fields	408b79bcc3	nfsd4: consistent session flag setting We should clear these flags on any new create_session, not just on the first one. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-16 21:47:37 -04:00
J. Bruce Fields	9045b4b9f7	nfsd4: remove probe task's reference on client Any null probe rpc will be synchronously destroyed by the rpc_shutdown_client() in expire_client(), so the rpc task cannot outlast the nfs4 client. Therefore there's no need for that task to hold a reference on the client. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-02 17:04:32 -04:00
J. Bruce Fields	3df796dbe9	nfsd4: remove dprintk I haven't found this useful. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-02 17:04:31 -04:00
J. Bruce Fields	147efd0dd7	nfsd4: shutdown callbacks on expiry Once we've expired the client, there's no further purpose to the callbacks; go ahead and shut down the callback client rather than waiting for the last reference to go. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-02 16:36:30 -04:00
J. Bruce Fields	227f98d98d	nfsd4: preallocate nfs4_rpc_args Instead of allocating this small structure, just include it in the delegation. The nfsd4_callback structure isn't really necessary yet, but we plan to add to it all the information necessary to perform a callback. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-02 16:28:11 -04:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. 
This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Jeff Layton	91885258e8	nfsd: don't break lease while servicing a COMMIT This is the second attempt to fix the problem whereby a COMMIT call causes a lease break and triggers a possible deadlock. The problem is that nfsd attempts to break a lease on a COMMIT call. This triggers a delegation recall if the lease is held for a delegation. If the client is the one holding the delegation and it's the same one on which it's issuing the COMMIT, then it can't return that delegation until the COMMIT is complete. But, nfsd won't complete the COMMIT until the delegation is returned. The client and server are essentially deadlocked until the state is marked bad (due to the client not responding on the callback channel). The first patch attempted to deal with this by eliminating the open of the file altogether and simply had nfsd_commit pass a NULL file pointer to the vfs_fsync_range. That would conflict with some work in progress by Christoph Hellwig to clean up the fsync interface, so this patch takes a different approach. This declares a new NFSD_MAY_NOT_BREAK_LEASE access flag that indicates to nfsd_open that it should not break any leases when opening the file, and has nfsd_commit set that flag on the nfsd_open call. For now, this patch leaves nfsd_commit opening the file with write access since I'm not clear on what sort of access would be more appropriate. Signed-off-by: Jeff Layton <jlayton@redhat.com> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-22 15:37:53 -04:00
NeilBrown	61f8603d93	nfsd: factor out hash functions for export caches. Both the _lookup and the _update functions for these two caches independently calculate the hash of the key. So factor out that code for improved reuse. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-16 18:05:11 -04:00
J. Bruce Fields	e739cf1da4	Merge commit 'v2.6.34-rc1' into for-2.6.35-incoming	2010-03-09 17:22:08 -05:00
Jiri Kosina	318ae2edc3	Merge branch 'for-next' into for-linus Conflicts: Documentation/filesystems/proc.txt arch/arm/mach-u300/include/mach/debug-macro.S drivers/net/qlge/qlge_ethtool.c drivers/net/qlge/qlge_main.c drivers/net/typhoon.c	2010-03-08 16:55:37 +01:00
J. Bruce Fields	e7b184f199	nfsd4: document lease/grace-period limits The current documentation here is out of date, and not quite right. (Future work: some user documentation would be useful.) Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-06 15:02:10 -05:00
J. Bruce Fields	efc4bb4fdd	nfsd4: allow setting grace period time Allow explicit configuration of the grace period time as well as the lease period time. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-06 15:02:08 -05:00
J. Bruce Fields	f013574014	nfsd4: reshuffle lease-setting code to allow reuse We'll soon allow setting the grace period, so we'll want to share this code. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-06 15:02:03 -05:00
J. Bruce Fields	f958a1320f	nfsd4: remove unnecessary lease-setting function This is another layer of indirection that doesn't really buy us anything. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-06 15:02:03 -05:00
J. Bruce Fields	e46b498c84	nfsd4: simplify lease/grace interaction The original code here assumed we'd allow the user to change the lease any time, but only allow the change to take effect on restart. Since then we modified the code to allow setting the lease only when the server is down. Update the rest of the code to reflect that fact, clarify variable names, and add documentation. Also, the code insisted that the grace period always be the longer of the old and new lease periods, but that's overly conservative--as long as it lasts at least the old lease period, old clients should still know to recover in time. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-06 15:02:02 -05:00
J. Bruce Fields	cf07d2ea43	nfsd4: simplify references to nfsd4 lease time Instead of accessing the lease time directly, some users call nfs4_lease_time(), and some a macro, NFSD_LEASE_TIME, defined as nfs4_lease_time(). Neither layer of indirection serves any purpose. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-06 15:02:01 -05:00
Linus Torvalds	05c5cb31ec	Merge branch 'for-2.6.34' of git://linux-nfs.org/~bfields/linux * 'for-2.6.34' of git://linux-nfs.org/~bfields/linux: (22 commits) nfsd4: fix minor memory leak svcrpc: treat uid's as unsigned nfsd: ensure sockets are closed on error Revert "sunrpc: move the close processing after do recvfrom method" Revert "sunrpc: fix peername failed on closed listener" sunrpc: remove unnecessary svc_xprt_put NFSD: NFSv4 callback client should use RPC_TASK_SOFTCONN xfs_export_operations.commit_metadata commit_metadata export operation replacing nfsd_sync_dir lockd: don't clear sm_monitored on nsm_reboot_lookup lockd: release reference to nsm_handle in nlm_host_rebooted nfsd: Use vfs_fsync_range() in nfsd_commit NFSD: Create PF_INET6 listener in write_ports SUNRPC: NFS kernel APIs shouldn't return ENOENT for "transport not found" SUNRPC: Bury "#ifdef IPV6" in svc_create_xprt() NFSD: Support AF_INET6 in svc_addsock() function SUNRPC: Use rpc_pton() in ip_map_parse() nfsd: 4.1 has an rfc number nfsd41: Create the recovery entry for the NFSv4.1 client nfsd: use vfs_fsync for non-directories ...	2010-03-06 11:31:38 -08:00
Wu Fengguang	42e4960868	vfs: take f_lock on modifying f_mode after open time We'll introduce FMODE_RANDOM which will be runtime modified. So protect all runtime modification to f_mode with f_lock to avoid races. Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: <stable@kernel.org> [2.6.33.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-03-06 11:26:25 -08:00
Linus Torvalds	e213e26ab3	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (33 commits) quota: stop using QUOTA_OK / NO_QUOTA dquot: cleanup dquot initialize routine dquot: move dquot initialization responsibility into the filesystem dquot: cleanup dquot drop routine dquot: move dquot drop responsibility into the filesystem dquot: cleanup dquot transfer routine dquot: move dquot transfer responsibility into the filesystem dquot: cleanup inode allocation / freeing routines dquot: cleanup space allocation / freeing routines ext3: add writepage sanity checks ext3: Truncate allocated blocks if direct IO write fails to update i_size quota: Properly invalidate caches even for filesystems with blocksize < pagesize quota: generalize quota transfer interface quota: sb_quota state flags cleanup jbd: Delay discarding buffers in journal_unmap_buffer ext3: quota_write cross block boundary behaviour quota: drop permission checks from xfs_fs_set_xstate/xfs_fs_set_xquota quota: split out compat_sys_quotactl support from quota.c quota: split out netlink notification support from quota.c quota: remove invalid optimization from quota_sync_all ... Fixed trivial conflicts in fs/namei.c and fs/ufs/inode.c	2010-03-05 13:20:53 -08:00
Christoph Hellwig	907f4554e2	dquot: move dquot initialization responsibility into the filesystem Currently various places in the VFS call vfs_dq_init directly. This means we tie the quota code into the VFS. Get rid of that and make the filesystem responsible for the initialization. For most metadata operations this is a straight forward move into the methods, but for truncate and open it's a bit more complicated. For truncate we currently only call vfs_dq_init for the sys_truncate case because open already takes care of it for ftruncate and open(O_TRUNC) - the new code causes an additional vfs_dq_init for those which is harmless. For open the initialization is moved from do_filp_open into the open method, which means it happens slightly earlier now, and only for regular files. The latter is fine because we don't need to initialize it for operations on special files, and we already do it as part of the namespace operations for directories. Add a dquot_file_open helper that filesystems that support generic quotas can use to fill in ->open. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>	2010-03-05 00:20:30 +01:00
J. Bruce Fields	4ea41e2de5	Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs into for-2.6.34-incoming Resolve merge conflict in fs/xfs/linux-2.6/xfs_export.c.	2010-03-04 12:04:51 -05:00
J. Bruce Fields	8d75da8afd	nfsd4: fix minor memory leak There's no need to allocate this cred more than once. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-03 16:13:29 -05:00
Al Viro	462d60577a	fix NFS4 handling of mountpoint stat RFC says we need to follow the chain of mounts if there's more than one stacked on that point. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-03-03 14:07:57 -05:00
Al Viro	8737c9305b	Switch may_open() and break_lease() to passing O_... ... instead of mixing FMODE_ and O_ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-03-03 13:00:21 -05:00
Chuck Lever	58255a4e3c	NFSD: NFSv4 callback client should use RPC_TASK_SOFTCONN The server's callback client should stop trying to connect to the client's callback server as soon as it gets ECONNREFUSED. The NFS server's callback client does not call rpc_ping(), but appears to have its own "ping" procedure, so it wasn't covered by commit `caabea8a`. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-02-24 17:50:28 -08:00
Ben Myers	f501912a35	commit_metadata export operation replacing nfsd_sync_dir - Add commit_metadata export_operation to allow the underlying filesystem to decide how to commit an inode most efficiently. - Usage of nfsd_sync_dir and write_inode_now has been replaced with the commit_metadata function that takes a svc_fh. - The commit_metadata function calls the commit_metadata export_op if it's there, or else falls back to sync_inode instead of fsync and write_inode_now because only metadata need be synced here. - nfsd4_sync_rec_dir now uses vfs_fsync so that commit_metadata can be static Signed-off-by: Ben Myers <bpm@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-02-20 13:13:44 -08:00
Chuck Ebbert	aeaa5ccd64	vfs: don't call ima_file_check() unconditionally in nfsd_open() commit `1e41568d73` ("Take ima_path_check() in nfsd past dentry_open() in nfsd_open()") moved this code back to its original location but missed the "else". Signed-off-by: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-02-20 00:47:31 -05:00
Daniel Mack	3ad2f3fbb9	tree-wide: Assorted spelling fixes In particular, several occurrences of funny versions of 'success', 'unknown', 'therefore', 'acknowledge', 'argument', 'achieve', 'address', 'beginning', 'desirable', 'separate' and 'necessary' are fixed. Signed-off-by: Daniel Mack <daniel@caiaq.de> Cc: Joe Perches <joe@perches.com> Cc: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-02-09 11:13:56 +01:00
Linus Torvalds	deb0c98c7f	Merge branch 'for-2.6.33' of git://linux-nfs.org/~bfields/linux * 'for-2.6.33' of git://linux-nfs.org/~bfields/linux: Revert "nfsd4: fix error return when pseudoroot missing"	2010-02-08 17:08:01 -08:00
J. Bruce Fields	260c64d235	Revert "nfsd4: fix error return when pseudoroot missing" Commit `f39bde24b2` fixed the error return from PUTROOTFH in the case where there is no pseudofilesystem. This is really a case we shouldn't hit on a correctly configured server: in the absence of a root filehandle, there's no point accepting version 4 NFS rpc calls at all. But the shared responsibility between kernel and userspace here means the kernel on its own can't eliminate the possibility of this happening. And we have indeed gotten this wrong in distros, so new client-side mount code that attempts to negotiate v4 by default first has to work around this case. Therefore when commit `f39bde24b2` arrived at roughly the same time as the new v4-default mount code, which explicitly checked only for the previous error, the result was previously fine mounts suddenly failing. We'll fix both sides for now: revert the error change, and make the client-side mount workaround more robust. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-02-08 15:25:23 -05:00
Mimi Zohar	9bbb6cad01	ima: rename ima_path_check to ima_file_check ima_path_check actually deals with files! call it ima_file_check instead. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-02-07 03:06:22 -05:00
Mimi Zohar	8eb988c70e	fix ima breakage The "Untangling ima mess, part 2 with counters" patch messed up the counters. Based on conversations with Al Viro, this patch streamlines ima_path_check() by removing the counter maintenance. The counters are now updated independently, from measuring the file, in __dentry_open() and alloc_file() by calling ima_counts_get(). ima_path_check() is called from nfsd and do_filp_open(). It also did not measure all files that should have been measured. Reason: ima_path_check() got bogus value passed as mask. [AV: mea culpa] [AV: add missing nfsd bits] Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-02-07 03:06:22 -05:00
Al Viro	1e41568d73	Take ima_path_check() in nfsd past dentry_open() in nfsd_open() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-02-07 03:06:22 -05:00
Trond Myklebust	aa696a6f34	nfsd: Use vfs_fsync_range() in nfsd_commit The NFS COMMIT operation allows the client to specify the exact byte range that it wishes to sync to disk in order to optimise server performance. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-29 18:53:11 -05:00
Chuck Lever	37498292aa	NFSD: Create PF_INET6 listener in write_ports Try to create a PF_INET6 listener for NFSD, if IPv6 is enabled in the kernel. Make sure nfsd_serv's reference count is decreased if __write_ports_addxprt() failed to create a listener. See __write_ports_addfd(). Our current plan is to rely on rpc.nfsd to create appropriate IPv6 listeners when server-side NFS/IPv6 support is desired. Legacy behavior, via the write_threads or write_svc kernel APIs, will remain the same -- only IPv4 listeners are created. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> [bfields@citi.umich.edu: Move error-handling code to end] Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-27 17:01:08 -05:00
Chuck Lever	6871790815	SUNRPC: NFS kernel APIs shouldn't return ENOENT for "transport not found" write_ports() converts svc_create_xprt()'s ENOENT error return to EPROTONOSUPPORT so that rpc.nfsd (in user space) can report an error message that makes sense. It turns out that several of the other kernel APIs rpc.nfsd use can also return ENOENT from svc_create_xprt(), by way of lockd_up(). On the client side, an NFSv2 or NFSv3 mount request can also return the result of lockd_up(). This error may also be returned during an NFSv4 mount request, since the NFSv4 callback service uses svc_create_xprt() to create the callback listener. An ENOENT error return results in a confusing error message from the mount command. Let's have svc_create_xprt() return EPROTONOSUPPORT instead of ENOENT. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-26 17:59:21 -05:00
Ricardo Labiaga	8b8aae4009	nfsd41: Create the recovery entry for the NFSv4.1 client Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-14 12:24:46 -05:00
Christoph Hellwig	6a68f89ee1	nfsd: use vfs_fsync for non-directories Instead of opencoding the fsync calling sequence use vfs_fsync. This also gets rid of the useless i_mutex over the data writeout. Consolidate the remaining special code for syncing directories and document its quirks. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-13 09:42:26 -05:00
Ricardo Labiaga	de3cab793c	nfsd4: Use FIRST_NFS4_OP in nfsd4_decode_compound() Since we're checking for LAST_NFS4_OP, use FIRST_NFS4_OP to be consistent. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-13 09:42:26 -05:00
Ricardo Labiaga	c551866e64	nfsd41: nfsd4_decode_compound() does not recognize all ops The server incorrectly assumes that the operations in the array start with value 0. The first operation (OP_ACCESS) has a value of 3, causing the check in nfsd4_decode_compound to be off. Instead of comparing that the operation number is less than the number of elements in the array, the server should verify that it is less than the maximum valid operation number defined by LAST_NFS4_OP. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-13 09:42:26 -05:00
Linus Torvalds	93939f4e5d	Merge branch 'for-2.6.33' of git://linux-nfs.org/~bfields/linux * 'for-2.6.33' of git://linux-nfs.org/~bfields/linux: sunrpc: fix peername failed on closed listener nfsd: make sure data is on disk before calling ->fsync nfsd: fix "insecure" export option	2010-01-06 18:10:15 -08:00
Christoph Hellwig	7211a4e859	nfsd: make sure data is on disk before calling ->fsync nfsd is not using vfs_fsync, so I missed it when changing the calling convention during the 2.6.32 window. This patch fixes it to not only start the data writeout, but also wait for it to complete before calling into ->fsync. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-06 17:37:26 -05:00
J. Bruce Fields	3d354cbc43	nfsd: fix "insecure" export option A typo in `12045a6ee9` "nfsd: let "insecure" flag vary by pseudoflavor" reversed the sense of the "insecure" flag. Reported-by: Michael Guntsche <mike@it-loops.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-20 20:19:51 -08:00
J. Bruce Fields	f69ac2f5a3	nfsd: fix "insecure" export option A typo in `12045a6ee9` "nfsd: let "insecure" flag vary by pseudoflavor" reversed the sense of the "insecure" flag. Reported-by: Michael Guntsche <mike@it-loops.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-20 10:22:58 -05:00
Linus Torvalds	bac5e54c29	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (38 commits) direct I/O fallback sync simplification ocfs: stop using do_sync_mapping_range cleanup blockdev_direct_IO locking make generic_acl slightly more generic sanitize xattr handler prototypes libfs: move EXPORT_SYMBOL for d_alloc_name vfs: force reval of target when following LAST_BIND symlinks (try #7) ima: limit imbalance msg Untangling ima mess, part 3: kill dead code in ima Untangling ima mess, part 2: deal with counters Untangling ima mess, part 1: alloc_file() O_TRUNC open shouldn't fail after file truncation ima: call ima_inode_free ima_inode_free IMA: clean up the IMA counts updating code ima: only insert at inode creation time ima: valid return code from ima_inode_alloc fs: move get_empty_filp() deffinition to internal.h Sanitize exec_permission_lite() Kill cached_lookup() and real_lookup() Kill path_lookup_open() ... Trivial conflicts in fs/direct-io.c	2009-12-16 12:04:02 -08:00
Al Viro	1429b3eca2	Untangling ima mess, part 3: kill dead code in ima Kill the 'update' argument of ima_path_check(), kill dead code in ima. Current rules: ima counters are bumped at the same time when the file switches from put_filp() fodder to fput() one. Which happens exactly in two places - alloc_file() and __dentry_open(). Nothing else needs to do that at all. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-12-16 12:16:47 -05:00
Al Viro	b65a9cfc2c	Untangling ima mess, part 2: deal with counters * do ima_get_count() in __dentry_open() * stop doing that in followups * move ima_path_check() to right after nameidata_to_filp() * don't bump counters on it Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-12-16 12:16:47 -05:00
J. Bruce Fields	7663dacd92	nfsd: remove pointless paths in file headers The new .h files have paths at the top that are now out of date. While we're here, just remove all of those from fs/nfsd; they never served any purpose. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-15 15:01:47 -05:00
J. Bruce Fields	1557aca790	nfsd: move most of nfsfh.h to fs/nfsd Most of this can be trivially moved to a private header as well. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-15 15:01:46 -05:00
J. Bruce Fields	774b147828	nfsd: make V4ROOT exports read-only I can't see any use for writeable V4ROOT exports. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-15 15:01:44 -05:00
Steve Dickson	03a816b46d	nfsd: restrict filehandles accepted in V4ROOT case On V4ROOT exports, only accept filehandles that are the root of some export. This allows mountd to allow or deny access to individual directories and symlinks on the pseudofilesystem. Note that the checks in readdir and lookup are not enough, since a malicious host with access to the network could guess filehandles that they weren't able to obtain through lookup or readdir. Signed-off-by: Steve Dickson <steved@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-15 14:07:24 -05:00
J. Bruce Fields	f2ca7153ca	nfsd: allow exports of symlinks We want to allow exports of symlinks, to allow mountd to communicate to the kernel which symlinks lead to exports, and hence which symlinks need to be visible on the pseudofilesystem. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-15 14:07:24 -05:00
J. Bruce Fields	3227fa41ab	nfsd: filter readdir results in V4ROOT case As with lookup, we treat every object as a mountpoint and pretend it doesn't exist if it isn't exported. The preexisting code here is confusing, but I haven't yet figured out how to make it clearer. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-15 14:07:24 -05:00
J. Bruce Fields	82ead7fe41	nfsd: filter lookup results in V4ROOT case We treat every object as a mountpoint and pretend it doesn't exist if it isn't exported. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-15 14:07:23 -05:00
J. Bruce Fields	3b6cee7bc4	nfsd4: don't continue "under" mounts in V4ROOT case If /A/mount/point/ has filesystem "B" mounted on top of it, and if "A" is exported, but not "B", then the nfs server has always returned to the client a filehandle for the mountpoint, instead of for the root of "B", allowing the client to see the subtree of "A" that would otherwise be hidden by B. Disable this behavior in the case of V4ROOT exports; we implement the path restrictions of V4ROOT exports by treating every directory as if it were a mountpoint, and allowing traversal only if the new directory is exported. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-15 14:07:23 -05:00
Steve Dickson	eb4c86c6a5	nfsd: introduce export flag for v4 pseudoroot NFSv4 differs from v2 and v3 in that it presents a single unified filesystem tree, whereas v2 and v3 exported multiple filesystems (whose roots could be found using a separate mount protocol). Our original NFSv4 server implementation asked the administrator to designate a single filesystem as the NFSv4 root, then to mount filesystems they wished to export underneath. (Often using bind mounts of already-existing filesystems.) This was conceptually simple, and allowed easy implementation, but created a serious obstacle to upgrading between v2/v3: since the paths to v4 filesystems were different, administrators would have to adjust all the paths in client-side mount commands when switching to v4. Various workarounds are possible. For example, the administrator could export "/" and designate it as the v4 root. However, the security risks of that approach are obvious, and in any case we shouldn't be requiring the administrator to take extra steps to fix this problem; instead, the server should present consistent paths across different versions by default. These patches take a modified version of that approach: we provide a new export option which exports only a subset of a filesystem. With this flag, it becomes safe for mountd to export "/" by default, with no need for additional configuration. We begin just by defining the new flag. Signed-off-by: Steve Dickson <steved@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-15 14:00:40 -05:00
J. Bruce Fields	12045a6ee9	nfsd: let "insecure" flag vary by pseudoflavor This was an oversight; it should be among the export flags that can be allowed to vary by pseudoflavor. This allows an administrator to (for example) allow auth_sys mounts only from low ports, but allow auth_krb5 mounts to use any port. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-14 19:08:58 -05:00
J. Bruce Fields	e8e8753f7a	nfsd: new interface to advertise export features Soon we will add the new V4ROOT flag, and allow the INSECURE flag to vary by pseudoflavor. It would be useful for nfs-utils (for example, for improved exportfs error reporting) to be able to know when this happens. Use this new interface for that purpose. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-14 18:51:29 -05:00
Boaz Harrosh	9a74af2133	nfsd: Move private headers to source directory Lots of include/linux/nfsd/* headers are only used by nfsd module. Move them to the source directory Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-14 18:12:12 -05:00
Boaz Harrosh	341eb18446	nfsd: Source files #include cleanups Now that the headers are fixed and carry their own weight, all fs/nfsd/ source files can include a minimal set of headers and still compile just fine. This patch should improve the compilation speed of the nfsd module. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-14 18:12:09 -05:00
J. Bruce Fields	57ecb34feb	nfsd4: fix share mode permissions NFSv4 opens may function as locks denying other NFSv4 users the rights to open a file. We're requiring a user to have write permissions before they can deny write. We're not requiring a user to have write permissions to deny read, which is if anything a more drastic denial. What was intended was to require write permissions for DENY_READ. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-14 18:06:54 -05:00
J. Bruce Fields	864f0f61f8	nfsd: simplify fh_verify access checks All nfsd security depends on the security checks in fh_verify, and especially on nfsd_setuser(). It therefore bothers me that the nfsd_setuser call may be made from three different places, depending on whether the filehandle has already been mapped to a dentry, and on whether subtreechecking is in force. Instead, make an unconditional call in fh_verify(), so it's trivial to verify that the call always occurs. That leaves us with a redundant nfsd_setuser() call in the subtreecheck case--it needs the correct user set earlier in order to check execute permissions on the path to this filehandle--but I'm willing to accept that minor inefficiency in the subtreecheck case in return for more straightforward permission checking. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-11-25 17:55:46 -05:00
J. Bruce Fields	9b8b317d58	Merge commit 'v2.6.32-rc8' into HEAD	2009-11-23 12:34:58 -05:00
Petr Vandrovec	479c2553af	Fix memory corruption caused by nfsd readdir+ Commit `8177e6d6df` ("nfsd: clean up readdirplus encoding") introduced single character typo in nfs3 readdir+ implementation. Unfortunately that typo has quite bad side effects: random memory corruption, followed (on my box) with immediate spontaneous box reboot. Using 'p1' instead of 'p' fixes my Linux box rebooting whenever VMware ESXi box tries to list contents of my home directory. Signed-off-by: Petr Vandrovec <petr@vandrovec.name> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-11-14 12:55:55 -08:00
J. Bruce Fields	0a3adadee4	nfsd: make fs/nfsd/vfs.h for common includes None of this stuff is used outside nfsd, so move it out of the common linux include directory. Actually, probably none of the stuff in include/linux/nfsd/nfsd.h really belongs there, so later we may remove that file entirely. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-11-13 13:23:02 -05:00
Benny Halevy	8c10cbdb4a	nfsd: use STATEID_FMT and STATEID_VAL for printing stateids Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-11-05 12:06:29 -05:00
Peter Staubach	1b7e0403c6	nfsd: register NFS_ACL with rpcbind Modify the NFS server to register the NFS_ACL services with the rpcbind daemon. This allows the client to ping for the existence of the NFS_ACL support via commands such as "rpcinfo -t <server> nfs_acl". This patch also modifies the NFS_ACL support so that responses to version 2 NULLPROC requests can be made. The changelog for the patch which turned off this functionality mentioned something about not registering the NFS_ACL as being part of some tradition. I can't find this tradition and the only other implementation which supports NFS_ACL does register them with the rpcbind daemon. Signed-off-by: Peter Staubach <staubach@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-11-04 13:46:37 -05:00
Frank Filz	aba24d7158	nfsd: Fix sort_pacl in fs/nfsd/nf4acl.c to actually sort groups We have been doing some extensive testing of Linux support for ACLs on NFS v4. We have noticed that the server rejects ACLs where the groups are out of order, for example, the following ACL is rejected: A::OWNER@:rwaxtTcCy A::user101@domain:rwaxtcy A::GROUP@:rwaxtcy A:g:group102@domain:rwaxtcy A:g:group101@domain:rwaxtcy A::EVERYONE@:rwaxtcy Examining the server code, I found that after converting an NFS v4 ACL to POSIX, sort_pacl is called to sort the user ACEs and group ACEs. Unfortunately, a minor bug causes the group sort to be skipped. Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-10-27 19:34:44 -04:00
J. Bruce Fields	efe0cb6d5a	nfsd4.1: common slot allocation size calculation We do the same calculation in a couple places; use a helper function, and add a little documentation, in the hopes of preventing bugs like that fixed in the last patch. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-10-27 19:34:43 -04:00
J. Bruce Fields	dd829c4564	nfsd4.1: fix session memory use calculation Unbalanced calculations on creation and destruction of sessions could cause our estimate of cache memory used to become negative, sometimes resulting in spurious SERVERFAULT returns to client CREATE_SESSION requests. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-10-27 19:34:43 -04:00
J. Bruce Fields	e343eb0d60	Merge commit 'v2.6.32-rc5' into for-2.6.33	2009-10-27 18:45:17 -04:00
Alexey Dobriyan	828c09509b	const: constify remaining file_operations [akpm@linux-foundation.org: fix KVM] Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-10-01 16:11:11 -07:00
Andy Adamson	ddc04fd4d5	nfsd41: use sv_max_mesg for forechannel max sizes ca_maxresponsesize and ca_maxrequestsize include the RPC header. sv_max_mesg is sv_max_payload plus a page for overhead and is used in svc_init_buffer to allocate server buffer space for both the request and reply. Note that this means we can service an RPC compound that requires ca_maxrequestsize (MAXWRITE) or ca_maxresponsesize (MAXREAD) but that we do not support an RPC compound that requires both ca_maxrequestsize and ca_maxresponsesize. Signed-off-by: Andy Adamson <andros@netapp.com> [bfields@citi.umich.edu: more documentation updates] Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-28 12:40:15 -04:00
J. Bruce Fields	f39bde24b2	nfsd4: fix error return when pseudoroot missing We really shouldn't hit this case at all, and forthcoming kernel and nfs-utils changes should eliminate this case; if it does happen, consider it a bug rather than reporting an error that doesn't really make sense for the operation (since there's no reason for a server to be accepting v4 traffic yet have no root filehandle). Also move some exp_pseudoroot code into a helper function while we're here. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-28 12:21:26 -04:00
J. Bruce Fields	289ede453e	nfsd: minor nfsd_lookup cleanup Break out some of nfsd_lookup_dentry into helper functions. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-28 12:07:53 -04:00
J. Bruce Fields	fed8381126	nfsd4: cross mountpoints when looking up parents `3c394ddaa7` "nfsd4: nfsv4 clients should cross mountpoints" forgot to handle lookups of parent directories. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-28 12:07:52 -04:00
Alexey Dobriyan	2bcd57ab61	headers: utsname.h redux * remove asm/atomic.h inclusion from linux/utsname.h -- not needed after kref conversion * remove linux/utsname.h inclusion from files which do not need it NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however due to some personality stuff it _is_ needed -- cowardly leave ELF-related headers and files alone. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-23 18:13:10 -07:00
James Morris	88e9d34c72	seq_file: constify seq_operations Make all seq_operations structs const, to help mitigate against revectoring user-triggerable function pointers. This is derived from the grsecurity patch, although generated from scratch because it's simpler than extracting the changes from there. Signed-off-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-23 07:39:29 -07:00
Linus Torvalds	a87e84b5cd	Merge branch 'for-2.6.32' of git://linux-nfs.org/~bfields/linux * 'for-2.6.32' of git://linux-nfs.org/~bfields/linux: (68 commits) nfsd4: nfsv4 clients should cross mountpoints nfsd: revise 4.1 status documentation sunrpc/cache: avoid variable over-loading in cache_defer_req sunrpc/cache: use list_del_init for the list_head entries in cache_deferred_req nfsd: return success for non-NFS4 nfs4_state_start nfsd41: Refactor create_client() nfsd41: modify nfsd4.1 backchannel to use new xprt class nfsd41: Backchannel: Implement cb_recall over NFSv4.1 nfsd41: Backchannel: cb_sequence callback nfsd41: Backchannel: Setup sequence information nfsd41: Backchannel: Server backchannel RPC wait queue nfsd41: Backchannel: Add sequence arguments to callback RPC arguments nfsd41: Backchannel: callback infrastructure nfsd4: use common rpc_cred for all callbacks nfsd4: allow nfs4 state startup to fail SUNRPC: Defer the auth_gss upcall when the RPC call is asynchronous nfsd4: fix null dereference creating nfsv4 callback client nfsd4: fix whitespace in NFSPROC4_CLNT_CB_NULL definition nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel sunrpc/cache: simplify cache_fresh_locked and cache_fresh_unlocked. ...	2009-09-22 07:54:33 -07:00
Alexey Dobriyan	7b021967c5	const: make lock_manager_operations const Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-22 07:17:25 -07:00
Steve Dickson	3c394ddaa7	nfsd4: nfsv4 clients should cross mountpoints Allow NFS v4 clients to seamlessly cross mount points without having to set either the 'crossmnt' or the 'nohide' export options. Signed-Off-By: Steve Dickson <steved@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-21 16:02:25 -04:00
Ricardo Labiaga	b09333c464	nfsd41: Refactor create_client() Move common initialization of 'struct nfs4_client' inside create_client(). Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> [nfsd41: Remember the auth flavor to use for callbacks] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-15 20:52:13 -04:00
Alexandros Batsakis	3ddc8bf5f3	nfsd41: modify nfsd4.1 backchannel to use new xprt class This patch enables the use of the nfsv4.1 backchannel. Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> [initialize rpc_create_args.bc_xprt too] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-15 20:52:13 -04:00
Ricardo Labiaga	0421b5c55a	nfsd41: Backchannel: Implement cb_recall over NFSv4.1 Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> [nfsd41: cb_recall callback] [Share v4.0 and v4.1 back channel xdr] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Share v4.0 and v4.1 back channel xdr] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: use nfsd4_cb_sequence for callback minorversion] [nfsd41: conditionally decode_sequence in nfs4_xdr_dec_cb_recall] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: Backchannel: Add sequence arguments to callback RPC arguments] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> [pulled-in definition of nfsd4_cb_done] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-15 20:52:12 -04:00
Benny Halevy	2af73580b7	nfsd41: Backchannel: cb_sequence callback Implement the cb_sequence callback conforming to draft-ietf-nfsv4-minorversion1 Note: highest slot id and target highest slot id do not have to be 0 as was previously implemented. They can be greater than what the nfs server sent if the client supports a larger slot table on the backchannel. At this point we just ignore that. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> [Rework the back channel xdr using the shared v4.0 and v4.1 framework.] Signed-off-by: Andy Adamson <andros@netapp.com> [fixed indentation] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: use nfsd4_cb_sequence for callback minorversion] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: fix verification of CB_SEQUENCE highest slot id] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: Backchannel: Remove old backchannel serialization] [nfsd41: Backchannel: First callback sequence ID should be 1] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: decode_cb_sequence does not need to actually decode ignored fields] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-15 20:49:56 -04:00
Ricardo Labiaga	2a1d1b5938	nfsd41: Backchannel: Setup sequence information Follows the model used by the NFS client. Setup the RPC prepare and done function pointers so that we can populate the sequence information if minorversion == 1. rpc_run_task() is then invoked directly just like existing NFS client operations do. nfsd4_cb_prepare() determines if the sequence information needs to be setup. If the slot is in use, it adds itself to the wait queue. nfsd4_cb_done() wakes anyone sleeping on the callback channel wait queue after our RPC reply has been received. It also sets the task message result pointer to NULL to clearly indicate we're done using it. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> [define and initialize cl_cb_seq_nr here] [pulled out unused definition of nfsd4_cb_done] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-15 20:49:56 -04:00
Ricardo Labiaga	199ff35e1c	nfsd41: Backchannel: Server backchannel RPC wait queue RPC callback requests will wait on this wait queue if the backchannel is out of slots. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-15 20:49:55 -04:00
Ricardo Labiaga	132f97715c	nfsd41: Backchannel: Add sequence arguments to callback RPC arguments Follow the model we use in the client. Make the sequence arguments part of the regular RPC arguments. None of the callbacks that are soon to be implemented expect results that need to be passed back to the caller, so we don't define a separate RPC results structure. For session validation, the cb_sequence decoding will use a pointer to the sequence arguments that are part of the RPC argument. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> [define struct nfsd4_cb_sequence here] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-15 20:49:55 -04:00
Andy Adamson	38524ab38f	nfsd41: Backchannel: callback infrastructure Keep the xprt used for create_session in cl_cb_xprt. Mark cl_callback.cb_minorversion = 1 and remember the client provided cl_callback.cb_prog rpc program number. Use it to probe the callback path. Use the client's network address to initialize as the callback's address as expected by the xprt creation routines. Define xdr sizes and code nfs4_cb_compound header to be able to send a null callback rpc. Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> [get callback minorversion from fore channel's] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: change bc_sock to bc_xprt] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pulled definition for cl_cb_xprt] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: set up backchannel's cb_addr] [moved rpc_create_args init to "nfsd: modify nfsd4.1 backchannel to use new xprt class"] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-15 20:49:55 -04:00
J. Bruce Fields	80fc015bdf	nfsd4: use common rpc_cred for all callbacks Callbacks are always made using the machine's identity, so we can use a single auth_generic credential shared among callbacks to all clients and let the rpc code take care of the rest. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-15 20:49:34 -04:00
J. Bruce Fields	29ab23cc5d	nfsd4: allow nfs4 state startup to fail The failure here is pretty unlikely, but we should handle it anyway. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-15 20:49:33 -04:00
J. Bruce Fields	886e3b7fe6	nfsd4: fix null dereference creating nfsv4 callback client On setting up the callback to the client, we attempt to use the same authentication flavor the client did. We find an rpc cred to use by calling rpcauth_lookup_credcache(), which assumes that the given authentication flavor has a credentials cache. However, this is not required to be true--in particular, auth_null does not use one. Instead, we should call the auth's lookup_cred() method. Without this, a client attempting to mount using nfsv4 and auth_null triggers a null dereference. Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-15 20:49:33 -04:00
Benny Halevy	4be36ca0ce	nfsd4: fix whitespace in NFSPROC4_CLNT_CB_NULL definition Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-13 15:57:39 -04:00
Trond Myklebust	ab3bbaa8b2	Merge branch 'nfs-for-2.6.32'	2009-09-11 14:59:37 -04:00
J. Bruce Fields	aed100fafb	nfsd: fix leak on error in nfsv3 readdir Note the !dchild->d_inode case can leak the filehandle. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-04 15:48:00 -04:00
J. Bruce Fields	8177e6d6df	nfsd: clean up readdirplus encoding Make the return from compose_entry_fh() zero or an error, even though the returned error isn't used, just to make the meaning of the return immediately obvious. Move some repeated code out of main function into helper. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-04 15:47:40 -04:00
J. Bruce Fields	1be10a88ca	nfsd4: filehandle leak or error exit from fh_compose() A number of callers (nfsd4_encode_fattr(), at least) don't bother to release the filehandle returned to fh_compose() if fh_compose() returns an error. So, modify fh_compose() to release the filehandle before returning an error. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-04 11:59:32 -04:00
Trond Myklebust	2671a4bf35	NFSd: Fix filehandle leak in exp_pseudoroot() and nfsd4_path() nfsd4_path() allocates a temporary filehandle and then fails to free it before the function exits, leaking reference counts to the dentry and export that it refers to. Also, nfsd4_lookupp() puts the result of exp_pseudoroot() in a temporary filehandle which it releases on success of exp_pseudoroot() but not on failure; fix exp_pseudoroot to ensure that on failure it releases the filehandle before returning. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-03 16:57:57 -04:00
J. Bruce Fields	bc6c53d5a1	nfsd: move fsid_type choice out of fh_compose More trivial cleanup. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-02 23:54:48 -04:00
J. Bruce Fields	8e498751f2	nfsd: move some of fh_compose into helper functions Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-02 23:53:51 -04:00
David Howells	e0e817392b	CRED: Add some configurable debugging [try #6 ] Add a config option (CONFIG_DEBUG_CREDENTIALS) to turn on some debug checking for credential management. The additional code keeps track of the number of pointers from task_structs to any given cred struct, and checks to see that this number never exceeds the usage count of the cred struct (which includes all references, not just those from task_structs). Furthermore, if SELinux is enabled, the code also checks that the security pointer in the cred struct is never seen to be invalid. This attempts to catch the bug whereby inode_has_perm() faults in an nfsd kernel thread on seeing cred->security be a NULL pointer (it appears that the credential struct has been previously released): http://www.kerneloops.org/oops.php?number=252883 Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2009-09-02 21:29:01 +10:00
Andy Adamson	557ce2646e	nfsd41: replace page based DRC with buffer based DRC Use NFSD_SLOT_CACHE_SIZE size buffers for sessions DRC instead of holding nfsd pages in cache. Connectathon testing has shown that 1024 bytes for encoded compound operation responses past the sequence operation is sufficient, 512 bytes is a little too small. Set NFSD_SLOT_CACHE_SIZE to 1024. Allocate memory for the session DRC in the CREATE_SESSION operation to guarantee that the memory resource is available for caching responses. Allocate each slot individually in preparation for slot table size negotiation. Remove struct nfsd4_cache_entry and helper functions for the old page-based DRC. The iov_len calculation in nfs4svc_encode_compoundres is now always correct. Replay is now done in nfsd4_sequence under the state lock, so the session ref count is only bumped on non-replay. Clean up the nfs4svc_encode_compoundres session logic. The nfsd4_compound_state statp pointer is also not used. Remove nfsd4_set_statp(). Move useful nfsd4_cache_entry fields into nfsd4_slot. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-01 22:24:06 -04:00
Andy Adamson	bdac86e215	nfsd41: replace nfserr_resource in pure nfs41 responses nfserr_resource is not a legal error for NFSv4.1. Replace it with nfserr_serverfault for EXCHANGE_ID and CREATE_SESSION processing. We will also need to map nfserr_resource to other errors in routines shared by NFSv4.0 and NFSv4.1 Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-01 22:24:05 -04:00
Andy Adamson	a8dfdaeb7a	nfsd41: use session maxreqs for sequence target and highest slotid This fixes a bug in the sequence operation reply. The sequence operation returns the highest slotid it will accept in the future in sr_highest_slotid, and the highest slotid it prefers the client to use. Since we do not re-negotiate the session slot table yet, these should both always be set to the session ca_maxrequests. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-01 22:24:05 -04:00
Andy Adamson	a649637c73	nfsd41: bound forechannel drc size by memory usage By using the requested ca_maxresponsesize_cached * ca_maxresponses to bound a forechannel drc request size, clients can tailor a session to usage. For example, an I/O session (READ/WRITE only) can have a much smaller ca_maxresponsesize_cached (for only WRITE compound responses) and a lot larger ca_maxresponses to service a large in-flight data window. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-01 22:24:05 -04:00
Trond Myklebust	a06b1261bd	NFSD: Fix a bug in the NFSv4 'supported attrs' mandatory attribute The fact that the filesystem doesn't currently list any alternate locations does _not_ imply that the fs_locations attribute should be marked as "unsupported". Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-01 20:00:17 -04:00
Andy Adamson	468de9e54a	nfsd41: expand solo sequence check Compounds consisting of only a sequence operation don't need any additional caching beyond the sequence information we store in the slot entry. Fix nfsd4_is_solo_sequence to identify this case correctly. The additional check for a failed sequence in nfsd4_store_cache_entry() is redundant, since the nfsd4_is_solo_sequence call lower down catches this case. The final ce_cachethis set in nfsd4_sequence is also redundant. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-28 12:20:15 -04:00
Frank Filz	d8d0b85b11	nfsd4: remove ACE4_IDENTIFIER_GROUP flag from GROUP@ entry RFC 3530 says "ACE4_IDENTIFIER_GROUP flag MUST be ignored on entries with these special identifiers. When encoding entries with these special identifiers, the ACE4_IDENTIFIER_GROUP flag SHOULD be set to zero." It really shouldn't matter either way, but the point is that this flag is used to distinguish named users from named groups (since unix allows a group to have the same name as a user), so it doesn't really make sense to use it on a special identifier such as this. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-27 17:35:41 -04:00
Benny Halevy	aaf84eb95a	nfsd41: renew_client must be called under the state lock Until we work out the state locking so we can use a spin lock to protect the cl_lru, we need to take the state_lock to renew the client. Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: Do not renew state on error] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: Simplify exit code] Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-27 17:17:40 -04:00
Ryusei Yamaguchi	ed2d8aed52	knfsd: Replace lock_kernel with a mutex in nfsd pool stats. lock_kernel() in knfsd was replaced with a mutex. The later commit `03cf6c9f49` ("knfsd: add file to export stats about nfsd pools") did not follow that change. This patch fixes the issue. Also move the get and put of nfsd_serv to the open and close methods (instead of start and stop methods) to allow atomic check and increment of reference count in the open method (where we can still return an error). Signed-off-by: Ryusei Yamaguchi <mandel59@gmail.com> Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Cc: Greg Banks <gnb@fmeh.org> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-25 12:39:37 -04:00
Frank Filz	55bb55dca0	nfsd: Fix unnecessary deny bits in NFSv4 ACL The group deny entries end up denying tcy even though tcy was just allowed by the allow entry. This appears to be due to: ace->access_mask = mask_from_posix(deny, flags); instead of: ace->access_mask = deny_mask_from_posix(deny, flags); Denying a previously allowed bit has no effect, so this shouldn't affect behavior, but it's ugly. Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-24 20:01:22 -04:00
Jeff Layton	fbf4665f41	nfsd: populate sin6_scope_id on callback address with scopeid from rq_addr on SETCLIENTID call When a SETCLIENTID call comes in, one of the args given is the svc_rqst. This struct contains an rq_addr field which holds the address that sent the call. If this is an IPv6 address, then we can use the sin6_scope_id field in this address to populate the sin6_scope_id field in the callback address. AFAICT, the rq_addr.sin6_scope_id is non-zero if and only if the client mounted the server's link-local address. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-21 11:27:44 -04:00
Jeff Layton	7077ecbabd	nfsd: add support for NFSv4 callbacks over IPv6 The framework to add this is all in place. Now, add the code to allow support for establishing a callback channel on an IPv6 socket. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-21 11:27:44 -04:00
Jeff Layton	aa9a4ec770	nfsd: convert nfs4_cb_conn struct to hold address in sockaddr_storage ...rather than as a separate address and port fields. This will be necessary for implementing callbacks over IPv6. Also, convert gen_callback to use the standard rpcuaddr2sockaddr routine rather than its own private one. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-21 11:27:43 -04:00
Jeff Layton	363168b4ea	nfsd: make nfs4_client->cl_addr a struct sockaddr_storage It's currently a __be32, which isn't big enough to hold an IPv6 address. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-21 11:27:43 -04:00
J. Bruce Fields	e9dc122166	Merge branch 'nfs-for-2.6.32' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 into for-2.6.32-incoming Conflicts: net/sunrpc/cache.c	2009-08-21 11:27:29 -04:00
Trond Myklebust	f884dcaead	Merge branch 'sunrpc_cache-for-2.6.32' into nfs-for-2.6.32	2009-08-10 17:45:58 -04:00
Trond Myklebust	bc74b4f5e6	SUNRPC: Allow the cache_detail to specify alternative upcall mechanisms For events that are rare, such as referral DNS lookups, it makes limited sense to have a daemon constantly listening for upcalls on a channel. An alternative in those cases might simply be to run the app that fills the cache using call_usermodehelper_exec() and friends. The following patch allows the cache_detail to specify alternative upcall mechanisms for these particular cases. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:29 -04:00
Trond Myklebust	2da8ca26c6	NFSD: Clean up the idmapper warning... What part of 'internal use' is so hard to understand? Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:26 -04:00
Chuck Lever	4116092b92	NFSD: Support IPv6 addresses in write_failover_ip() In write_failover_ip(), replace the sscanf() with a call to the common sunrpc.ko presentation address parser. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:40 -04:00
Andy Adamson	abfabf8caf	nfsd41: encode replay sequence from the slot values The sequence operation is not cached; always encode the sequence operation on a replay from the slot table and session values. This simplifies the sessions replay logic in nfsd4_proc_compound. If this is a replay of a compound that was specified not to be cached, return NFS4ERR_RETRY_UNCACHED_REP. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-28 16:12:34 -04:00
Andy Adamson	c8647947f8	nfsd41: rename nfsd4_enc_uncached_replay This function is only used for SEQUENCE replay. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-28 14:30:36 -04:00
Andy Adamson	49557cc74c	nfsd41: Use separate DRC for setclientid Instead of trying to share the generic 4.1 reply cache code for the CREATE_SESSION reply cache, it's simpler to handle CREATE_SESSION separately. The nfs41 single slot clientid DRC holds the results of create session processing. CREATE_SESSION can be preceded by a SEQUENCE operation (an embedded CREATE_SESSION) and the create session single slot cache must be maintained. nfsd4_replay_cache_entry() and nfsd4_store_cache_entry() do not implement the replay of an embedded CREATE_SESSION. The clientid DRC slot does not need the inuse, cachethis or other fields that the multiple slot session cache uses. Replace the clientid DRC cache struct nfs4_slot cache with a new nfsd4_clid_slot cache. Save the xdr struct nfsd4_create_session into the cache at the end of processing, and on a replay, replace the struct for the replay request with the cached version all while under the state lock. nfsd4_proc_compound will handle both the solo and embedded CREATE_SESSION case via the normal use of encode_operation. Errors that do not change the create session cache: A create session NFS4ERR_STALE_CLIENTID error means that a client record (and associated create session slot) could not be found and therefore can't be changed. NFSERR_SEQ_MISORDERED errors do not change the slot cache. All other errors get cached. Remove the clientid DRC specific check in nfs4svc_encode_compoundres to put the session only if cstate.session is set which will now always be true. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-28 14:30:29 -04:00
Andy Adamson	88e588d56a	nfsd41: change check_slot_seqid parameters For separation of session slot and clientid slot processing. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-28 14:30:23 -04:00
Andy Adamson	5261dcf8eb	nfsd41: remove redundant forechannel max requests check This check is done in set_forechannel_maxreqs. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-28 14:30:15 -04:00
Andy Adamson	0c193054a4	nfsd41: change from page to memory based drc limits NFSD_SLOT_CACHE_SIZE is the size of all encoded operation responses (excluding the sequence operation) that we want to cache. For now, keep NFSD_SLOT_CACHE_SIZE at PAGE_SIZE. It will be reduced when the DRC is changed from page based to memory based. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-28 14:30:05 -04:00
Andy Adamson	6a14dd1a4f	nfsd41: reserve less memory for DRC Also remove a slightly misleading comment. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-28 14:29:59 -04:00
Andy Adamson	b101ebbc39	nfsd41: minor set_forechannel_maxreqs cleanup Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-28 14:29:54 -04:00
Andy Adamson	be98d1bbd1	nfsd41: reclaim DRC memory on session free This fixes a leak which would eventually lock out new clients. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-28 14:29:48 -04:00
J. Bruce Fields	413d63d710	nfsd: minor write_pool_threads exit cleanup Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-28 14:29:41 -04:00
Eric Sesterhenn	2522a776c1	Fix memory leak in write_pool_threads kmemleak produces the following warning unreferenced object 0xc9ec02a0 (size 8): comm "cat", pid 19048, jiffies 730243 backtrace: [<c01bf970>] create_object+0x100/0x240 [<c01bfadb>] kmemleak_alloc+0x2b/0x60 [<c01bcd4b>] __kmalloc+0x14b/0x270 [<c02fd027>] write_pool_threads+0x87/0x1d0 [<c02fcc08>] nfsctl_transaction_write+0x58/0x70 [<c02fcc6f>] nfsctl_transaction_read+0x4f/0x60 [<c01c2574>] vfs_read+0x94/0x150 [<c01c297d>] sys_read+0x3d/0x70 [<c0102d6b>] sysenter_do_call+0x12/0x32 [<ffffffff>] 0xffffffff write_pool_threads() only frees nthreads on error paths, in the success case we leak it. Signed-off-by: Eric Sesterhenn <eric.sesterhenn@lsexperts.de> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-28 14:29:34 -04:00
Andy Adamson	4bd9b0f4af	nfsd41: use globals for DRC limits The version 4.1 DRC memory limit and tracking variables are server wide and session specific. Replace struct svc_serv fields with globals. Stop using the svc_serv sv_lock. Add a spinlock to serialize access to the DRC limit management variables which change on session creation and deletion (usage counter) or (future) administrative action to adjust the total DRC memory limit. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-07-14 17:52:40 -04:00
Yu Zhiguo	9208faf297	NFSv4: ACL in operations 'open' and 'create' should be used ACL in operations 'open' and 'create' is decoded but never used. It should be set as the initial ACL for the object according to RFC3530. If an error occurs when setting the ACL, just clear the ACL bit in the returned attr bitmap. Signed-off-by: Yu Zhiguo <yuzg@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-14 12:16:47 -04:00
Alexey Dobriyan	405f55712d	headers: smp_lock.h redux * Remove smp_lock.h from files which don't need it (including some headers!) * Add smp_lock.h to files which do need it * Make smp_lock.h include conditional in hardirq.h It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT This will make hardirq.h inclusion cheaper for every PREEMPT=n config (which includes allmodconfig/allyesconfig, BTW) Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-07-12 12:22:34 -07:00
David Howells	033a666ccb	NFSD: Don't hold unrefcounted creds over call to nfsd_setuser() nfsd_open() gets an unrefcounted pointer to the current process's effective credentials at the top of the function, then calls nfsd_setuser() via fh_verify() - which may replace and destroy the current process's effective credentials - and then passes the unrefcounted pointer to dentry_open() - but the credentials may have been destroyed by this point. Instead, the value from current_cred() should be passed directly to dentry_open() as one of its arguments, rather than being cached in a variable. Possibly fh_verify() should return the creds to use. This is a regression introduced by `745ca2475a` "CRED: Pass credentials through dentry_open()". Signed-off-by: David Howells <dhowells@redhat.com> Tested-and-Verified-By: Steve Dickson <steved@redhat.com> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-03 10:21:10 -04:00
Linus Torvalds	7e0338c0de	Merge branch 'for-2.6.31' of git://fieldses.org/git/linux-nfsd * 'for-2.6.31' of git://fieldses.org/git/linux-nfsd: (60 commits) SUNRPC: Fix the TCP server's send buffer accounting nfsd41: Backchannel: minorversion support for the back channel nfsd41: Backchannel: cleanup nfs4.0 callback encode routines nfsd41: Remove ip address collision detection case nfsd: optimise the starting of zero threads when none are running. nfsd: don't take nfsd_mutex twice when setting number of threads. nfsd41: sanity check client drc maxreqs nfsd41: move channel attributes from nfsd4_session to a nfsd4_channel_attr struct NFS: kill off complicated macro 'PROC' sunrpc: potential memory leak in function rdma_read_xdr nfsd: minor nfsd_vfs_write cleanup nfsd: Pull write-gathering code out of nfsd_vfs_write nfsd: track last inode only in use_wgather case sunrpc: align cache_clean work's timer nfsd: Use write gathering only with NFSv2 NFSv4: kill off complicated macro 'PROC' NFSv4: do exact check about attribute specified knfsd: remove unreported filehandle stats counters knfsd: fix reply cache memory corruption knfsd: reply cache cleanups ...	2009-06-22 12:55:50 -07:00
Andy Adamson	ab52ae6db0	nfsd41: Backchannel: minorversion support for the back channel Prepare to share backchannel code with NFSv4.1. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> [nfsd41: use nfsd4_cb_sequence for callback minorversion] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-18 18:33:57 -07:00
Andy Adamson	ef52bff840	nfsd41: Backchannel: cleanup nfs4.0 callback encode routines Mimic the client and prepare to share the back channel xdr with NFSv4.1. Bump the number of operations in each encode routine, then backfill the number of operations. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-18 18:33:57 -07:00
Mike Sager	6ddbbbfe52	nfsd41: Remove ip address collision detection case Verified that cthon and pynfs exchange id tests pass (except for the two expected fails: EID8 and EID50) Signed-off-by: Mike Sager <sager@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-18 17:43:53 -07:00
NeilBrown	671e1fcf63	nfsd: optimise the starting of zero threads when none are running. Currently, if we ask to set the number of nfsd threads to zero when there are none running, we set up all the sockets and register the service, and then tear it all down again. This is pointless. So detect that case and exit promptly. (Also remove an assignment to 'error' which was never used.) Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: Jeff Layton <jlayton@redhat.com>	2009-06-18 09:42:41 -07:00
NeilBrown	82e12fe924	nfsd: don't take nfsd_mutex twice when setting number of threads. Currently when we write a number to 'threads' in nfsdfs, we take the nfsd_mutex, update the number of threads, then take the mutex again to read the number of threads. Mostly this isn't a big deal. However if we write '0', and portmap happens to be dead, then we can get unpredictable behaviour. If the nfsd threads all got killed quickly and the last thread is waiting for portmap to respond, then the second time we take the mutex we will block waiting for the last thread. However if the nfsd threads didn't die quite that fast, then there will be no contention when we try to take the mutex again. Unpredictability isn't fun, and waiting for the last thread to exit is pointless, so avoid taking the lock twice. To achieve this, make nfsd_svc return a non-negative number of active threads when not returning a negative error. Signed-off-by: NeilBrown <neilb@suse.de>	2009-06-18 09:40:31 -07:00
Andy Adamson	5d77ddfbcb	nfsd41: sanity check client drc maxreqs Ensure the client requested maximum requests are between 1 and NFSD_MAX_SLOTS_PER_SESSION Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-16 17:13:16 -07:00
Alexandros Batsakis	6c18ba9f5e	nfsd41: move channel attributes from nfsd4_session to a nfsd4_channel_attr struct the change is valid for both the forechannel and the backchannel (currently dummy) Signed-off-by: Alexandros Batsakis <Alexandros.Batsakis@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-16 10:13:45 -07:00
Yu Zhiguo	b9081d90f5	NFS: kill off complicated macro 'PROC' kill off obscure macro 'PROC' of NFSv2&3 in order to make the code more clear. Among other things, this makes it simpler to grep for callers of these functions--something which has frequently caused confusion among nfs developers. Signed-off-by: Yu Zhiguo <yuzg@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-15 19:34:32 -07:00
J. Bruce Fields	e4636d535e	nfsd: minor nfsd_vfs_write cleanup There's no need to check host_err >= 0 every time here when we could check host_err < 0 once, following the usual kernel style. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-15 19:18:34 -07:00
J. Bruce Fields	d911df7b8d	nfsd: Pull write-gathering code out of nfsd_vfs_write This is a relatively self-contained piece of code that handles a special case--move it to its own function. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-15 18:54:05 -07:00
J. Bruce Fields	9d2a3f31d6	nfsd: track last inode only in use_wgather case Updating last_ino and last_dev probably isn't useful in the !use_wgather case. Also remove some pointless ifdef'd-out code. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-15 18:52:47 -07:00
Trond Myklebust	48e03bc515	nfsd: Use write gathering only with NFSv2 NFSv3 and above can use unstable writes whenever they are sending more than one write, rather than relying on the flaky write gathering heuristics. More often than not, write gathering is currently getting it wrong when the NFSv3 clients are sending a single write with FILE_SYNC for efficiency reasons. This patch turns off write gathering for NFSv3/v4, and ensures that it only applies to the one case that can actually benefit: namely NFSv2. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-15 18:14:57 -07:00
J. Bruce Fields	7eef4091a6	Merge commit 'v2.6.30' into for-2.6.31	2009-06-15 18:08:07 -07:00
Al Viro	9393bd07cf	switch follow_down() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:01 -04:00
Al Viro	bab77ebf51	switch follow_up() to struct path Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:00 -04:00
Al Viro	e64c390ca0	switch rqst_exp_parent() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:00 -04:00
Al Viro	91c9fa8f75	switch rqst_exp_get_by_name() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:00 -04:00
Al Viro	5bf3bd2b5c	switch exp_parent() to struct path ... and lose the always-NULL last argument (non-NULL case had been split off a while ago). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:00 -04:00
Al Viro	55430e2ece	nfsd struct path use: exp_get_by_name() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:35:59 -04:00
James Morris	0b4ec6e4e0	Merge branch 'master' into next	2009-06-09 09:27:53 +10:00
Yu Zhiguo	0a93a47f04	NFSv4: kill off complicated macro 'PROC' J. Bruce Fields wrote: ... > (This is extremely confusing code to track down: note that > proc->pc_decode is set to nfs4svc_decode_compoundargs() by the PROC() > macro at the end of fs/nfsd/nfs4proc.c. Which means, for example, that > grepping for nfs4svc_decode_compoundargs() gets you nowhere. Patches to > kill off that macro would be welcomed....) the macro 'PROC' is complicated and obscure, it had better be killed off in order to make the code more clear. Signed-off-by: Yu Zhiguo <yuzg@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-01 18:09:20 -04:00
Yu Zhiguo	3c8e03166a	NFSv4: do exact check about attribute specified Server should return NFS4ERR_ATTRNOTSUPP if an attribute specified is not supported in current environment. Operations CREATE, NVERIFY, OPEN, SETATTR and VERIFY should do this check. This bug was found when running the newpynfs tests. The following tests failed: CR12 NVF7a NVF7b NVF7c NVF7d NVF7f NVF7r NVF7s OPEN15 VF7a VF7b VF7c VF7d VF7f VF7r VF7s Add function do_check_fattr() to do exact check: 1, Check attribute specified is supported by the NFSv4 server or not. 2, Check FATTR4_WORD0_ACL & FATTR4_WORD0_FS_LOCATIONS are supported in current environment or not. 3, Check attribute specified is writable or not. Steps 1 and 3 were done in function nfsd4_decode_fattr() but are now moved to this function. Signed-off-by: Yu Zhiguo <yuzg@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-01 18:01:54 -04:00
Mimi Zohar	14dba5331b	integrity: nfsd imbalance bug fix An nfsd exported file is opened/closed by the kernel causing the integrity imbalance message. Before a file is opened, there normally is permission checking, which is done in inode_permission(). However, as integrity checking requires a dentry and mount point, which is not available in inode_permission(), the integrity (permission) checking must be called separately. In order to detect any missing integrity checking calls, we keep track of file open/closes. ima_path_check() increments these counts and does the integrity (permission) checking. As a result, the number of calls to ima_path_check()/ima_file_free() should be balanced. An extra call to fput(), indicates the file could have been accessed without first calling ima_path_check(). In nfsv3 permission checking is done once, followed by multiple reads, which do an open/close for each read. The integrity (permission) checking call should be in nfsd_permission() after the inode_permission() call, but as there is no correlation between the number of permission checking and open calls, the integrity checking call should not increment the counters, but defer it to when the file is actually opened. This patch adds: - integrity (permission) checking for nfsd exported files in nfsd_permission(). - a call to increment counts for files opened by nfsd. This patch has been updated to return the nfs error types. Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2009-05-28 09:32:43 +10:00
Wei Yongjun	a0d24b295a	nfsd: fix hung up of nfs client while sync write data to nfs server Commit 'Short write in nfsd becomes a full write to the client' (`31dec2538e`) broke sync writes. Use the following commands to reproduce: $ mount -t nfs -o sync 192.168.0.21:/nfsroot /mnt $ cd /mnt $ echo aaaa > temp.txt Then the nfs client hangs. In SYNC mode the server always returns the write count 0 to the client. This is because the value of host_err in nfsd_vfs_write() is overwritten in SYNC mode by 'host_err=nfsd_sync(file);', and then we return host_err (which is now 0) as the write count. This patch fixes the problem. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-27 17:40:06 -04:00
Greg Banks	1dbd0d53f3	knfsd: remove unreported filehandle stats counters The file nfsfh.c contains two static variables nfsd_nr_verified and nfsd_nr_put. These are counters which are incremented as a side effect of the fh_verify() fh_compose() and fh_put() operations, i.e. at least twice per NFS call for any non-trivial workload. Needless to say this makes the cacheline that contains them (and any other innocent victims) a very hot contention point indeed under high call-rate workloads on multiprocessor NFS server. It also turns out that these counters are not used anywhere. They're not reported to userspace, they're not used in logic, they're not even exported from the object file (let alone the module). All they do is waste CPU time. So this patch removes them. Tests on a 16 CPU Altix A4700 with 2 10gige Myricom cards, configured separately (no bonding). Workload is 640 client threads doing directory traversals with random small reads, from server RAM. Before ====== Kernel profile: % cumulative self self total time samples samples calls 1/call 1/call name 6.05 2716.00 2716.00 30406 0.09 1.02 svc_process 4.44 4706.00 1990.00 1975 1.01 1.01 spin_unlock_irqrestore 3.72 6376.00 1670.00 1666 1.00 1.00 svc_export_put 3.41 7907.00 1531.00 1786 0.86 1.02 nfsd_ofcache_lookup 3.25 9363.00 1456.00 10965 0.13 1.01 nfsd_dispatch 3.10 10752.00 1389.00 1376 1.01 1.01 nfsd_cache_lookup 2.57 11907.00 1155.00 4517 0.26 1.03 svc_tcp_recvfrom ... 2.21 15352.00 1003.00 1081 0.93 1.00 nfsd_choose_ofc <---- ^^^^ Here the function nfsd_choose_ofc() reads a global variable which by accident happened to be located in the same cacheline as nfsd_nr_verified. Call rate: nullarbor:~ # pmdumptext nfs3.server.calls ... 
Thu Dec 13 00:15:27 184780.663 Thu Dec 13 00:15:28 184885.881 Thu Dec 13 00:15:29 184449.215 Thu Dec 13 00:15:30 184971.058 Thu Dec 13 00:15:31 185036.052 Thu Dec 13 00:15:32 185250.475 Thu Dec 13 00:15:33 184481.319 Thu Dec 13 00:15:34 185225.737 Thu Dec 13 00:15:35 185408.018 Thu Dec 13 00:15:36 185335.764 After ===== kernel profile: % cumulative self self total time samples samples calls 1/call 1/call name 6.33 2813.00 2813.00 29979 0.09 1.01 svc_process 4.66 4883.00 2070.00 2065 1.00 1.00 spin_unlock_irqrestore 4.06 6687.00 1804.00 2182 0.83 1.00 nfsd_ofcache_lookup 3.20 8110.00 1423.00 10932 0.13 1.00 nfsd_dispatch 3.03 9456.00 1346.00 1343 1.00 1.00 nfsd_cache_lookup 2.62 10622.00 1166.00 4645 0.25 1.01 svc_tcp_recvfrom [...] 0.10 42586.00 44.00 74 0.59 1.00 nfsd_choose_ofc <--- HA!! ^^^^ Call rate: nullarbor:~ # pmdumptext nfs3.server.calls ... Thu Dec 13 01:45:28 194677.118 Thu Dec 13 01:45:29 193932.692 Thu Dec 13 01:45:30 194294.364 Thu Dec 13 01:45:31 194971.276 Thu Dec 13 01:45:32 194111.207 Thu Dec 13 01:45:33 194999.635 Thu Dec 13 01:45:34 195312.594 Thu Dec 13 01:45:35 195707.293 Thu Dec 13 01:45:36 194610.353 Thu Dec 13 01:45:37 195913.662 Thu Dec 13 01:45:38 194808.675 i.e. about a 5.3% improvement in call rate. Signed-off-by: Greg Banks <gnb@melbourne.sgi.com> Reviewed-by: David Chinner <dgc@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-27 14:14:03 -04:00
Greg Banks	cf0a586cf4	knfsd: fix reply cache memory corruption Fix a regression in the reply cache introduced when the code was converted to use proper Linux lists. When a new entry needs to be inserted, the case where all the entries are currently being used by threads is not correctly detected. This can result in memory corruption and a crash. In the current code this is an extremely unlikely corner case; it would require the machine to have 1024 nfsd threads and all of them to be busy at the same time. However, upcoming reply cache changes make this more likely; a crash due to this problem was actually observed in the field. Signed-off-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-27 14:14:02 -04:00
Greg Banks	fca4217c5b	knfsd: reply cache cleanups Make REQHASH() an inline function. Rename hash_list to cache_hash. Fix an obsolete comment. Signed-off-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-27 14:14:02 -04:00
J. Bruce Fields	8daed1e549	nfsd: silence lockdep warning Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-11 17:23:14 -04:00
Wang Chen	02cb2858db	nfsd: nfs4_stat_init cleanup Save some loop time. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-06 16:22:41 -04:00
J. Bruce Fields	b2c0cea6b1	nfsd4: check for negative dentry before use in nfsv4 readdir After `2f9092e102` "Fix i_mutex vs. readdir handling in nfsd" (and `14f7dd63` "Copy XFS readdir hack into nfsd code"), an entry may be removed between the first mutex_unlock and the second mutex_lock. In this case, lookup_one_len() will return a negative dentry. Check for this case to avoid a NULL dereference. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Reviewed-by: J. R. Okajima <hooanon05@yahoo.co.jp> Cc: stable@kernel.org	2009-05-06 16:16:36 -04:00
Randy Dunlap	9064caae8f	nfsd: use C99 struct initializers Eliminate 56 sparse warnings like this one: fs/nfsd/nfs4xdr.c:1331:15: warning: obsolete array initializer, use C99 syntax Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-03 15:09:12 -04:00
J. Bruce Fields	63e4863fab	nfsd4: make recall callback an asynchronous rpc As with the probe, this removes the need for another kthread. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-03 15:08:56 -04:00
Andy Adamson	ccecee1e5e	nfsd41: slots are freed with session The session and slots are allocated all in one piece. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-03 14:45:02 -04:00
J. Bruce Fields	3aea09dc91	nfsd4: track recall retries in nfs4_delegation Move this out of a local variable into the nfs4_delegation object in preparation for making this an async rpc call (at which point we'll need any state like this in a common object that's preserved across function calls). Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-01 20:11:12 -04:00
J. Bruce Fields	6707bd3d42	nfsd4: remove unused dl_trunc There's no point in keeping this field around--it's always zero. (Background: the protocol allows you to tell the client that the file is about to be truncated, as an optimization to save the client from writing back dirty pages that will just be discarded. We don't implement this hint. If we do some day, adding this field back in will be the least of the work involved.) Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-01 19:57:46 -04:00
J. Bruce Fields	b53d40c507	nfsd4: eliminate struct nfs4_cb_recall The nfs4_cb_recall struct is used only in nfs4_delegation, so its pointer to the containing delegation is unnecessary--we could just use container_of(). But there's no real reason to have this a separate struct at all--just move these fields to nfs4_delegation. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-01 19:50:00 -04:00
J. Bruce Fields	c237dc0303	nfsd4: rename callback struct to cb_conn I want to use the name for a struct that actually does represent a single callback. (Actually, I've never been sure it helps to have a separate struct for the callback information. Some day maybe those fields could just be dumped into struct nfs4_client. I don't know.) Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-01 17:31:44 -04:00
J. Bruce Fields	e300a63ce4	nfsd4: replace callback thread by asynchronous rpc We don't really need a synchronous rpc, and moving to an asynchronous rpc allows us to do without this extra kthread. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-29 17:10:53 -04:00
J. Bruce Fields	3cef9ab266	nfsd4: look up callback cred only once Look up the callback cred once and then use it for all subsequent callbacks. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-29 16:45:03 -04:00
J. Bruce Fields	ecdd03b791	nfsd4: create rpc callback client from server thread The code is a little simpler, and it should be easier to avoid races, if we just do all rpc client creation/destruction from nfsd or laundromat threads and do only the rpc calls themselves asynchronously. The rpc creation doesn't involve any significant waiting (it doesn't call the client, for example), so there's no reason not to do this. Also don't bother destroying the client on failure of the rpc null probe. We may want to retry the probe later anyway. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-29 16:44:53 -04:00
J. Bruce Fields	e1cab5a589	nfsd4: set cb_client inside setup_callback_client This is just a minor code simplification. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-29 16:44:47 -04:00

1122 Commits