OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Benjamin Coddington	173b3afcee	lockd: Try to reconnect if statd has moved If rpc.statd is restarted, upcalls to monitor hosts can fail with ECONNREFUSED. In that case force a lookup of statd's new port and retry the upcall. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-09-24 23:08:43 -04:00
Jeff Layton	d68e3c4aa4	lockd: add a /proc/fs/lockd/nlm_end_grace file Add a new procfile that will allow a (privileged) userland process to end the NLM grace period early. The basic idea here will be to have sm-notify write to this file, if it sent out no NOTIFY requests when it runs. In that situation, we can generally expect that there will be no reclaim requests so the grace period can be lifted early. Signed-off-by: Jeff Layton <jlayton@primarydata.com>	2014-09-17 16:33:13 -04:00
Jeff Layton	f779002965	lockd: move lockd's grace period handling into its own module Currently, all of the grace period handling is part of lockd. Eventually though we'd like to be able to build v4-only servers, at which point we'll need to put all of this elsewhere. Move the code itself into fs/nfs_common and have it build a grace.ko module. Then, rejigger the Kconfig options so that both nfsd and lockd enable it automatically. Signed-off-by: Jeff Layton <jlayton@primarydata.com>	2014-09-17 16:33:11 -04:00
Jeff Layton	09802fd2a8	lockd: rip out deferred lock handling from testlock codepath As Kinglong points out, the nlm_block->b_fl field is no longer used at all. Also, vfs_test_lock in the generic locking code will only return FILE_LOCK_DEFERRED if FL_SLEEP is set, and it isn't here. The only other place that returns that value is the DLM lock code, but it only does that in dlm_posix_lock, never in dlm_posix_get. Remove all of the deferred locking code from the testlock codepath since it doesn't appear to ever be used anyway. I do have a small concern that this might cause a behavior change in the case where you have a block already sitting on the list when the testlock request comes in, but that looks like it doesn't really work properly anyway. I think it's best to just pass that down to vfs_test_lock and let the filesystem report that instead of trying to infer what's going on with the lock by looking at an existing block. Cc: cluster-devel@redhat.com Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Kinglong Mee <kinglongmee@gmail.com>	2014-09-09 16:01:09 -04:00
Kinglong Mee	f328296e27	locks: Copy fl_lmops information for conflock in locks_copy_conflock() Commit `d5b9026a67` ([PATCH] knfsd: locks: flag NFSv4-owned locks) using fl_lmops field in file_lock for checking nfsd4 lockowner. But, commit `1a747ee0cc` (locks: don't call ->copy_lock methods on return of conflicting locks) causes the fl_lmops of conflock always be NULL. Also, commit `0996905f93` (lockd: posix_test_lock() should not call locks_copy_lock()) caused the fl_lmops of conflock always be NULL too. Make sure copy the private information by fl_copy_lock() in struct file_lock_operations, merge __locks_copy_lock() to fl_copy_lock(). Jeff advice, "Set fl_lmops on conflocks, but don't set fl_ops. fl_ops are superfluous, since they are callbacks into the filesystem. There should be no need to bother the filesystem at all with info in a conflock. But, lock _ownership_ matters for conflocks and that's indicated by the fl_lmops. So you really do want to copy the fl_lmops for conflocks I think." v5: add missing calling of locks_release_private() in nlmsvc_testlock() v4: only copy fl_lmops for conflock, don't copy fl_ops Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com>	2014-09-09 16:01:09 -04:00
Joe Perches	d0449b90f8	locks: Remove unused conf argument from lm_grant This argument is always NULL so don't pass it around. [jlayton: remove dependencies on previous patches in series] Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com>	2014-09-09 16:01:06 -04:00
J. Bruce Fields	7c17705e77	lockd: fix rpcbind crash on lockd startup failure Nikita Yuschenko reported that booting a kernel with init=/bin/sh and then nfs mounting without portmap or rpcbind running using a busybox mount resulted in: # mount -t nfs 10.30.130.21:/opt /mnt svc: failed to register lockdv1 RPC service (errno 111). lockd_up: makesock failed, error=-111 Unable to handle kernel paging request for data at address 0x00000030 Faulting instruction address: 0xc055e65c Oops: Kernel access of bad area, sig: 11 [#1] MPC85xx CDS Modules linked in: CPU: 0 PID: 1338 Comm: mount Not tainted 3.10.44.cge #117 task: cf29cea0 ti: cf35c000 task.ti: cf35c000 NIP: c055e65c LR: c0566490 CTR: c055e648 REGS: cf35dad0 TRAP: 0300 Not tainted (3.10.44.cge) MSR: 00029000 <CE,EE,ME> CR: 22442488 XER: 20000000 DEAR: 00000030, ESR: 00000000 GPR00: c05606f4 cf35db80 cf29cea0 cf0ded80 cf0dedb8 00000001 1dec3086 00000000 GPR08: 00000000 c07b1640 00000007 1dec3086 22442482 100b9758 00000000 10090ae8 GPR16: 00000000 000186a5 00000000 00000000 100c3018 bfa46edc 100b0000 bfa46ef0 GPR24: cf386ae0 c07834f0 00000000 c0565f88 00000001 cf0dedb8 00000000 cf0ded80 NIP [c055e65c] call_start+0x14/0x34 LR [c0566490] __rpc_execute+0x70/0x250 Call Trace: [cf35db80] [00000080] 0x80 (unreliable) [cf35dbb0] [c05606f4] rpc_run_task+0x9c/0xc4 [cf35dbc0] [c0560840] rpc_call_sync+0x50/0xb8 [cf35dbf0] [c056ee90] rpcb_register_call+0x54/0x84 [cf35dc10] [c056f24c] rpcb_register+0xf8/0x10c [cf35dc70] [c0569e18] svc_unregister.isra.23+0x100/0x108 [cf35dc90] [c0569e38] svc_rpcb_cleanup+0x18/0x30 [cf35dca0] [c0198c5c] lockd_up+0x1dc/0x2e0 [cf35dcd0] [c0195348] nlmclnt_init+0x2c/0xc8 [cf35dcf0] [c015bb5c] nfs_start_lockd+0x98/0xec [cf35dd20] [c015ce6c] nfs_create_server+0x1e8/0x3f4 [cf35dd90] [c0171590] nfs3_create_server+0x10/0x44 [cf35dda0] [c016528c] nfs_try_mount+0x158/0x1e4 [cf35de20] [c01670d0] nfs_fs_mount+0x434/0x8c8 [cf35de70] [c00cd3bc] mount_fs+0x20/0xbc [cf35de90] [c00e4f88] vfs_kern_mount+0x50/0x104 [cf35dec0] [c00e6e0c] do_mount+0x1d0/0x8e0 [cf35df10] [c00e75ac] SyS_mount+0x90/0xd0 [cf35df40] [c000ccf4] ret_from_syscall+0x0/0x3c The addition of svc_shutdown_net() resulted in two calls to svc_rpcb_cleanup(); the second is no longer necessary and crashes when it calls rpcb_register_call with clnt=NULL. Reported-by: Nikita Yushchenko <nyushchenko@dev.rtsoft.ru> Fixes: `679b033df4` "lockd: ensure we tear down any live sockets when socket creation fails during lockd_up" Cc: stable@vger.kernel.org Acked-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-08 12:03:32 -04:00
Trond Myklebust	d4e8990299	lockd: Do not start the lockd thread before we've set nlmsvc_rqst->rq_task This fixes an Oopsable race when starting lockd. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-09-02 17:49:17 -04:00
Trond Myklebust	d6a7ce424f	lockd: Ensure that lockd_start_svc sets the server rq_task... Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-17 12:00:10 -04:00
Thomas Gleixner	5eaaed4fe2	fs: lockd: Use ktime_get_ns() Replace the ever recurring: ts = ktime_get_ts(); ns = timespec_to_ns(&ts); with ns = ktime_get_ns(); Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: John Stultz <john.stultz@linaro.org>	2014-07-23 15:01:44 -07:00
Linus Torvalds	5b174fd647	Merge branch 'for-3.16' of git://linux-nfs.org/~bfields/linux Pull nfsd updates from Bruce Fields: "The largest piece is a long-overdue rewrite of the xdr code to remove some annoying limitations: for example, there was no way to return ACLs larger than 4K, and readdir results were returned only in 4k chunks, limiting performance on large directories. Also: - part of Neil Brown's work to make NFS work reliably over the loopback interface (so client and server can run on the same machine without deadlocks). The rest of it is coming through other trees. - cleanup and bugfixes for some of the server RDMA code, from Steve Wise. - Various cleanup of NFSv4 state code in preparation for an overhaul of the locking, from Jeff, Trond, and Benny. - smaller bugfixes and cleanup from Christoph Hellwig and Kinglong Mee. Thanks to everyone! This summer looks likely to be busier than usual for knfsd. Hopefully we won't break it too badly; testing definitely welcomed" * 'for-3.16' of git://linux-nfs.org/~bfields/linux: (100 commits) nfsd4: fix FREE_STATEID lockowner leak svcrdma: Fence LOCAL_INV work requests svcrdma: refactor marshalling logic nfsd: don't halt scanning the DRC LRU list when there's an RC_INPROG entry nfs4: remove unused CHANGE_SECURITY_LABEL nfsd4: kill READ64 nfsd4: kill READ32 nfsd4: simplify server xdr->next_page use nfsd4: hash deleg stateid only on successful nfs4_set_delegation nfsd4: rename recall_lock to state_lock nfsd: remove unneeded zeroing of fields in nfsd4_proc_compound nfsd: fix setting of NFS4_OO_CONFIRMED in nfsd4_open nfsd4: use recall_lock for delegation hashing nfsd: fix laundromat next-run-time calculation nfsd: make nfsd4_encode_fattr static SUNRPC/NFSD: Remove using of dprintk with KERN_WARNING nfsd: remove unused function nfsd_read_file nfsd: getattr for FATTR4_WORD0_FILES_AVAIL needs the statfs buffer NFSD: Error out when getting more than one fsloc/secinfo/uuid NFSD: Using type of uint32_t for ex_nflavors instead of int ...	2014-06-10 11:50:57 -07:00
Joe Perches	7ac9fe571d	lockd: convert use of typedef ctl_table to struct ctl_table This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-06-06 16:08:16 -07:00
Christoph Hellwig	d430e8d530	nfsd: move <linux/nfsd/export.h> to fs/nfsd There are no legitimate users outside of fs/nfsd, so move it there. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-06 17:54:54 -04:00
Christoph Hellwig	9c69de4c94	nfsd: remove <linux/nfsd/nfsfh.h> The only real user of this header is fs/nfsd/nfsfh.h, so merge the two. Various lockѕ source files used it to indirectly get other sunrpc or nfs headers, so fix those up. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-06 17:54:53 -04:00
Kees Cook	12dd7ecf23	lockd: avoid warning when CONFIG_SYSCTL undefined When building without CONFIG_SYSCTL, the compiler saw an unused label. This moves the label into the #ifdef it is used under. fs/lockd/svc.c: In function ‘init_nlm’: fs/lockd/svc.c:626:1: warning: label ‘err_sysctl’ defined but not used [-Wunused-label] Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-06 17:54:52 -04:00
Jeff Layton	679b033df4	lockd: ensure we tear down any live sockets when socket creation fails during lockd_up We had a Fedora ABRT report with a stack trace like this: kernel BUG at net/sunrpc/svc.c:550! invalid opcode: 0000 [#1] SMP [...] CPU: 2 PID: 913 Comm: rpc.nfsd Not tainted 3.13.6-200.fc20.x86_64 #1 Hardware name: Hewlett-Packard HP ProBook 4740s/1846, BIOS 68IRR Ver. F.40 01/29/2013 task: ffff880146b00000 ti: ffff88003f9b8000 task.ti: ffff88003f9b8000 RIP: 0010:[<ffffffffa0305fa8>] [<ffffffffa0305fa8>] svc_destroy+0x128/0x130 [sunrpc] RSP: 0018:ffff88003f9b9de0 EFLAGS: 00010206 RAX: ffff88003f829628 RBX: ffff88003f829600 RCX: 00000000000041ee RDX: 0000000000000000 RSI: 0000000000000286 RDI: 0000000000000286 RBP: ffff88003f9b9de8 R08: 0000000000017360 R09: ffff88014fa97360 R10: ffffffff8114ce57 R11: ffffea00051c9c00 R12: ffff88003f829600 R13: 00000000ffffff9e R14: ffffffff81cc7cc0 R15: 0000000000000000 FS: 00007f4fde284840(0000) GS:ffff88014fa80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f4fdf5192f8 CR3: 00000000a569a000 CR4: 00000000001407e0 Stack: ffff88003f792300 ffff88003f9b9e18 ffffffffa02de02a 0000000000000000 ffffffff81cc7cc0 ffff88003f9cb000 0000000000000008 ffff88003f9b9e60 ffffffffa033bb35 ffffffff8131c86c ffff88003f9cb000 ffff8800a5715008 Call Trace: [<ffffffffa02de02a>] lockd_up+0xaa/0x330 [lockd] [<ffffffffa033bb35>] nfsd_svc+0x1b5/0x2f0 [nfsd] [<ffffffff8131c86c>] ? simple_strtoull+0x2c/0x50 [<ffffffffa033c630>] ? write_pool_threads+0x280/0x280 [nfsd] [<ffffffffa033c6bb>] write_threads+0x8b/0xf0 [nfsd] [<ffffffff8114efa4>] ? __get_free_pages+0x14/0x50 [<ffffffff8114eff6>] ? get_zeroed_page+0x16/0x20 [<ffffffff811dec51>] ? simple_transaction_get+0xb1/0xd0 [<ffffffffa033c098>] nfsctl_transaction_write+0x48/0x80 [nfsd] [<ffffffff811b8b34>] vfs_write+0xb4/0x1f0 [<ffffffff811c3f99>] ? putname+0x29/0x40 [<ffffffff811b9569>] SyS_write+0x49/0xa0 [<ffffffff810fc2a6>] ? __audit_syscall_exit+0x1f6/0x2a0 [<ffffffff816962e9>] system_call_fastpath+0x16/0x1b Code: 31 c0 e8 82 db 37 e1 e9 2a ff ff ff 48 8b 07 8b 57 14 48 c7 c7 d5 c6 31 a0 48 8b 70 20 31 c0 e8 65 db 37 e1 e9 f4 fe ff ff 0f 0b <0f> 0b 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 RIP [<ffffffffa0305fa8>] svc_destroy+0x128/0x130 [sunrpc] RSP <ffff88003f9b9de0> Evidently, we created some lockd sockets and then failed to create others. make_socks then returned an error and we tried to tear down the svc, but svc->sv_permsocks was not empty so we ended up tripping over the BUG() in svc_destroy(). Fix this by ensuring that we tear down any live sockets we created when socket creation is going to return an error. Fixes: `786185b5f8` (SUNRPC: move per-net operations from...) Reported-by: Raphos <raphoszap@laposte.net> Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-28 10:43:08 -04:00
NeilBrown	2ec197db1a	lockd: send correct lock when granting a delayed lock. If an NFS client attempts to get a lock (using NLM) and the lock is not available, the server will remember the request and when the lock becomes available it will send a GRANT request to the client to provide the lock. If the client already held an adjacent lock, the GRANT callback will report the union of the existing and new locks, which can confuse the client. This happens because __posix_lock_file (called by vfs_lock_file) updates the passed-in file_lock structure when adjacent or over-lapping locks are found. To avoid this problem we take a copy of the two fields that can be changed (fl_start and fl_end) before the call and restore them afterwards. An alternate would be to allocate a 'struct file_lock', initialise it, use locks_copy_lock() to take a copy, then locks_release_private() after the vfs_lock_file() call. But that is a lot more work. Reported-by: Olaf Kirch <okir@suse.com> Signed-off-by: NeilBrown <neilb@suse.de> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com> -- v1 had a couple of issues (large on-stack struct and didn't really work properly). This version is much better tested. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-02-13 14:55:02 -05:00
Trond Myklebust	9a1b6bf818	LOCKD: Don't call utsname()->nodename from nlmclnt_setlockargs Firstly, nlmclnt_setlockargs can be called from a reclaimer thread, in which case we're in entirely the wrong namespace. Secondly, commit `8aac62706a` (move exit_task_namespaces() outside of exit_notify()) now means that exit_task_work() is called after exit_task_namespaces(), which triggers an Oops when we're freeing up the locks. Fix this by ensuring that we initialise the nlm_host's rpc_client at mount time, so that the cl_nodename field is initialised to the value of utsname()->nodename that the net namespace uses. Then replace the lockd callers of utsname()->nodename. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Toralf Förster <toralf.foerster@gmx.de> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Nix <nix@esperi.org.uk> Cc: Jeff Layton <jlayton@redhat.com> Cc: stable@vger.kernel.org # 3.10.x	2013-08-05 15:03:46 -04:00
Linus Torvalds	61f98b0fca	Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linux Pull nfsd bugfixes from Bruce Fields: "Just three minor bugfixes" * 'for-3.11' of git://linux-nfs.org/~bfields/linux: svcrdma: underflow issue in decode_write_list() nfsd4: fix minorversion support interface lockd: protect nlm_blocked access in nlmsvc_retry_blocked	2013-07-17 13:43:55 -07:00
David Jeffery	1c327d962f	lockd: protect nlm_blocked access in nlmsvc_retry_blocked In nlmsvc_retry_blocked, the check that the list is non-empty and acquiring the pointer of the first entry is unprotected by any lock. This allows a rare race condition when there is only one entry on the list. A function such as nlmsvc_grant_callback() can be called, which will temporarily remove the entry from the list. Between the list_empty() and list_entry(),the list may become empty, causing an invalid pointer to be used as an nlm_block, leading to a possible crash. This patch adds the nlm_block_lock around these calls to prevent concurrent use of the nlm_blocked list. This was a regression introduced by `f904be9cc7` "lockd: Mostly remove BKL from the server". Cc: Bryan Schumaker <bjschuma@netapp.com> Cc: stable@vger.kernel.org Signed-off-by: David Jeffery <djeffery@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-11 17:24:07 -04:00
Kees Cook	f170168b9a	drivers: avoid parsing names as kthread_run() format strings Calling kthread_run with a single name parameter causes it to be handled as a format string. Many callers are passing potentially dynamic string content, so use "%s" in those cases to avoid any potential accidents. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-07-03 16:07:41 -07:00
Jeff Layton	3999e49364	locks: add a new "lm_owner_key" lock operation Currently, the hashing that the locking code uses to add these values to the blocked_hash is simply calculated using fl_owner field. That's valid in most cases except for server-side lockd, which validates the owner of a lock based on fl_owner and fl_pid. In the case where you have a small number of NFS clients doing a lot of locking between different processes, you could end up with all the blocked requests sitting in a very small number of hash buckets. Add a new lm_owner_key operation to the lock_manager_operations that will generate an unsigned long to use as the key in the hashtable. That function is only implemented for server-side lockd, and simply XORs the fl_owner and fl_pid. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-06-29 12:57:45 +04:00
Jeff Layton	1c8c601a8c	locks: protect most of the file_lock handling with i_lock Having a global lock that protects all of this code is a clear scalability problem. Instead of doing that, move most of the code to be protected by the i_lock instead. The exceptions are the global lists that the ->fl_link sits on, and the ->fl_block list. ->fl_link is what connects these structures to the global lists, so we must ensure that we hold those locks when iterating over or updating these lists. Furthermore, sound deadlock detection requires that we hold the blocked_list state steady while checking for loops. We also must ensure that the search and update to the list are atomic. For the checking and insertion side of the blocked_list, push the acquisition of the global lock into __posix_lock_file and ensure that checking and update of the blocked_list is done without dropping the lock in between. On the removal side, when waking up blocked lock waiters, take the global lock before walking the blocked list and dequeue the waiters from the global list prior to removal from the fl_block list. With this, deadlock detection should be race free while we minimize excessive file_lock_lock thrashing. Finally, in order to avoid a lock inversion problem when handling /proc/locks output we must ensure that manipulations of the fl_block list are also protected by the file_lock_lock. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-06-29 12:57:42 +04:00
Jeff Layton	f891a29f46	locks: drop the unused filp argument to posix_unblock_lock Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-06-29 12:57:37 +04:00
Trond Myklebust	1dfd89af86	LOCKD: Ensure that nlmclnt_block resets block->b_status after a server reboot After a server reboot, the reclaimer thread will recover all the existing locks. For locks that are blocked, however, it will change the value of block->b_status to nlm_lck_denied_grace_period in order to signal that they need to wake up and resend the original blocking lock request. Due to a bug, however, the block->b_status never gets reset after the blocked locks have been woken up, and so the process goes into an infinite loop of resends until the blocked lock is satisfied. Reported-by: Marc Eshel <eshel@us.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org	2013-04-21 18:08:42 -04:00
Linus Torvalds	b6669737d3	Merge branch 'for-3.9' of git://linux-nfs.org/~bfields/linux Pull nfsd changes from J Bruce Fields: "Miscellaneous bugfixes, plus: - An overhaul of the DRC cache by Jeff Layton. The main effect is just to make it larger. This decreases the chances of intermittent errors especially in the UDP case. But we'll need to watch for any reports of performance regressions. - Containerized nfsd: with some limitations, we now support per-container nfs-service, thanks to extensive work from Stanislav Kinsbursky over the last year." Some notes about conflicts, since there were two non-data semantic conflicts here: - idr_remove_all() had been added by a memory leak fix, but has since become deprecated since idr_destroy() does it for us now. - xs_local_connect() had been added by this branch to make AF_LOCAL connections be synchronous, but in the meantime Trond had changed the calling convention in order to avoid a RCU dereference. There were a couple of more obvious actual source-level conflicts due to the hlist traversal changes and one just due to code changes next to each other, but those were trivial. * 'for-3.9' of git://linux-nfs.org/~bfields/linux: (49 commits) SUNRPC: make AF_LOCAL connect synchronous nfsd: fix compiler warning about ambiguous types in nfsd_cache_csum svcrpc: fix rpc server shutdown races svcrpc: make svc_age_temp_xprts enqueue under sv_lock lockd: nlmclnt_reclaim(): avoid stack overflow nfsd: enable NFSv4 state in containers nfsd: disable usermode helper client tracker in container nfsd: use proper net while reading "exports" file nfsd: containerize NFSd filesystem nfsd: fix comments on nfsd_cache_lookup SUNRPC: move cache_detail->cache_request callback call to cache_read() SUNRPC: remove "cache_request" argument in sunrpc_cache_pipe_upcall() function SUNRPC: rework cache upcall logic SUNRPC: introduce cache_detail->cache_request callback NFS: simplify and clean cache library NFS: use SUNRPC cache creation and destruction helper for DNS cache nfsd4: free_stid can be static nfsd: keep a checksum of the first 256 bytes of request sunrpc: trim off trailing checksum before returning decrypted or integrity authenticated buffer sunrpc: fix comment in struct xdr_buf definition ...	2013-02-28 18:02:55 -08:00
Sasha Levin	b67bfe0d42	hlist: drop the node parameter from iterators I'm not sure why, but the hlist for each entry iterators were conceived list_for_each_entry(pos, head, member) The hlist ones were greedy and wanted an extra parameter: hlist_for_each_entry(tpos, pos, head, member) Why did they need an extra pos parameter? I'm not quite sure. Not only they don't really need it, it also prevents the iterator from looking exactly like the list iterator, which is unfortunate. Besides the semantic patch, there was some manual work required: - Fix up the actual hlist iterators in linux/list.h - Fix up the declaration of other iterators based on the hlist ones. - A very small amount of places were using the 'node' parameter, this was modified to use 'obj->member' instead. - Coccinelle didn't handle the hlist_for_each_entry_safe iterator properly, so those had to be fixed up manually. The semantic patch which is mostly the work of Peter Senna Tschudin is here: @@ iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host; type T; expression a,c,d,e; identifier b; statement S; @@ -T b; <+... when != b ( hlist_for_each_entry(a, - b, c, d) S \| hlist_for_each_entry_continue(a, - b, c) S \| hlist_for_each_entry_from(a, - b, c) S \| hlist_for_each_entry_rcu(a, - b, c, d) S \| hlist_for_each_entry_rcu_bh(a, - b, c, d) S \| hlist_for_each_entry_continue_rcu_bh(a, - b, c) S \| for_each_busy_worker(a, c, - b, d) S \| ax25_uid_for_each(a, - b, c) S \| ax25_for_each(a, - b, c) S \| inet_bind_bucket_for_each(a, - b, c) S \| sctp_for_each_hentry(a, - b, c) S \| sk_for_each(a, - b, c) S \| sk_for_each_rcu(a, - b, c) S \| sk_for_each_from -(a, b) +(a) S + sk_for_each_from(a) S \| sk_for_each_safe(a, - b, c, d) S \| sk_for_each_bound(a, - b, c) S \| hlist_for_each_entry_safe(a, - b, c, d, e) S \| hlist_for_each_entry_continue_rcu(a, - b, c) S \| nr_neigh_for_each(a, - b, c) S \| nr_neigh_for_each_safe(a, - b, c, d) S \| nr_node_for_each(a, - b, c) S \| nr_node_for_each_safe(a, - b, c, d) S \| - for_each_gfn_sp(a, c, d, b) S + for_each_gfn_sp(a, c, d) S \| - for_each_gfn_indirect_valid_sp(a, c, d, b) S + for_each_gfn_indirect_valid_sp(a, c, d) S \| for_each_host(a, - b, c) S \| for_each_host_safe(a, - b, c, d) S \| for_each_mesh_entry(a, - b, c, d) S ) ...+> [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c] [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c] [akpm@linux-foundation.org: checkpatch fixes] [akpm@linux-foundation.org: fix warnings] [akpm@linux-foudnation.org: redo intrusive kvm changes] Tested-by: Peter Senna Tschudin <peter.senna@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:24 -08:00
Linus Torvalds	d895cb1af1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile (part one) from Al Viro: "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent locking violations, etc. The most visible changes here are death of FS_REVAL_DOT (replaced with "has ->d_weak_revalidate()") and a new helper getting from struct file to inode. Some bits of preparation to xattr method interface changes. Misc patches by various people sent this cycle and ocfs2 fixes from several cycles ago that should've been upstream right then. PS: the next vfs pile will be xattr stuff." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits) saner proc_get_inode() calling conventions proc: avoid extra pde_put() in proc_fill_super() fs: change return values from -EACCES to -EPERM fs/exec.c: make bprm_mm_init() static ocfs2/dlm: use GFP_ATOMIC inside a spin_lock ocfs2: fix possible use-after-free with AIO ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero target: writev() on single-element vector is pointless export kernel_write(), convert open-coded instances fs: encode_fh: return FILEID_INVALID if invalid fid_type kill f_vfsmnt vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op nfsd: handle vfs_getattr errors in acl protocol switch vfs_getattr() to struct path default SET_PERSONALITY() in linux/elf.h ceph: prepopulate inodes only when request is aborted d_hash_and_lookup(): export, switch open-coded instances 9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate() 9p: split dropping the acls from v9fs_set_create_acl() ...	2013-02-26 20:16:07 -08:00
Al Viro	496ad9aa8e	new helper: file_inode(file) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-02-22 23:31:31 -05:00
Trond Myklebust	666b3d803a	NLM: Ensure that we resend all pending blocking locks after a reclaim Currently, nlmclnt_lock will break out of the for(;;) loop when the reclaimer wakes up the blocking lock thread by setting nlm_lck_denied_grace_period. This causes the lock request to fail with an ENOLCK error. The intention was always to ensure that we resend the lock request after the grace period has expired. Reported-by: Wangyuan Zhang <Wangyuan.Zhang@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org	2013-02-19 12:18:27 -05:00
Tim Gardner	f25cc71e63	lockd: nlmclnt_reclaim(): avoid stack overflow Even though nlmclnt_reclaim() is only one call into the stack frame, 928 bytes on the stack seems like a lot. Recode to dynamically allocate the request structure once from within the reclaimer task, then pass this pointer into nlmclnt_reclaim() for reuse on subsequent calls. smatch analysis: fs/lockd/clntproc.c:620 nlmclnt_reclaim() warn: 'reqst' puts 928 bytes on stack Also remove redundant assignment of 0 after memset. Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-15 11:29:38 -05:00
Jeff Layton	5976687a2b	sunrpc: move address copy/cmp/convert routines and prototypes from clnt.h to addr.h These routines are used by server and client code, so having them in a separate header would be best. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-02-05 09:41:14 -05:00
Trond Myklebust	262693482c	lockd: Remove BUG_ON()s from fs/lockd/clntproc.c Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-11-04 14:43:40 -05:00
Trond Myklebust	a2d30a54df	lockd: Remove BUG_ON()s in fs/lockd/host.c - Convert the non-trivial ones into WARN_ON_ONCE(). - Remove the trivial refcounting BUGs Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-11-04 14:43:39 -05:00
Trond Myklebust	326ce0a6da	lockd: Remove trivial BUG_ON()s from the NSM code Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-11-04 14:43:39 -05:00
Trond Myklebust	aad56de378	lockd: Remove unnecessary BUG_ON()s in the xdr client code - Offset bound checks are done in the NFS client code. - So are filehandle size checks - The cookie length is a constant - The utsname()->nodename is already bounded Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-11-04 14:43:39 -05:00
Trond Myklebust	e498daa812	LOCKD: Clear ln->nsm_clnt only when ln->nsm_users is zero The current code is clearing it in all cases _except_ when zero. Reported-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org	2012-10-24 10:46:22 -04:00
Trond Myklebust	a4ee8d978e	LOCKD: fix races in nsm_client_get Commit `e9406db20f` (lockd: per-net NSM client creation and destruction helpers introduced) contains a nasty race on initialisation of the per-net NSM client because it doesn't check whether or not the client is set after grabbing the nsm_create_mutex. Reported-by: Nix <nix@esperi.org.uk> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org	2012-10-24 10:46:22 -04:00
Trond Myklebust	cd0b16c1c3	NLM: nlm_lookup_file() may return NLMv4-specific error codes If the filehandle is stale, or open access is denied for some reason, nlm_fopen() may return one of the NLMv4-specific error codes nlm4_stale_fh or nlm4_failed. These get passed right through nlm_lookup_file(), and so when nlmsvc_retrieve_args() calls the latter, it needs to filter the result through the cast_status() machinery. Failure to do so, will trigger the BUG_ON() in encode_nlm_stat... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Reported-by: Larry McVoy <lm@bitmover.com> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-10-17 10:14:14 -04:00
Linus Torvalds	bd81ccea85	Merge branch 'for-3.7' of git://linux-nfs.org/~bfields/linux Pull nfsd update from J Bruce Fields: "Another relatively quiet cycle. There was some progress on my remaining 4.1 todo's, but a couple of them were just of the form "check that we do X correctly", so didn't have much affect on the code. Other than that, a bunch of cleanup and some bugfixes (including an annoying NFSv4.0 state leak and a busy-loop in the server that could cause it to peg the CPU without making progress)." * 'for-3.7' of git://linux-nfs.org/~bfields/linux: (46 commits) UAPI: (Scripted) Disintegrate include/linux/sunrpc UAPI: (Scripted) Disintegrate include/linux/nfsd nfsd4: don't allow reclaims of expired clients nfsd4: remove redundant callback probe nfsd4: expire old client earlier nfsd4: separate session allocation and initialization nfsd4: clean up session allocation nfsd4: minor free_session cleanup nfsd4: new_conn_from_crses should only allocate nfsd4: separate connection allocation and initialization nfsd4: reject bad forechannel attrs earlier nfsd4: enforce per-client sessions/no-sessions distinction nfsd4: set cl_minorversion at create time nfsd4: don't pin clientids to pseudoflavors nfsd4: fix bind_conn_to_session xdr comment nfsd4: cast readlink() bug argument NFSD: pass null terminated buf to kstrtouint() nfsd: remove duplicate init in nfsd4_cb_recall nfsd4: eliminate redundant nfs4_free_stateid fs/nfsd/nfs4idmap.c: adjust inconsistent IS_ERR and PTR_ERR ...	2012-10-13 10:53:54 +09:00
Linus Torvalds	df632d3ce7	NFS client updates for Linux 3.7 Features include: - Remove CONFIG_EXPERIMENTAL dependency from NFSv4.1 Aside from the issues discussed at the LKS, distros are shipping NFSv4.1 with all the trimmings. - Fix fdatasync()/fsync() for the corner case of a server reboot. - NFSv4 OPEN access fix: finally distinguish correctly between open-for-read and open-for-execute permissions in all situations. - Ensure that the TCP socket is closed when we're in CLOSE_WAIT - More idmapper bugfixes - Lots of pNFS bugfixes and cleanups to remove unnecessary state and make the code easier to read. - In cases where a pNFS read or write fails, allow the client to resume trying layoutgets after two minutes of read/write-through-mds. - More net namespace fixes to the NFSv4 callback code. - More net namespace fixes to the NFSv3 locking code. - More NFSv4 migration preparatory patches. Including patches to detect network trunking in both NFSv4 and NFSv4.1 - pNFS block updates to optimise LAYOUTGET calls. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJQdMvBAAoJEGcL54qWCgDyV84P/0XvcEXj6kdMv9EiWfRczo7r iAwAIhiEmG1agtZa6v+Gso2MYRQbkGyJi0LKIwzGqNUi0BLQGQCoV93kB0ITVpiN g7poDTnPyoItW1oJCtC48/Mx0G5C1yrHSwFAJrXmtzDF1mwd/BIQReafYp6x+/TU Mvwm7au3Y2ySRBEDmY4zyBERHXGt//JmsZ9Ays6jewQg5ZOyjDQKoeHVYaaeJoF0 A0tQGcBSNdySagI5dt4SlkuO7AClhzVHlilep2dsBu/TLS0F2pEdHXvM2W0koZmM uazaIpzd2F7TfokTYExgsyKsqpkzpDf1kebN4Y1+Ioi7Yy30dQrX6lNaUNcOmOJQ xx694HDHV90KdRBVSFhOIHMTBRcls68hBcWib3MXWHTKX6HVgnFMwhwxGH0MRezf 3rmXoqn+CO1j5WeQmA3BqdVbHSZHi913TKEwE/qoW4pmOFhv5I2flXWQS/Rwvdng 2xDCe6TlvhMS92IpyvNEIicXLRSm+DUAmoAfSqqlifZIAEM5R29e/wCAsmVprO3B LPHyUoIMO6SZ1PL6Rk20+6qQfvCK7U/ChULsUL/zb7R88Pc3sFE2BeAvZVATsvH3 +FJWTz43fwUBoMhPsn8xSBLn/fq6az5C19syz6Fpu3DZ4X0EwyVWifiFk6HgcxZD J8ajEl+dNZeFE8rkwykX =uBk7 -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.7-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client updates from Trond Myklebust: "Features include: - Remove CONFIG_EXPERIMENTAL dependency from NFSv4.1 Aside from the issues discussed at the LKS, distros are shipping NFSv4.1 with all the trimmings. - Fix fdatasync()/fsync() for the corner case of a server reboot. - NFSv4 OPEN access fix: finally distinguish correctly between open-for-read and open-for-execute permissions in all situations. - Ensure that the TCP socket is closed when we're in CLOSE_WAIT - More idmapper bugfixes - Lots of pNFS bugfixes and cleanups to remove unnecessary state and make the code easier to read. - In cases where a pNFS read or write fails, allow the client to resume trying layoutgets after two minutes of read/write- through-mds. - More net namespace fixes to the NFSv4 callback code. - More net namespace fixes to the NFSv3 locking code. - More NFSv4 migration preparatory patches. Including patches to detect network trunking in both NFSv4 and NFSv4.1 - pNFS block updates to optimise LAYOUTGET calls." * tag 'nfs-for-3.7-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (113 commits) pnfsblock: cleanup nfs4_blkdev_get NFS41: send real read size in layoutget NFS41: send real write size in layoutget NFS: track direct IO left bytes NFSv4.1: Cleanup ugliness in pnfs_layoutgets_blocked() NFSv4.1: Ensure that the layout sequence id stays 'close' to the current NFSv4.1: Deal with seqid wraparound in the pNFS return-on-close code NFSv4 set open access operation call flag in nfs4_init_opendata_res NFSv4.1: Remove the dependency on CONFIG_EXPERIMENTAL NFSv4 reduce attribute requests for open reclaim NFSv4: nfs4_open_done first must check that GETATTR decoded a file type NFSv4.1: Deal with wraparound when updating the layout "barrier" seqid NFSv4.1: Deal with wraparound issues when updating the layout stateid NFSv4.1: Always set the layout stateid if this is the first layoutget NFSv4.1: Fix another refcount issue in pnfs_find_alloc_layout NFSv4: don't put ACCESS in OPEN compound if O_EXCL NFSv4: don't check MAY_WRITE access bit in OPEN NFS: Set key construction data for the legacy upcall NFSv4.1: don't do two EXCHANGE_IDs on mount NFS: nfs41_walk_client_list(): re-lock before iterating ...	2012-10-10 23:52:35 +09:00
J. Bruce Fields	f474af7051	UAPI Disintegration 2012-10-09 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIVAwUAUHPmWxOxKuMESys7AQKN4w//XDwALfbf0MXIw+gwyRiUtJe9mGexvI6X 1R4FWU9a3ImzEZP4cWnmPGT2wmC/x007DcIvx8cyvbdlSuqtR2i/DC+HbWabiLRn nJS7Eer1BJvLv5dn6NmXMEz7yB4Z46+frcmBs3WQeR0sqBMDm+rjQzCqECznO8Jc VtCbox+VR2DuWcM++YECTblYEH3Z+doDXUN2eBaD8L9x3klPbPXD7OcRyOnry8w+ ynmUTKKyH4+hpxDakYrObPIg+vFCxb4QRck1mlgA4wbvb3eqjhM0oOCYJ8GvmILA vdFYztWCjkiuOl5djtXBlsClX8SAMOBYlRed+R1GvjNCSR+WCWrFJJ2F8qoQ1w87 9ts2/8qrozS8luTB475SkT2uLdJkIUKX89Oh+dWeE8YkbPnRPj5lNAdtNY5QSyDq VaRpIo+YfmZygyvHJQlAXBuZ0mvzcPzArfcPgSVTD3B7xTEGVu/45V7SnQX5os/V v39ySPXMdGOIdvK51gw7OtZl64uqrEKu39PyYDX/GUADflp/CHD0J7PJrQePbsH9 AQolVZDIxTfKqYQnUdL8+C8Zc24RowEzz3c2+aO89MSzwGqev3q8sXRVbW/Iqryg p+V3nHe+ipKcga5tOBlPr9KDtDd7j3xN2yaIwf5/QyO1OHBpjAZP1gjSVDcUcwpi svYy4kPn3PA= =etoL -----END PGP SIGNATURE----- nfs: disintegrate UAPI for nfs This is to complete part of the Userspace API (UAPI) disintegration for which the preparatory patches were pulled recently. After these patches, userspace headers will be segregated into: include/uapi/linux/.../foo.h for the userspace interface stuff, and: include/linux/.../foo.h for the strictly kernel internal stuff. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-10-09 18:35:22 -04:00
Stanislav Kinsbursky	cb7323fffa	lockd: create and use per-net NSM RPC clients on MON/UNMON requests NSM RPC client can be required on NFSv3 umount, when child reaper is dying (and destroying it's mount namespace). It means, that current nsproxy is set to NULL already, but creation of RPC client requires UTS namespace for gaining hostname string. This patch creates reference-counted per-net NSM client on first monitor request and destroys it after last unmonitor request. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: <stable@vger.kernel.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-10-01 15:27:43 -07:00
Stanislav Kinsbursky	303a7ce920	lockd: use rpc client's cl_nodename for id encoding Taking hostname from uts namespace if not safe, because this cuold be performind during umount operation on child reaper death. And in this case current->nsproxy is NULL already. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: <stable@vger.kernel.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-10-01 15:27:38 -07:00
Stanislav Kinsbursky	e9406db20f	lockd: per-net NSM client creation and destruction helpers introduced NSM RPC client can be required on NFSv3 umount, when child reaper is dying (and destroying it's mount namespace). It means, that current nsproxy is set to NULL already, but creation of RPC client requires UTS namespace for gaining hostname string. This patch introduces reference counted NFS RPC clients creation and destruction helpers (similar to RPCBIND RPC clients). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: <stable@vger.kernel.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-10-01 15:27:34 -07:00
Al Viro	c5aa1e554a	close the race in nlmsvc_free_block() we need to grab mutex before the reference counter reaches 0 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-22 20:48:20 -04:00
J. Bruce Fields	5b444cc9a4	svcrpc: remove handling of unknown errors from svc_recv svc_recv() returns only -EINTR or -EAGAIN. If we really want to worry about the case where it has a bug that causes it to return something else, we could stick a WARN() in svc_recv. But it's silly to require every caller to have all this boilerplate to handle that case. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-08-21 17:42:00 -04:00
Linus Torvalds	a0e881b7c1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull second vfs pile from Al Viro: "The stuff in there: fsfreeze deadlock fixes by Jan (essentially, the deadlock reproduced by xfstests 068), symlink and hardlink restriction patches, plus assorted cleanups and fixes. Note that another fsfreeze deadlock (emergency thaw one) is not dealt with - the series by Fernando conflicts a lot with Jan's, breaks userland ABI (FIFREEZE semantics gets changed) and trades the deadlock for massive vfsmount leak; this is going to be handled next cycle. There probably will be another pull request, but that stuff won't be in it." Fix up trivial conflicts due to unrelated changes next to each other in drivers/{staging/gdm72xx/usb_boot.c, usb/gadget/storage_common.c} * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (54 commits) delousing target_core_file a bit Documentation: Correct s_umount state for freeze_fs/unfreeze_fs fs: Remove old freezing mechanism ext2: Implement freezing btrfs: Convert to new freezing mechanism nilfs2: Convert to new freezing mechanism ntfs: Convert to new freezing mechanism fuse: Convert to new freezing mechanism gfs2: Convert to new freezing mechanism ocfs2: Convert to new freezing mechanism xfs: Convert to new freezing code ext4: Convert to new freezing mechanism fs: Protect write paths by sb_start_write - sb_end_write fs: Skip atime update on frozen filesystem fs: Add freezing handling to mnt_want_write() / mnt_drop_write() fs: Improve filesystem freezing handling switch the protection of percpu_counter list to spinlock nfsd: Push mnt_want_write() outside of i_mutex btrfs: Push mnt_want_write() outside of i_mutex fat: Push mnt_want_write() outside of i_mutex ...	2012-08-01 10:26:23 -07:00
Al Viro	bf8848918d	lockd: handle lockowner allocation failure in nlmclnt_proc() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-29 23:17:39 +04:00
Al Viro	446945ab9a	lockd: shift grabbing a reference to nlm_host into nlm_alloc_call() It's used both for client and server hosts; we can't do nlmclnt_release_host() on failure exits, since the host might need nlmsvc_release_host(), with BUG_ON() for calling the wrong one. Makes life simpler for callers, actually... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-29 23:09:57 +04:00
Stanislav Kinsbursky	5630f7fa97	Lockd: move grace period management from lockd() to per-net functions Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-27 16:49:22 -04:00
Stanislav Kinsbursky	5ccb0066f2	LockD: pass actual network namespace to grace period management functions Passed network namespace replaced hard-coded init_net Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-27 16:49:22 -04:00
Stanislav Kinsbursky	db9c455341	LockD: manage grace list per network namespace Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-27 16:49:22 -04:00
Stanislav Kinsbursky	9695c7057f	SUNRPC: service request network namespace helper introduced This is a cleanup patch - makes code looks simplier. It replaces widely used rqstp->rq_xprt->xpt_net by introduced SVC_NET(rqstp). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-27 16:49:21 -04:00
Stanislav Kinsbursky	08d44a35a9	LockD: make lockd manager allocated per network namespace Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-27 16:48:44 -04:00
Stanislav Kinsbursky	66547b0251	LockD: manage grace period per network namespace Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-27 16:48:44 -04:00
Stanislav Kinsbursky	e2edaa98cb	Lockd: add more debug to host shutdown functions Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-27 16:48:44 -04:00
Stanislav Kinsbursky	d5850ff9ea	Lockd: host complaining function introduced Just a small cleanup. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-27 16:48:44 -04:00
Stanislav Kinsbursky	caa4e76b6f	LockD: manage used host count per networks namespace This patch introduces moves nrhosts in per-net data. It also adds kernel warning to nlm_shutdown_hosts_net() about remaining hosts in specified network namespace context. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-27 16:48:43 -04:00
Stanislav Kinsbursky	3cf7fb07e0	LockD: manage garbage collection timeout per networks namespace This patch moves next_gc to per-net data. Note: passed network can be NULL (when Lockd kthread is exiting of Lockd module is removing). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-27 16:48:43 -04:00
Stanislav Kinsbursky	27adaddc8d	LockD: make garbage collector network namespace aware. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-27 16:48:43 -04:00
Stanislav Kinsbursky	b26411f85d	LockD: mark host per network namespace on garbage collect This is required for per-network NLM shutdown and cleanup. This patch passes init_net for a while. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-07-27 16:48:43 -04:00
Linus Torvalds	419f431949	Merge branch 'for-3.5' of git://linux-nfs.org/~bfields/linux Pull the rest of the nfsd commits from Bruce Fields: "... and then I cherry-picked the remainder of the patches from the head of my previous branch" This is the rest of the original nfsd branch, rebased without the delegation stuff that I thought really needed to be redone. I don't like rebasing things like this in general, but in this situation this was the lesser of two evils. * 'for-3.5' of git://linux-nfs.org/~bfields/linux: (50 commits) nfsd4: fix, consolidate client_has_state nfsd4: don't remove rebooted client record until confirmation nfsd4: remove some dprintk's and a comment nfsd4: return "real" sequence id in confirmed case nfsd4: fix exchange_id to return confirm flag nfsd4: clarify that renewing expired client is a bug nfsd4: simpler ordering of setclientid_confirm checks nfsd4: setclientid: remove pointless assignment nfsd4: fix error return in non-matching-creds case nfsd4: fix setclientid_confirm same_cred check nfsd4: merge 3 setclientid cases to 2 nfsd4: pull out common code from setclientid cases nfsd4: merge last two setclientid cases nfsd4: setclientid/confirm comment cleanup nfsd4: setclientid remove unnecessary terms from a logical expression nfsd4: move rq_flavor into svc_cred nfsd4: stricter cred comparison for setclientid/exchange_id nfsd4: move principal name into svc_cred nfsd4: allow removing clients not holding state nfsd4: rearrange exchange_id logic to simplify ...	2012-06-01 08:32:58 -07:00
Linus Torvalds	a00b6151a2	Merge branch 'for-3.5-take-2' of git://linux-nfs.org/~bfields/linux Pull nfsd update from Bruce Fields. * 'for-3.5-take-2' of git://linux-nfs.org/~bfields/linux: (23 commits) nfsd: trivial: use SEEK_SET instead of 0 in vfs_llseek SUNRPC: split upcall function to extract reusable parts nfsd: allocate id-to-name and name-to-id caches in per-net operations. nfsd: make name-to-id cache allocated per network namespace context nfsd: make id-to-name cache allocated per network namespace context nfsd: pass network context to idmap init/exit functions nfsd: allocate export and expkey caches in per-net operations. nfsd: make expkey cache allocated per network namespace context nfsd: make export cache allocated per network namespace context nfsd: pass pointer to export cache down to stack wherever possible. nfsd: pass network context to export caches init/shutdown routines Lockd: pass network namespace to creation and destruction routines NFSd: remove hard-coded dereferences to name-to-id and id-to-name caches nfsd: pass pointer to expkey cache down to stack wherever possible. nfsd: use hash table from cache detail in nfsd export seq ops nfsd: pass svc_export_cache pointer as private data to "exports" seq file ops nfsd: use exp_put() for svc_export_cache put nfsd: use cache detail pointer from svc_export structure on cache put nfsd: add link to owner cache detail to svc_export structure nfsd: use passed cache_detail pointer expkey_parse() ...	2012-05-31 18:18:11 -07:00
Stanislav Kinsbursky	8dbf28e495	LockD: add debug message to start and stop functions Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:44 -04:00
Stanislav Kinsbursky	3d1221dfa9	LockD: service start function introduced This is just a code move, which from my POV makes the code look better. I.e. now on start we have 3 different stages: 1) Service creation. 2) Service per-net data allocation. 3) Service start. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:44 -04:00
Stanislav Kinsbursky	7d13ec761a	LockD: move global usage counter manipulation from error path Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:43 -04:00
Stanislav Kinsbursky	2445223909	LockD: service creation function introduced This function creates service if it doesn't exist, or increases usage counter if it does, and returns a pointer to it. The usage counter will be droppepd by svc_destroy() later in lockd_up(). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:42 -04:00
Stanislav Kinsbursky	dbf9b5d74c	LockD: use existing per-net data function on service creation This patch also replaces svc_rpcb_setup() with svc_bind(). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:42 -04:00
Stanislav Kinsbursky	4db77695bf	LockD: pass service to per-net up and down functions Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:41 -04:00
Stanislav Kinsbursky	786185b5f8	SUNRPC: move per-net operations from svc_destroy() The idea is to separate service destruction and per-net operations, because these are two different things and the mix looks ugly. Notes: 1) For NFS server this patch looks ugly (sorry for that). But these place will be rewritten soon during NFSd containerization. 2) LockD per-net counter increase int lockd_up() was moved prior to make_socks() to make lockd_down_net() call safe in case of error. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:40 -04:00
Stanislav Kinsbursky	9793f7c889	SUNRPC: new svc_bind() routine introduced This new routine is responsible for service registration in a specified network context. The idea is to separate service creation from per-net operations. Note also: since registering service with svc_bind() can fail, the service will be destroyed and during destruction it will try to unregister itself from rpcbind. In this case unregistration has to be skipped. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-05-31 20:29:39 -04:00
Al Viro	e847469bf7	lockd: fix the endianness bug comparing be32 values for < is not doing the right thing... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-04-13 13:50:52 -04:00
Stanislav Kinsbursky	e3f70eadb7	Lockd: pass network namespace to creation and destruction routines v2: dereference of most probably already released nlm_host removed in nlmclnt_done() and reclaimer(). These routines are called from locks reclaimer() kernel thread. This thread works in "init_net" network context and currently relays on persence on lockd thread and it's per-net resources. Thus lockd_up() and lockd_down() can't relay on current network context. So let's pass corrent one into them. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:55:06 -04:00
J. Bruce Fields	1df00640c9	Merge nfs containerization work from Trond's tree The nfs containerization work is a prerequisite for Jeff Layton's reboot recovery rework.	2012-03-26 11:48:54 -04:00
Trond Myklebust	ffa94db604	SUNRPC/LOCKD: Fix build warnings when CONFIG_SUNRPC_DEBUG is undefined Stephen Rothwell reports: net/sunrpc/rpcb_clnt.c: In function 'rpcb_enc_mapping': net/sunrpc/rpcb_clnt.c:820:19: warning: unused variable 'task' [-Wunused-variable] net/sunrpc/rpcb_clnt.c: In function 'rpcb_dec_getport': net/sunrpc/rpcb_clnt.c:837:19: warning: unused variable 'task' [-Wunused-variable] net/sunrpc/rpcb_clnt.c: In function 'rpcb_dec_set': net/sunrpc/rpcb_clnt.c:860:19: warning: unused variable 'task' [-Wunused-variable] net/sunrpc/rpcb_clnt.c: In function 'rpcb_enc_getaddr': net/sunrpc/rpcb_clnt.c:892:19: warning: unused variable 'task' [-Wunused-variable] net/sunrpc/rpcb_clnt.c: In function 'rpcb_dec_getaddr': net/sunrpc/rpcb_clnt.c:914:19: warning: unused variable 'task' [-Wunused-variable] fs/lockd/svclock.c:49:20: warning: 'nlmdbg_cookie2a' declared 'static' but never defined [-Wunused-function] Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-03-21 09:31:44 -04:00
NeilBrown	de5b8e8e04	lockd: fix arg parsing for grace_period and timeout. If you try to set grace_period or timeout via a module parameter to lockd, and do this on a big-endian machine where sizeof(int) != sizeof(unsigned long) it won't work. This number given will be effectively shifted right by the difference in those two sizes. So cast kp->arg properly to get correct result. Cc: stable@kernel.org Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-02-17 18:38:51 -05:00
Stanislav Kinsbursky	3b64739fb9	Lockd: shutdown NLM hosts in network namespace context Lockd now managed in network namespace context. And this patch introduces network namespace related NLM hosts shutdown in case of releasing per-net Lockd resources. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-02-15 00:19:48 -05:00
Stanislav Kinsbursky	0e1cb5c0aa	LockD: make NSM network namespace aware NLM host is network namespace aware now. So NSM have to take it into account. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-02-15 00:19:48 -05:00
Stanislav Kinsbursky	66697bfd6a	LockD: make nlm hosts network namespace aware This object depends on RPC client, and thus on network namespace. So let's make it's allocation and lookup in network namespace context. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-02-15 00:19:48 -05:00
Stanislav Kinsbursky	bb2224df5f	Lockd: per-net up and down routines introduced This patch introduces per-net Lockd initialization and destruction routines. The logic is the same as in global Lockd up and down routines. Probably the solution is not the best one. But at least it looks clear. So per-net "up" routine are called only in case of lockd is running already. If per-net resources are not allocated yet, then service is being registered with local portmapper and lockd sockets created. Per-net "down" routine is called on every lockd_down() call in case of global users counter is not zero. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-02-15 00:19:47 -05:00
Stanislav Kinsbursky	a9c5d73a8d	Lockd: pernet usage counter introduced Lockd is going to be shared between network namespaces - i.e. going to be able to handle lock requests from different network namespaces. This means, that network namespace related resources have to be allocated not once (like now), but for every network namespace context, from which service is requested to operate. This patch implements Lockd per-net users accounting. New per-net counter is used to determine, when per-net resources have to be freed. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-02-15 00:19:47 -05:00
Stanislav Kinsbursky	c228fa2038	Lockd: create permanent lockd sockets in current network namespace This patch parametrizes Lockd permanent sockets creation routine by network namespace context. It also replaces hard-coded init_net with current network namespace context in Lockd sockets creation routines. This approach looks safe, because Lockd is created during NFS mount (or NFS server start) and thus socket is required exactly in current network namespace context. But in the same time it means, that Lockd sockets inherits first Lockd requester network namespace. This issue will be fixed in further patches of the series. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-02-15 00:19:46 -05:00
Trond Myklebust	a613fa168a	SUNRPC: constify the rpc_program Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-01-31 19:28:20 -05:00
Stanislav Kinsbursky	4cb54ca206	SUNRPC: search for service transports in network namespace context Service transports are parametrized by network namespace. And thus lookup of transport instance have to take network namespace into account. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: J. Bruce Fields <bfields@redhat.com>	2012-01-31 19:28:19 -05:00
Rusty Russell	90ab5ee941	module_param: make bool parameters really bool (drivers & misc) module_param(bool) used to counter-intuitively take an int. In `fddd5201` (mid-2009) we allowed bool or int/unsigned int using a messy trick. It's time to remove the int/unsigned int option. For this version it'll simply give a warning, but it'll break next kernel version. Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-01-13 09:32:20 +10:30
Al Viro	d8c9584ea2	vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-06 23:16:53 -05:00
Mi Jinlong	849a1cf13d	SUNRPC: Replace svc_addr_u by sockaddr_storage For IPv6 local address, lockd can not callback to client for missing scope id when binding address at inet6_bind: 324 if (addr_type & IPV6_ADDR_LINKLOCAL) { 325 if (addr_len >= sizeof(struct sockaddr_in6) && 326 addr->sin6_scope_id) { 327 /* Override any existing binding, if another one 328 * is supplied by user. 329 / 330 sk->sk_bound_dev_if = addr->sin6_scope_id; 331 } 332 333 / Binding to link-local address requires an interface */ 334 if (!sk->sk_bound_dev_if) { 335 err = -EINVAL; 336 goto out_unlock; 337 } Replacing svc_addr_u by sockaddr_storage, let rqstp->rq_daddr contains more info besides address. Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-14 08:21:48 -04:00
Eric Dumazet	11fd165c68	sunrpc: use better NUMA affinities Use NUMA aware allocations to reduce latencies and increase throughput. sunrpc kthreads can use kthread_create_on_node() if pool_mode is "percpu" or "pernode", and svc_prepare_thread()/svc_init_buffer() can also take into account NUMA node affinity for memory allocations. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: "J. Bruce Fields" <bfields@fieldses.org> CC: Neil Brown <neilb@suse.de> CC: David Miller <davem@davemloft.net> Reviewed-by: Greg Banks <gnb@fastmail.fm> [bfields@redhat.com: fix up caller nfs41_callback_up] Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-08-19 13:25:36 -04:00
Linus Torvalds	28890d3598	Merge branch 'nfs-for-3.1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs * 'nfs-for-3.1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (44 commits) NFSv4: Don't use the delegation->inode in nfs_mark_return_delegation() nfs: don't use d_move in nfs_async_rename_done RDMA: Increasing RPCRDMA_MAX_DATA_SEGS SUNRPC: Replace xprt->resend and xprt->sending with a priority queue SUNRPC: Allow caller of rpc_sleep_on() to select priority levels SUNRPC: Support dynamic slot allocation for TCP connections SUNRPC: Clean up the slot table allocation SUNRPC: Initalise the struct xprt upon allocation SUNRPC: Ensure that we grab the XPRT_LOCK before calling xprt_alloc_slot pnfs: simplify pnfs files module autoloading nfs: document nfsv4 sillyrename issues NFS: Convert nfs4_set_ds_client to EXPORT_SYMBOL_GPL SUNRPC: Convert the backchannel exports to EXPORT_SYMBOL_GPL SUNRPC: sunrpc should not explicitly depend on NFS config options NFS: Clean up - simplify the switch to read/write-through-MDS NFS: Move the pnfs write code into pnfs.c NFS: Move the pnfs read code into pnfs.c NFS: Allow the nfs_pageio_descriptor to signal that a re-coalesce is needed NFS: Use the nfs_pageio_descriptor->pg_bsize in the read/write request NFS: Cache rpc_ops in struct nfs_pageio_descriptor ...	2011-07-27 13:23:02 -07:00
J. Bruce Fields	8fb47a4fbf	locks: rename lock-manager ops Both the filesystem and the lock manager can associate operations with a lock. Confusingly, one of them (fl_release_private) actually has the same name in both operation structures. It would save some confusion to give the lock-manager ops different names. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-07-20 20:23:19 -04:00
Vasily Averin	82c2c8b861	lockd: properly convert be32 values in debug messages lockd: server returns status 50331648 it's quite hard to understand that number in this message is 3 in big endian Signed-off-by: Vasily Averin <vvs@sw.ru> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-07-12 13:40:26 -04:00
Trond Myklebust	0b760113a3	NLM: Don't hang forever on NLM unlock requests If the NLM daemon is killed on the NFS server, we can currently end up hanging forever on an 'unlock' request, instead of aborting. Basically, if the rpcbind request fails, or the server keeps returning garbage, we really want to quit instead of retrying. Tested-by: Vasily Averin <vvs@sw.ru> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org	2011-06-15 11:24:27 -04:00
Chuck Lever	80c30e8de4	NLM: Fix "kernel BUG at fs/lockd/host.c:417!" or ".../host.c:283!" Nick Bowler <nbowler@elliptictech.com> reports: > We were just having some NFS server troubles, and my client machine > running 2.6.38-rc1+ (specifically, commit `2b1caf6ed7`) crashed > hard (syslog output appended to this mail). > > I'm not sure what the exact timeline was or how to reproduce this, > but the server was rebooted during all this. Since I've never seen > this happen before, it is possibly a regression from previous kernel > releases. However, I recently updated my nfs-utils (on the client) to > version 1.2.3, so that might be related as well. [ BUG output redacted ] When done searching, the for_each_host loop in next_host_state() falls through and returns the final host on the host chain without bumping it's reference count. Since the host's ref count is only one at that point, releasing the host in nlm_host_rebooted() attempts to destroy the host prematurely, and therefore hits a BUG(). Likely, the original intent of the for_each_host behavior in next_host_state() was to handle the case when the host chain is empty. Searching the chain and finding no suitable host to return needs to be handled as well. Defensively restructure next_host_state() always to return NULL when the loop falls through. Introduced by commit `b10e30f6` "lockd: reorganize nlm_host_rebooted". Cc: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-01-25 15:24:47 -05:00
Dan Carpenter	51f128ea1c	lockd: double unlock in next_host_state() We unlock again after we goto out. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-01-04 13:10:37 -05:00
Chuck Lever	7969183660	lockd: Remove src_sap and src_len from nlm_lookup_host_info struct Clean up. The contents of the src_sap field is not used in nlm_alloc_host(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:27 -05:00
Chuck Lever	2025889828	lockd: Remove nlm_lookup_host() Clean up. Remove the now unused helper nlm_lookup_host(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:27 -05:00
Chuck Lever	fcc072c783	lockd: Make nrhosts an unsigned long Clean up. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:27 -05:00
Chuck Lever	d2df0484bb	lockd: Rename nlm_hosts Clean up. nlm_hosts now contains only server-side entries. Rename it to match convention of client side cache. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:27 -05:00
Chuck Lever	67216b94d4	lockd: Clean up nlmsvc_lookup_host() Clean up. Change nlmsvc_lookup_host() to be purpose-built for server-side nlm_host management. This replaces the generic nlm_lookup_host() helper function, just like on the client side. The lookup logic is specialized for server host lookups. The server side cache also gets its own specialized equivalent of the nlm_release_host() function. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:26 -05:00
Chuck Lever	8ea6ecc8b0	lockd: Create client-side nlm_host cache NFS clients don't need the garbage collection processing that is performed on nlm_host structures. The client picks up an nlm_host at mount time and holds a reference to it until the file system is unmounted. Servers, on the other hand, don't have a precise way to tell when an nlm_host is no longer being used, so zero refcount nlm_host entries are left to expire in the cache after a time. Basically there's nothing holding a reference to an nlm_host between individual server-side NLM requests, but we can't afford the expense of recreating them for every new NLM request from a client. The nlm_host cache adds some lifetime hysteresis to entries in the cache so the next time a particular nlm_host is needed, it's likely to be discovered by a lookup rather than created from whole cloth. With the new implementation, client nlm_host cache items are no longer garbage collected, and are destroyed directly by a new release function specialized for client entries, nlmclnt_release_host(). They are cached in their own data structure, and have their own lookup logic, simplified and specialized for client nlm_host entries. However, the client nlm_host cache still shares reboot recovery logic with the server nlm_host cache. The NSM "peer rebooted" downcall for clients and servers still come through the same RPC call. This is a legacy formal API that would be difficult to alter, and besides, the user space NSM implementation can't tell the difference between peers that are clients or servers. For this reason, the client cache continues to share the nlm_host_mutex (and reboot recovery logic) with the server cache. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:26 -05:00
Chuck Lever	7db836d4a4	lockd: Split nlm_release_call() The nlm_release_call() function is invoked from both the server and the client side. We're about to introduce a distinct server- and client-side nlm_release_host(), so nlm_release_call() must first be split into a client-side and a server-side version. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:26 -05:00
Chuck Lever	723bb5b505	lockd: Add nlm_destroy_host_locked() Refactor the tail of nlm_gc_hosts() into nlm_destroy_host() so that this logic can be used separately from garbage collection. Rename it _locked() to document that it must be called with the hosts cache mutex held. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:26 -05:00
Chuck Lever	a7952f4056	lockd: Add nlm_alloc_host() Refactor nlm_host allocation and initialization into a separate function. This will be the common piece of server and client nlm_host lookup logic after the nlm_host cache is split. Small change: use kmalloc() instead of kzalloc(), as we're overwriting almost all fields in the new nlm_host struct with non-zero values immediately after it is allocated. An added benefit is we now have an explicit reference to each field name where it is initialized (for all you cscope fans out there). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:26 -05:00
J. Bruce Fields	b10e30f655	lockd: reorganize nlm_host_rebooted Minor reorganization; no change in behavior. This will save some duplicated code after we split the client and server host caches. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> [ cel: Forward-ported to 2.6.37 ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:26 -05:00
J. Bruce Fields	b113746888	lockd: define host_for_each{_safe} macros We've got a lot of loops like this, and I find them a little easier to read with the macros. More such loops are coming. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> [ cel: Forward-ported to 2.6.37 ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:26 -05:00
Chuck Lever	bf2695516d	SUNRPC: New xdr_streams XDR decoder API Now that all client-side XDR decoder routines use xdr_streams, there should be no need to support the legacy calling sequence [rpc_rqst , __be32 , RPC res *] anywhere. We can construct an xdr_stream in the generic RPC code, instead of in each decoder function. This is a refactoring change. It should not cause different behavior. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:25 -05:00
Chuck Lever	9f06c719f4	SUNRPC: New xdr_streams XDR encoder API Now that all client-side XDR encoder routines use xdr_streams, there should be no need to support the legacy calling sequence [rpc_rqst , __be32 , RPC arg *] anywhere. We can construct an xdr_stream in the generic RPC code, instead of in each encoder function. Also, all the client-side encoder functions return 0 now, making a return value superfluous. Take this opportunity to convert them to return void instead. This is a refactoring change. It should not cause different behavior. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:25 -05:00
Chuck Lever	49b170047f	NSM: Avoid return code checking in NSM XDR encoder functions Clean up. The trend in the other XDR encoder functions is to BUG() when encoding problems occur, since a problem here is always due to a local coding error. Then, instead of a status, zero is unconditionally returned. Update the NSM XDR encoders to behave this way. To finish the update, use the new-style be32_to_cpup() and cpu_to_be32() macros, and compute the buffer sizes using raw integers instead of sizeof(). This matches the conventions used in other XDR functions Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:24 -05:00
Chuck Lever	d8367c504e	lockd: Move nlmdbg_cookie2a() to svclock.c Clean up. nlmdbg_cookie2a() is used only in svclock.c. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:24 -05:00
Chuck Lever	3460f29a27	lockd: Introduce new-style XDR functions for NLMv4 We'd like to prevent local buffer overflows caused by malicious or broken servers. New xdr_stream style decoders can do that. For efficiency, we also want to be able to pass xdr_streams from call_encode() to all XDR encoding functions, rather than building an xdr_stream in every XDR encoding function in the kernel. Same idea as the NLM v3 XDR overhaul. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:23 -05:00
Chuck Lever	2b061f9ef2	lockd: Introduce new-style XDR functions for NLMv3 We'd like to prevent local buffer overflows caused by malicious or broken servers. New xdr_stream style decoders can do that. For efficiency, we also eventually want to be able to pass xdr_streams from call_encode() and call_decode() to all XDR encoding functions, rather than building an xdr_stream in every XDR encoding and decoding function in the kernel. To do all of this, rewrite the XDR encoding and decoding functions in fs/lockd/xdr.c to use xdr_streams. This makes them more or less incompatible with server-side XDR helper functions, so break them out into a separate source file. Static helper functions are left without the "inline" directive. This allows the compiler to choose automatically how to optimize these for size or speed. SHARE-related functionality doesn't seem to be used, as those functions are hiding behind a #define that isn't set anywhere that I can find. And, they've been in there forever (at least as far back as the kernel's git history goes), yet remain unused. Let's take the opportunity to bin them. It should be easy enough for someone to introduce proper XDR functions if at some point SHARE-related NLM functionality is desired. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:21 -05:00
Arnd Bergmann	451a3c24b0	BKL: remove extraneous #include <smp_lock.h> The big kernel lock has been removed from all these files at some point, leaving only the #include. Remove this too as a cleanup. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-11-17 08:59:32 -08:00
Trond Myklebust	8e35f8e7c6	NLM: Fix a regression in lockd Nick Bowler reports: There are no unusual messages on the client... but I just logged into the server and I see lots of messages of the following form: nfsd: request from insecure port (192.168.8.199:35766)! nfsd: request from insecure port (192.168.8.199:35766)! nfsd: request from insecure port (192.168.8.199:35766)! nfsd: request from insecure port (192.168.8.199:35766)! nfsd: request from insecure port (192.168.8.199:35766)! Bisected to commit `9247685088` (SUNRPC: Properly initialize sock_xprt.srcaddr in all cases) Apparently, removing the 'transport->srcaddr.ss_family = family' from xs_create_sock() triggers this due to nlmclnt_lookup_host() incorrectly initialising the srcaddr family to AF_UNSPEC. Reported-by: Nick Bowler <nbowler@elliptictech.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-11-15 20:44:26 -05:00
J. Bruce Fields	a282a1fa6b	lockd: fix nlmsvc_notify_blocked locking nlmsvc_notify_blocked walks the nlm_blocked list, which requires nlm_blocked_lock. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2010-10-27 21:39:50 +02:00
Arnd Bergmann	763641d812	lockd: push lock_flocks down lockd should use lock_flocks() instead of lock_kernel() to lock against posix locks accessing the i_flock list. This is a prerequisite to turning lock_flocks into a spinlock. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: J. Bruce Fields <bfields@redhat.com>	2010-10-27 21:39:39 +02:00
Linus Torvalds	4390110fef	Merge branch 'for-2.6.37' of git://linux-nfs.org/~bfields/linux * 'for-2.6.37' of git://linux-nfs.org/~bfields/linux: (99 commits) svcrpc: svc_tcp_sendto XPT_DEAD check is redundant svcrpc: no need for XPT_DEAD check in svc_xprt_enqueue svcrpc: assume svc_delete_xprt() called only once svcrpc: never clear XPT_BUSY on dead xprt nfsd4: fix connection allocation in sequence() nfsd4: only require krb5 principal for NFSv4.0 callbacks nfsd4: move minorversion to client nfsd4: delay session removal till free_client nfsd4: separate callback change and callback probe nfsd4: callback program number is per-session nfsd4: track backchannel connections nfsd4: confirm only on succesful create_session nfsd4: make backchannel sequence number per-session nfsd4: use client pointer to backchannel session nfsd4: move callback setup into session init code nfsd4: don't cache seq_misordered replies SUNRPC: Properly initialize sock_xprt.srcaddr in all cases SUNRPC: Use conventional switch statement when reclassifying sockets sunrpc/xprtrdma: clean up workqueue usage sunrpc: Turn list_for_each-s into the ..._entry-s ... Fix up trivial conflicts (two different deprecation notices added in separate branches) in Documentation/feature-removal-schedule.txt	2010-10-26 09:55:25 -07:00
Pavel Emelyanov	c653ce3f0a	sunrpc: Add net to rpc_create_args Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-01 17:18:56 -04:00
Pavel Emelyanov	fc5d00b04a	sunrpc: Add net argument to svc_create_xprt Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-01 17:18:54 -04:00
Bryan Schumaker	f904be9cc7	lockd: Mostly remove BKL from the server This patch removes all but one call to lock_kernel() from the server. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-09-22 15:32:58 -04:00
Bryan Schumaker	63185942c5	lockd: Remove BKL from the client This patch removes all calls to lock_kernel() from the client. This patch should be applied after the "fs/lock.c prepare for BKL removal" patch submitted by Arnd Bergmann on September 18. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-09-22 09:50:35 -04:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Jeff Layton	7e469af97e	lockd: don't clear sm_monitored on nsm_reboot_lookup When lockd gets a notify downcall from statd, it'll search its hosts cache and then clear the sm_monitored bit on the host it finds. The idea is apparently to make lockd redo a SM_MON on the next lock request. This is unnecessary and causes the kernel's NSM cache to go out of sync with statd. statd doesn't stop monitoring a host when it gets a SM_NOTIFY and there's no guarantee that another lock will occur after the reclaim and before the unmount. In that event, no SM_UNMON will occur. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-02-08 16:20:35 -05:00
Jeff Layton	cdd30fa166	lockd: release reference to nsm_handle in nlm_host_rebooted nsm_reboot_lookup takes a reference to the nsm_handle that it returns, but nlm_host_rebooted never releases that reference. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-02-08 16:20:35 -05:00
Chuck Lever	d6783b2b6c	SUNRPC: Bury "#ifdef IPV6" in svc_create_xprt() Clean up: Bruce observed we have more or less common logic in each of svc_create_xprt()'s callers: the check to create an IPv6 RPC listener socket only if CONFIG_IPV6 is set. I'm about to add another case that does just the same. If we move the ifdefs into __svc_xpo_create(), then svc_create_xprt() call sites can get rid of the "#ifdef" ugliness, and can use the same logic with or without IPv6 support available in the kernel. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-26 17:56:43 -05:00
Linus Torvalds	37c24b37fb	Merge branch 'for-2.6.33' of git://linux-nfs.org/~bfields/linux * 'for-2.6.33' of git://linux-nfs.org/~bfields/linux: (42 commits) nfsd: remove pointless paths in file headers nfsd: move most of nfsfh.h to fs/nfsd nfsd: remove unused field rq_reffh nfsd: enable V4ROOT exports nfsd: make V4ROOT exports read-only nfsd: restrict filehandles accepted in V4ROOT case nfsd: allow exports of symlinks nfsd: filter readdir results in V4ROOT case nfsd: filter lookup results in V4ROOT case nfsd4: don't continue "under" mounts in V4ROOT case nfsd: introduce export flag for v4 pseudoroot nfsd: let "insecure" flag vary by pseudoflavor nfsd: new interface to advertise export features nfsd: Move private headers to source directory vfs: nfsctl.c un-used nfsd #includes lockd: Remove un-used nfsd headers #includes s390: remove un-used nfsd #includes sparc: remove un-used nfsd #includes parsic: remove un-used nfsd #includes compat.c: Remove dependence on nfsd private headers ...	2009-12-16 10:43:34 -08:00
Boaz Harrosh	0296f55f98	lockd: Remove un-used nfsd headers #includes In what history where these ever needed? Well not any more. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-14 18:12:11 -05:00
Eric W. Biederman	6d4561110a	sysctl: Drop & in front of every proc_handler. For consistency drop & in front of every proc_handler. Explicity taking the address is unnecessary and it prevents optimizations like stubbing the proc_handlers to NULL. Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joe Perches <joe@perches.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2009-11-18 08:37:40 -08:00
Eric W. Biederman	ab09203e30	sysctl fs: Remove dead binary sysctl support Now that sys_sysctl is a generic wrapper around /proc/sys .ctl_name and .strategy members of sysctl tables are dead code. Remove them. Cc: Jan Harkes <jaharkes@cs.cmu.edu> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2009-11-12 02:04:55 -08:00
Alexey Dobriyan	2bcd57ab61	headers: utsname.h redux * remove asm/atomic.h inclusion from linux/utsname.h -- not needed after kref conversion * remove linux/utsname.h inclusion from files which do not need it NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however due to some personality stuff it _is_ needed -- cowardly leave ELF-related headers and files alone. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-23 18:13:10 -07:00
Linus Torvalds	a87e84b5cd	Merge branch 'for-2.6.32' of git://linux-nfs.org/~bfields/linux * 'for-2.6.32' of git://linux-nfs.org/~bfields/linux: (68 commits) nfsd4: nfsv4 clients should cross mountpoints nfsd: revise 4.1 status documentation sunrpc/cache: avoid variable over-loading in cache_defer_req sunrpc/cache: use list_del_init for the list_head entries in cache_deferred_req nfsd: return success for non-NFS4 nfs4_state_start nfsd41: Refactor create_client() nfsd41: modify nfsd4.1 backchannel to use new xprt class nfsd41: Backchannel: Implement cb_recall over NFSv4.1 nfsd41: Backchannel: cb_sequence callback nfsd41: Backchannel: Setup sequence information nfsd41: Backchannel: Server backchannel RPC wait queue nfsd41: Backchannel: Add sequence arguments to callback RPC arguments nfsd41: Backchannel: callback infrastructure nfsd4: use common rpc_cred for all callbacks nfsd4: allow nfs4 state startup to fail SUNRPC: Defer the auth_gss upcall when the RPC call is asynchronous nfsd4: fix null dereference creating nfsv4 callback client nfsd4: fix whitespace in NFSPROC4_CLNT_CB_NULL definition nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel sunrpc/cache: simplify cache_fresh_locked and cache_fresh_unlocked. ...	2009-09-22 07:54:33 -07:00
Alexey Dobriyan	7b021967c5	const: make lock_manager_operations const Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-22 07:17:25 -07:00
Alexey Dobriyan	6aed62853c	const: make file_lock_operations const Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-22 07:17:25 -07:00
Jeff Layton	4516fc0454	sunrpc: add routine for comparing addresses lockd needs these sort of routines, as does the NFSv4 callback code. Move lockd's routines into common code and rename them so that they can be used by others. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-21 11:27:42 -04:00
Chuck Lever	c15128c5e4	lockd: Replace nsm_display_address() with rpc_ntop() Clean up. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:39 -04:00
Chuck Lever	b97a56747e	lockd: Replace nlm_clear_port() Clean up: Use shared rpc_set_port() function instead of nlm_clear_port(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:38 -04:00
Alexey Dobriyan	405f55712d	headers: smp_lock.h redux * Remove smp_lock.h from files which don't need it (including some headers!) * Add smp_lock.h to files which do need it * Make smp_lock.h include conditional in hardirq.h It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT This will make hardirq.h inclusion cheaper for every PREEMPT=n config (which includes allmodconfig/allyesconfig, BTW) Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-07-12 12:22:34 -07:00
Linus Torvalds	7e0338c0de	Merge branch 'for-2.6.31' of git://fieldses.org/git/linux-nfsd * 'for-2.6.31' of git://fieldses.org/git/linux-nfsd: (60 commits) SUNRPC: Fix the TCP server's send buffer accounting nfsd41: Backchannel: minorversion support for the back channel nfsd41: Backchannel: cleanup nfs4.0 callback encode routines nfsd41: Remove ip address collision detection case nfsd: optimise the starting of zero threads when none are running. nfsd: don't take nfsd_mutex twice when setting number of threads. nfsd41: sanity check client drc maxreqs nfsd41: move channel attributes from nfsd4_session to a nfsd4_channel_attr struct NFS: kill off complicated macro 'PROC' sunrpc: potential memory leak in function rdma_read_xdr nfsd: minor nfsd_vfs_write cleanup nfsd: Pull write-gathering code out of nfsd_vfs_write nfsd: track last inode only in use_wgather case sunrpc: align cache_clean work's timer nfsd: Use write gathering only with NFSv2 NFSv4: kill off complicated macro 'PROC' NFSv4: do exact check about attribute specified knfsd: remove unreported filehandle stats counters knfsd: fix reply cache memory corruption knfsd: reply cache cleanups ...	2009-06-22 12:55:50 -07:00
Chuck Lever	0e5c2632e1	lockd: Don't bother with RPC ping for NSM upcalls Cut NSM upcall RPC traffic in half -- don't do a NULL call first. The cases where a ping would be helpful are rare. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-06-17 18:02:11 -07:00
Chuck Lever	6c9dc42551	lockd: Update NSM state from SM_MON replies When rpc.statd starts up in user space at boot time, it attempts to write the latest NSM local state number into /proc/sys/fs/nfs/nsm_local_state. If lockd.ko isn't loaded yet (as is the case in most configurations), that file doesn't exist, thus the kernel's NSM state remains set to its initial value of zero during lockd operation. This is a problem because rpc.statd and lockd use the NSM state number to prevent repeated lock recovery on rebooted hosts. If lockd sends a zero NSM state, but then a delayed SM_NOTIFY with a real NSM state number is received, there is no way for lockd or rpc.statd to distinguish that stale SM_NOTIFY from an actual reboot. Thus lock recovery could be performed after the rebooted host has already started reclaiming locks, and those locks will be lost. We could change /etc/init.d/nfslock so it always modprobes lockd.ko before starting rpc.statd. However, if lockd.ko is ever unloaded and reloaded, we are back at square one, since the NSM state is not preserved across an unload/reload cycle. This may happen frequently on clients that use automounter. A period of NFS inactivity causes lockd.ko to be unloaded, and the kernel loses its NSM state setting. Instead, let's use the fact that rpc.statd plants the local system's NSM state in every SM_MON (and SM_UNMON) reply. lockd performs a synchronous SM_MON upcall to the local rpc.statd _before_ sending its first NLM request to a new remote. This would permit rpc.statd to provide the current NSM state to lockd, even after lockd.ko had been unloaded and reloaded. Note that NLMPROC_LOCK arguments are constructed before the nsm_monitor() call, so we have to rearrange argument construction very slightly to make this all work out. And, the kernel appears to treat NSM state as a u32 (see struct nlm_args and nsm_res). Make nsm_local_state a u32 as well, to ensure we don't get bogus comparison results. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-06-17 18:02:10 -07:00
Trond Myklebust	5cd973c44a	NFSv4/NLM: Push file locking BKL dependencies down into the NLM layer Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-06-17 13:23:01 -07:00
J. Bruce Fields	7eef4091a6	Merge commit 'v2.6.30' into for-2.6.31	2009-06-15 18:08:07 -07:00
J. Bruce Fields	89996df4b5	lockd: fix list corruption on lockd restart If lockd is signalled soon enough after restart then locks_start_grace() will try to re-add an entry to a list and trigger a lock corruption warning. Thanks to Wang Chen for the problem report and diagnosis. WARNING: at lib/list_debug.c:26 __list_add+0x27/0x5c() ... list_add corruption. next->prev should be prev (ef8fe958), but was ef8ff128. (next=ef8ff128). ... Pid: 23062, comm: lockd Tainted: G W 2.6.30-rc2 #3 Call Trace: [<c042d5b5>] warn_slowpath+0x71/0xa0 [<c0422a96>] ? update_curr+0x11d/0x125 [<c044b12d>] ? trace_hardirqs_on_caller+0x18/0x150 [<c044b270>] ? trace_hardirqs_on+0xb/0xd [<c051c61a>] ? _raw_spin_lock+0x53/0xfa [<c051c89f>] __list_add+0x27/0x5c [<ef8f6daa>] locks_start_grace+0x22/0x30 [lockd] [<ef8f34da>] set_grace_period+0x39/0x53 [lockd] [<c06b8921>] ? lock_kernel+0x1c/0x28 [<ef8f3558>] lockd+0x64/0x164 [lockd] [<c044b12d>] ? trace_hardirqs_on_caller+0x18/0x150 [<c04227b0>] ? complete+0x34/0x3e [<ef8f34f4>] ? lockd+0x0/0x164 [lockd] [<ef8f34f4>] ? lockd+0x0/0x164 [lockd] [<c043dd42>] kthread+0x45/0x6b [<c043dcfd>] ? kthread+0x0/0x6b [<c0403c23>] kernel_thread_helper+0x7/0x10 Reported-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: stable@kernel.org	2009-05-06 17:19:36 -04:00
Felix Blyakher	a9e61e25f9	lockd: call locks_release_private to cleanup per-filesystem state For every lock request lockd creates a new file_lock object in nlmsvc_setgrantargs() by copying the passed in file_lock with locks_copy_lock(). A filesystem can attach it's own lock_operations vector to the file_lock. It has to be cleaned up at the end of the file_lock's life. However, lockd doesn't do it today, yet it asserts in nlmclnt_release_lockargs() that the per-filesystem state is clean. This patch fixes it by exporting locks_release_private() and adding it to nlmsvc_freegrantargs(), to be symmetrical to creating a file_lock in nlmsvc_setgrantargs(). Signed-off-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-24 16:36:03 -04:00
Linus Torvalds	a63856252d	Merge branch 'for-2.6.30' of git://linux-nfs.org/~bfields/linux * 'for-2.6.30' of git://linux-nfs.org/~bfields/linux: (81 commits) nfsd41: define nfsd4_set_statp as noop for !CONFIG_NFSD_V4 nfsd41: define NFSD_DRC_SIZE_SHIFT in set_max_drc nfsd41: Documentation/filesystems/nfs41-server.txt nfsd41: CREATE_EXCLUSIVE4_1 nfsd41: SUPPATTR_EXCLCREAT attribute nfsd41: support for 3-word long attribute bitmask nfsd: dynamically skip encoded fattr bitmap in _nfsd4_verify nfsd41: pass writable attrs mask to nfsd4_decode_fattr nfsd41: provide support for minor version 1 at rpc level nfsd41: control nfsv4.1 svc via /proc/fs/nfsd/versions nfsd41: add OPEN4_SHARE_ACCESS_WANT nfs4_stateid bmap nfsd41: access_valid nfsd41: clientid handling nfsd41: check encode size for sessions maxresponse cached nfsd41: stateid handling nfsd: pass nfsd4_compound_state* to nfs4_preprocess_{state,seq}id_op nfsd41: destroy_session operation nfsd41: non-page DRC for solo sequence responses nfsd41: Add a create session replay cache nfsd41: create_session operation ...	2009-04-06 13:25:56 -07:00
Mans Rullgard	ad5b365c12	NSM: Fix unaligned accesses in nsm_init_private() This fixes unaligned accesses in nsm_init_private() when creating nlm_reboot keys. Signed-off-by: Mans Rullgard <mans@mansr.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-04-01 13:24:14 -04:00
Chuck Lever	eb16e90778	lockd: Start PF_INET6 listener only if IPv6 support is available Apparently a lot of people need to disable IPv6 completely on their distributor-built systems, which have CONFIG_IPV6_MODULE enabled at build time. They do this by blacklisting the ipv6.ko module. This causes the creation of the lockd service listener to fail if CONFIG_IPV6_MODULE is set, but the module cannot be loaded. Now that the kernel's PF_INET6 RPC listeners are completely separate from PF_INET listeners, we can always start PF_INET. Then lockd can try to start PF_INET6, but it isn't required to be available. Note this has the added benefit that NLM callbacks from AF_INET6 servers will never come from AF_INET remotes. We no longer have to worry about matching mapped IPv4 addresses to AF_INET when comparing addresses. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-28 16:01:16 -04:00
Chuck Lever	26298caaca	NFS: Revert creation of IPv6 listeners for lockd and NFSv4 callbacks We're about to convert over to using separate PF_INET and PF_INET6 listeners, instead of a single PF_INET6 listener that also receives AF_INET requests and maps them to AF_INET6. Clear the way by removing the logic in lockd and the NFSv4 callback server that creates an AF_INET6 service listener. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-28 15:55:06 -04:00
Chuck Lever	49a9072f29	SUNRPC: Remove @family argument from svc_create() and svc_create_pooled() Since an RPC service listener's protocol family is specified now via svc_create_xprt(), it no longer needs to be passed to svc_create() or svc_create_pooled(). Remove that argument from the synopsis of those functions, and remove the sv_family field from the svc_serv struct. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-28 15:54:48 -04:00
Chuck Lever	9652ada3fb	SUNRPC: Change svc_create_xprt() to take a @family argument The sv_family field is going away. Pass a protocol family argument to svc_create_xprt() instead of extracting the family from the passed-in svc_serv struct. Again, as this is a listener socket and not an address, we make this new argument an "int" protocol family, instead of an "sa_family_t." Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-28 15:54:36 -04:00

1 2 3 4 5 ...

536 Commits