linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Fred Isaman	808ba32abe	pnfs: Store return value of decode_layoutget for later processing This will be needed to seperate return value of OPEN and LAYOUTGET when they are combined into a single RPC. Signed-off-by: Fred Isaman <fred.isaman@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:03:10 -04:00
Fred Isaman	34ec9aac7d	pnfs: Remove redundant assignment from nfs4_proc_layoutget(). nfs_init_sequence() will clear this for us. Signed-off-by: Fred Isaman <fred.isaman@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:03:10 -04:00
Benjamin Coddington	a3cf9bca2a	NFSv4: Don't add a new lock on an interrupted wait for LOCK If the wait for a LOCK operation is interrupted, and then the file is closed, the locks cleanup code will assume that no new locks will be added to the inode after it has completed. We already have a mechanism to detect if there was signal, so let's use that to avoid recreating the local lock once the RPC completes. Also skip re-sending the LOCK operation for the various error cases if we were signaled. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> [Trond: Fix inverted test of locks_lock_inode_wait()] Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:02:16 -04:00
Trond Myklebust	cf61eb2686	NFSv4: Always clear the pNFS layout when handling ESTALE If we get an ESTALE error in response to an RPC call operating on the file on the MDS, we should immediately cancel the layout for that file. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:02:16 -04:00
Dave Wysochanski	d68894800e	NFSv4: Fix possible 1-byte stack overflow in nfs_idmap_read_and_verify_message In nfs_idmap_read_and_verify_message there is an incorrect sprintf '%d' that converts the __u32 'im_id' from struct idmap_msg to 'id_str', which is a stack char array variable of length NFS_UINT_MAXLEN == 11. If a uid or gid value is > 2147483647 = 0x7fffffff, the conversion overflows into a negative value, for example: crash> p (unsigned) (0x80000000) $1 = 2147483648 crash> p (signed) (0x80000000) $2 = -2147483648 The '-' sign is written to the buffer and this causes a 1 byte overflow when the NULL byte is written, which corrupts kernel stack memory. If CONFIG_CC_STACKPROTECTOR_STRONG is set we see a stack-protector panic: [11558053.616565] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffffa05b8a8c [11558053.639063] CPU: 6 PID: 9423 Comm: rpc.idmapd Tainted: G W ------------ T 3.10.0-514.el7.x86_64 #1 [11558053.641990] Hardware name: Red Hat OpenStack Compute, BIOS 1.10.2-3.el7_4.1 04/01/2014 [11558053.644462] ffffffff818c7bc0 00000000b1f3aec1 ffff880de0f9bd48 ffffffff81685eac [11558053.646430] ffff880de0f9bdc8 ffffffff8167f2b3 ffffffff00000010 ffff880de0f9bdd8 [11558053.648313] ffff880de0f9bd78 00000000b1f3aec1 ffffffff811dcb03 ffffffffa05b8a8c [11558053.650107] Call Trace: [11558053.651347] [<ffffffff81685eac>] dump_stack+0x19/0x1b [11558053.653013] [<ffffffff8167f2b3>] panic+0xe3/0x1f2 [11558053.666240] [<ffffffff811dcb03>] ? kfree+0x103/0x140 [11558053.682589] [<ffffffffa05b8a8c>] ? idmap_pipe_downcall+0x1cc/0x1e0 [nfsv4] [11558053.689710] [<ffffffff810855db>] __stack_chk_fail+0x1b/0x30 [11558053.691619] [<ffffffffa05b8a8c>] idmap_pipe_downcall+0x1cc/0x1e0 [nfsv4] [11558053.693867] [<ffffffffa00209d6>] rpc_pipe_write+0x56/0x70 [sunrpc] [11558053.695763] [<ffffffff811fe12d>] vfs_write+0xbd/0x1e0 [11558053.702236] [<ffffffff810acccc>] ? task_work_run+0xac/0xe0 [11558053.704215] [<ffffffff811fec4f>] SyS_write+0x7f/0xe0 [11558053.709674] [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b Fix this by calling the internally defined nfs_map_numeric_to_string() function which properly uses '%u' to convert this __u32. For consistency, also replace the one other place where snprintf is called. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Reported-by: Stephen Johnston <sjohnsto@redhat.com> Fixes: `cf4ab538f1` ("NFSv4: Fix the string length returned by the idmapper") Cc: stable@vger.kernel.org # v3.4+ Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:02:16 -04:00
Trond Myklebust	d554168f87	NFS: Fix up nfs_post_op_update_inode() to force ctime updates We do not want to ignore ctime updates that originate from functions such as link(). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:02:16 -04:00
Trond Myklebust	472f761e11	NFS: Ensure we revalidate the inode correctly after setacl Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:02:16 -04:00
Trond Myklebust	59a707b0d4	NFS: Ensure we revalidate the inode correctly after remove or rename We may need to revalidate the change attribute, ctime and the nlinks count. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:02:16 -04:00
Trond Myklebust	821a868a23	NFS: Set the force revalidate flag if the inode is not completely initialised Ensure that a delegation doesn't cause us to skip initialising the inode if it was incomplete when we exited nfs_fhget() Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:02:16 -04:00
Trond Myklebust	3cb3fd6da4	NFS: Fix up sillyrename() Ensure that we register the fact that the inode ctime has changed. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:02:16 -04:00
Trond Myklebust	ed7e9ad090	NFSv4: Fix sillyrename to return the delegation when appropriate Ensure that we pass down the inode of the file being deleted so that we can return any delegation being held. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:02:16 -04:00
Trond Myklebust	991eedb137	NFSv4: Only pass the delegation to setattr if we're sending a truncate Even then it isn't really necessary. The reason why we may not want to pass in a stateid in other cases is that we cannot use the delegation credential. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:02:16 -04:00
Anna Schumaker	2f261020b6	NFS: Merge nfs41_free_stateid() with _nfs41_free_stateid() Having these exist as two functions doesn't seem to add anything useful, and I think merging them together makes this easier to follow. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:02:16 -04:00
Anna Schumaker	fba83f3411	NFS: Pass "privileged" value to nfs4_init_sequence() We currently have a separate function just to set this, but I think it makes more sense to set it at the same time as the other values in nfs4_init_sequence() Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:02:16 -04:00
Anna Schumaker	e9ae1ee2b2	NFS: Move call to nfs4_state_protect() to nfs4_commit_setup() Rather than doing this in the generic NFS client code. Let's put this with the other v4 stuff so it's all in one place. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:02:16 -04:00
Anna Schumaker	fb91fb0ee7	NFS: Move call to nfs4_state_protect_write() to nfs4_write_setup() This doesn't really need to be in the generic NFS client code, and I think it makes more sense to keep the v4 code in one place. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:02:16 -04:00
NeilBrown	e04bbf6b1b	NFS: Avoid quadratic search when freeing delegations. There are three places that walk all delegation for an nfs_client and restart whenever they find something interesting - potentially resulting in a quadratic search: If there are 10,000 uninteresting delegations followed by 10,000 interesting one, then the code skips over 100,000,000 delegations, which can take a noticeable amount of time. Of these nfs_delegation_reap_unclaimed() and nfs_reap_expired_delegations() are only called during unusual events: a server reboots or reports expired delegations, probably due to a network partition. Optimizing these is not particularly important. The third, nfs_client_return_marked_delegations(), is called periodically via nfs_expire_unreferenced_delegations(). It could cause periodic problems on a busy server. New delegations are added to the end of the list, so if there are 10,000 open files with delegations, and 10,000 more recently opened files that received delegations but are now closed, then nfs_client_return_marked_delegations() can take seconds to skip over the 10,000 open files 10,000 times. That is a waste of time. The avoid this waste a place-holder (an inode) is kept when locks are dropped, so that the place can usually be found again after taking rcu_readlock(). This place holder ensure that we find the right starting point in the list of nfs_servers, and makes is probable that we find the right starting point in the list of delegations. We might need to occasionally restart at the head of that list. It might be possible that the place_holder inode could lose its delegation separately, and then get a new one using the same (freed and then reallocated) 'struct nfs_delegation'. Were this to happen, the new delegation would be at the end of the list and we would miss returning some other delegations. This would have the effect of unnecessarily delaying the return of some unused delegations until the next time this function is called - typically 90 seconds later. As this is not a correctness issue and is vanishingly unlikely to happen, it does not seem worth addressing. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 15:02:14 -04:00
NeilBrown	3ca951b618	NFS: use cond_resched() when restarting walk of delegation list. In three places we walk the list of delegations for an nfs_client until an interesting one is found, then we act of that delegation and restart the walk. New delegations are added to the end of a list and the interesting delegations are usually old, so in many case we won't repeat a long walk over and over again, but it is possible - particularly if the first server in the list has a large number of uninteresting delegations. In each cache the work done on interesting delegations will often complete without sleeping, so this could loop many times without giving up the CPU. So add a cond_resched() at an appropriate point to avoid hogging the CPU for too long. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 14:59:19 -04:00
NeilBrown	f389349142	NFS: slight optimization for walking list for delegations There are 3 places where we walk the list of delegations for an nfs_client. In each case there are two nested loops, one for nfs_servers and one for nfs_delegations. When we find an interesting delegation we try to get an active reference to the server. If that fails, it is pointless to continue to look at the other delegation for the server as we will never be able to get an active reference. So instead of continuing in the inner loop, break out and continue in the outer loop. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-31 14:59:19 -04:00
Trond Myklebust	9f6d44d418	NFS: Optimise away lookups for rename targets We can optimise away any lookup for a rename target, unless we're being asked to revalidate a dentry that might be in use. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-28 13:29:19 -04:00
Trond Myklebust	73dd684a4d	NFS: If the VFS sets LOOKUP_REVAL then force a lookup of the dentry If nfs_lookup_revalidate() is called with LOOKUP_REVAL because a previous path lookup failed, then we ought to force a full lookup of the component name. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-28 13:29:19 -04:00
Trond Myklebust	479219218f	NFS: Optimise away the close-to-open GETATTR when we have NFSv4 OPEN NFSv4 should not need to perform an extra close-to-open GETATTR as part of the process of looking up a regular file, since the OPEN call will do that for us. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-05-28 13:29:19 -04:00
Deepa Dinamani	0a2dfbecb3	fs: nfs: get rid of memcpys for inode times Subsequent patches in the series convert inode timestamps to use struct timespec64 instead of struct timespec as part of solving the y2038 problem. This will lead to type mismatch for memcpys. Use regular assignments instead. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Cc: trond.myklebust@primarydata.com	2018-05-25 15:31:13 -07:00
Christoph Hellwig	c350637227	proc: introduce proc_create_net{,_data} Variants of proc_create{,_data} that directly take a struct seq_operations and deal with network namespaces in ->open and ->release. All callers of proc_create + seq_open_net converted over, and seq_{open,release}_net are removed entirely. Signed-off-by: Christoph Hellwig <hch@lst.de>	2018-05-16 07:24:30 +02:00
Linus Torvalds	a1bf4c7da6	NFS client updates for Linux 4.17 Stable bugfixes: - xprtrdma: Fix corner cases when handling device removal # v4.12+ - xprtrdma: Fix latency regression on NUMA NFS/RDMA clients # v4.15+ Features: - New sunrpc tracepoint for RPC pings - Finer grained NFSv4 attribute checking - Don't unnecessarily return NFS v4 delegations Other bugfixes and cleanups: - Several other small NFSoRDMA cleanups - Improvements to the sunrpc RTT measurements - A few sunrpc tracepoint cleanups - Various fixes for NFS v4 lock notifications - Various sunrpc and NFS v4 XDR encoding cleanups - Switch to the ida_simple API - Fix NFSv4.1 exclusive create - Forget acl cache after setattr operation - Don't advance the nfs_entry readdir cookie if xdr decoding fails -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAlrNG1IACgkQ18tUv7Cl QOvotw//fQoUgQ/AOJGlZo/4ws2mGJN3dfwwKM8xYOnHaxppOYubZRHwvswK8d22 +XR/Q6IVbUxI3mJluv1L0d9CJT06s3c9CO90McIJbk4CWihGP19bNIY4JiPlzrbv 4FDiyOvMBej2UXbHX5EzKj0srxyBoEVf3iUAIa6DaHi3c6EIUo6fP3d2eRNJStqd WMyZs+nqr2W9biyClxntT7l/Sk+o+4I7M3Oo9pjjS+PiePYdaMrL5T1kPeHaJshF GMGXkbvVdqpDRiXX84R9+2/nuSiA15eEnaR94UNvs84oLR3qob3ZhxhudqFdSPrX RS6E7m34gY/EaQm/wbB26PZm+3jHd4Pqm5SKLbyFfoCmG6oMwBvXNRJZas1DFaHM CMOECvfAr6kixVLkAN0MNQ2Ku/FuJ52OLP1dRLmxsblocnhEPujc6RSz6Ju/v3a0 adbpmJMA2IoSGgXMu3g1VGnjHfMj7ZmjtpigXVvlcUqQGCL7t4ngh23cpeTQeJ76 bMwSHUQu18NbmtJjBTE+PIm7mdCrpQD7ZuOPWpK62zxLYUnnv7nm75m84DrDru7d XAmrCmdUJNrVWQs6BAtCXgO4PZ6xNGLosb0xTQXTAQYftc+DRJ9SW/VGc0Mp1L9m 0G0iz++b8cy4Pih5UCDJcCkpjCIvHLcn72zn1kbufWqG3xr2koc= =IlWo -----END PGP SIGNATURE----- Merge tag 'nfs-for-4.17-1' of git://git.linux-nfs.org/projects/anna/linux-nfs Pull NFS client updates from Anna Schumaker: "Stable bugfixes: - xprtrdma: Fix corner cases when handling device removal # v4.12+ - xprtrdma: Fix latency regression on NUMA NFS/RDMA clients # v4.15+ Features: - New sunrpc tracepoint for RPC pings - Finer grained NFSv4 attribute checking - Don't unnecessarily return NFS v4 delegations Other bugfixes and cleanups: - Several other small NFSoRDMA cleanups - Improvements to the sunrpc RTT measurements - A few sunrpc tracepoint cleanups - Various fixes for NFS v4 lock notifications - Various sunrpc and NFS v4 XDR encoding cleanups - Switch to the ida_simple API - Fix NFSv4.1 exclusive create - Forget acl cache after setattr operation - Don't advance the nfs_entry readdir cookie if xdr decoding fails" * tag 'nfs-for-4.17-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (47 commits) NFS: advance nfs_entry cookie only after decoding completes successfully NFSv3/acl: forget acl cache after setattr NFSv4.1: Fix exclusive create NFSv4: Declare the size up to date after it was set. nfs: Use ida_simple API NFSv4: Fix the nfs_inode_set_delegation() arguments NFSv4: Clean up CB_GETATTR encoding NFSv4: Don't ask for attributes when ACCESS is protected by a delegation NFSv4: Add a helper to encode/decode struct timespec NFSv4: Clean up encode_attrs NFSv4; Clean up XDR encoding of type bitmap4 NFSv4: Allow GFP_NOIO sleeps in decode_attr_owner/decode_attr_group SUNRPC: Add a helper for encoding opaque data inline SUNRPC: Add helpers for decoding opaque and string types NFSv4: Ignore change attribute invalidations if we hold a delegation NFS: More fine grained attribute tracking NFS: Don't force unnecessary cache invalidation in nfs_update_inode() NFS: Don't redirty the attribute cache in nfs_wcc_update_inode() NFS: Don't force a revalidation of all attributes if change is missing NFS: Convert NFS_INO_INVALID flags to unsigned long ...	2018-04-12 12:55:50 -07:00
Frank Sorenson	98de9ce6f6	NFS: advance nfs_entry cookie only after decoding completes successfully In nfs[34]_decode_dirent, the cookie is advanced as soon as it is read, but decoding may still fail later in the function, returning an error. Because the cookie has been advanced, the failing entry is not re-requested from the server, resulting in a missing directory entry. In addition, nfs v3 and v4 read the cookie at different locations in the xdr_stream, so the behavior of the two can be inconsistent. Fix these by reading the cookie into a temporary variable, and only advancing the cookie once the entire entry has been decoded from the xdr_stream successfully. Signed-off-by: Frank Sorenson <sorenson@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
chendt	dbc898ae10	NFSv3/acl: forget acl cache after setattr Sync of ACL with std permissions fail，We need to forget the ACL cache after setattr. Reproduction: #!/bin/bash touch testfile cat <<EOF >testfile #!/bin/bash echo "Test was executed" EOF chmod u=rwx testfile chmod g=rw- testfile chmod o=r-- testfile chacl u::r--,g::rwx,o:rw- testfile chmod u+w testfile ls -l testfile chacl -l testfile Output： -rw-rwxrw- 1 root root 0 Mar 28 05:29 testfile testfile [u::r--,g::rwx,o::rw-] Signed-off-by: chendt.fnst <chendt.fnst@cn.fujitsu.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Kinglong Mee <Kinglong Mee> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	609339c123	NFSv4.1: Fix exclusive create When we use EXCLUSIVE4_1 mode, the server returns an attribute mask where all the bits indicate which attributes were set, and where the verifier was stored. In order to figure out which attribute we have to resend, we need to clear out the attributes that are set in exclcreat_bitmask. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> [Anna: Fixed typo NFS4_CREATE_EXCLUSIVE4 -> NFS4_CREATE_EXCLUSIVE] Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	f6cdfa6dd6	NFSv4: Declare the size up to date after it was set. When we've changed the file size, then ensure we declare it to be up to date in the inode attributes. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Matthew Wilcox	aae5730e2d	nfs: Use ida_simple API Allocate the owner_id when we allocate the state and free it when we free the state. That lets us get rid of a gnarly ida_pre_get() / ida_get_new() loop. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	35156bfff3	NFSv4: Fix the nfs_inode_set_delegation() arguments Neither nfs_inode_set_delegation() nor nfs_inode_reclaim_delegation() are generic code. They have no business delving into NFSv4 OPEN xdr structures, so let's replace the "struct nfs_openres" parameter. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	8b06494624	NFSv4: Clean up CB_GETATTR encoding Replace the open coded bitmap implementation with a generic one. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	8bcbe7d98c	NFSv4: Don't ask for attributes when ACCESS is protected by a delegation If we hold a delegation, then the results of the ACCESS call are protected anyway. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	36b3743fef	NFSv4: Add a helper to encode/decode struct timespec Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	40a3426c75	NFSv4: Clean up encode_attrs Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	37c88763de	NFSv4; Clean up XDR encoding of type bitmap4 Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	e8d8aa46be	NFSv4: Allow GFP_NOIO sleeps in decode_attr_owner/decode_attr_group Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	d943f2dd8d	NFSv4: Ignore change attribute invalidations if we hold a delegation Don't bother even recording an invalid change attribute if we hold a delegation since we already know the state of our attribute cache. We can rely on the fact that we will pick up a copy from the server when we return the delegation. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	16e1437517	NFS: More fine grained attribute tracking Currently, if the NFS_INO_INVALID_ATTR flag is set, for instance by a call to nfs_post_op_update_inode_locked(), then it will not be cleared until all the attributes have been revalidated. This means, for instance, that NFSv4 writes will always force a full attribute revalidation. Track the ctime, mtime, size and change attribute separately from the other attributes so that we can have nfs_post_op_update_inode_locked() set them correctly, and later have the cache consistency bitmask be able to clear them. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	cac88f942d	NFS: Don't force unnecessary cache invalidation in nfs_update_inode() If we managed to revalidate all the attributes, then there is no reason to mark them as invalid again. We do, however want to ensure that we set nfsi->attrtimeo correctly. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	783b194c6e	NFS: Don't redirty the attribute cache in nfs_wcc_update_inode() If we received weak cache consistency data from the server, then those attributes are up to date, and there is no reason to mark them as dirty in the attribute cache. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	8619ddd07b	NFS: Don't force a revalidation of all attributes if change is missing Even if the change attribute is missing, it is still OK to mark the other attributes as being up to date. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	c01d36457d	NFSv4: Don't return the delegation when not needed by NFSv4.x (x>0) Starting with NFSv4.1, the server is able to deduce the client id from the SEQUENCE op which means it can always figure out whether or not the client is holding a delegation on a file that is being changed. For that reason, RFC5661 does not require a delegation to be unconditionally recalled on operations such as SETATTR, RENAME, or REMOVE. Note that for now, we continue to return READ delegations since that is still expected by the Linux knfsd server. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	c135cb39a9	NFS: Remove the unused return_delegation() callback Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	199366f017	NFS: Move the delegation return down into _nfs4_do_setattr() Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	977fcc2b0b	NFS: Add a delegation return into nfs4_proc_unlink_setup() Ensure that when we do finally delete the file, then we return the delegation. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	f2c2c552f1	NFS: Move delegation recall into the NFSv4 callback for rename_setup() Move the delegation recall out of the generic code, and into the NFSv4 specific callback. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	912678dbc5	NFS: Move the delegation return down into nfs4_proc_remove() Move the delegation return out of generic code and down into the NFSv4 specific unlink code. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	9f76827287	NFS: Move the delegation return down into nfs4_proc_link() Move the delegation return out of generic code. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Trond Myklebust	f50862423f	NFSv4: Fix nfs4_return_incompatible_delegation The 'fmode' argument can take an FMODE_EXEC value, which we want to filter out before comparing to the delegation type. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Jeff Layton	571745935b	nfs4: wake any lock waiters on successful RECLAIM_COMPLETE If we have a RECLAIM_COMPLETE with a populated cl_lock_waitq, then that implies that a reconnect has occurred. Since we can't expect a CB_NOTIFY_LOCK callback at that point, just wake up the entire queue so that all the tasks can re-poll for their locks. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Jeff Layton	5656610325	nfs4: don't compare clientid in nfs4_wake_lock_waiter The task is expected to sleep for a while here, and it's possible that a new EXCHANGE_ID has occurred in the interim, and we were assigned a new clientid. Since this is a per-client list, there isn't a lot of value in vetting the clientid on the incoming request. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Jeff Layton	41a7462018	nfs4: always reset notified flag to false before repolling for lock We may get a notification and lose the race to another client. Ensure that we wait again for a notification in that case. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2018-04-10 16:06:22 -04:00
Linus Torvalds	62f8e6c5dc	fscache development -----BEGIN PGP SIGNATURE----- iQIVAwUAWsdxrvu3V2unywtrAQJVmQ/9Fv8d/Ecdwv5nxVBmN7uA8lOYcHEbZWmd FhFQE8qYLjKMo9Fy4tPkBbu1l6CVnetaTRE5qwixACJAftrdjABKJAazGR3Uxief 0jMSWScrV1XCeRErPcczHcx52Hefl8f1DQdA3zpoF0ewz7CjyxMxkl67bsYJbNKE T4ebCu5IJk+5PPwwMM3REKjQbunSXXnzgCLUI2cc0Yf76CTVpx6p+NpxV+2wq0p7 vym83F68qACAEzNH+oozN7IwqjkWyYOnTtCLiMsh4iq30jP6ohtLom6RcRp7QUxM Z9hxgG3NptypuVBO1jKxaQ6XZGgAasYmppOmJ/SoALv2PKsAbxi372lTR4ikceKq H4oNTbs5tVmyvu3qFwtLN+vX+GdfaoSUnUG8vTvnCB3tHHtYj7q5QeFE0HaX4QSq oLANkCOZU8TJsT30pxsCNYiqc5HK9kaLjUQId9K+xq7mM/IuhtNtBQ+ZpqAh5IxB 4bXKYLdeJ1myZrkYTa6gcTqeFax3djCBJ3UvjTnuqRZAaQg079WkG84Kdq1ZjDRp IQpKQnPX9JGhjW1zqLK1Ay8h+HFPgWR5BBVOaLwImr1mH+ccG0iNIeDjrOc8h6J5 e60XM/x2dIYxpXyFYAkldbAI24aRg1FNzfniG4rSAPecf3SwWrxg/qK7uujLbJHM fKNA80yifHo= =ukqs -----END PGP SIGNATURE----- Merge tag 'fscache-next-20180406' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull fscache updates from David Howells: "Three patches that fix some of AFS's usage of fscache: (1) Need to invalidate the cache if a foreign data change is detected on the server. (2) Move the vnode ID uniquifier (equivalent to i_generation) from the auxiliary data to the index key to prevent a race between file delete and a subsequent file create seeing the same index key. (3) Need to retire cookies that correspond to files that we think got deleted on the server. Four patches to fix some things in fscache and cachefiles: (4) Fix a couple of checker warnings. (5) Correctly indicate to the end-of-operation callback whether an operation completed or was cancelled. (6) Add a check for multiple cookie relinquishment. (7) Fix a path through the asynchronous write that doesn't wake up a waiter for a page if the cache decides not to write that page, but discards it instead. A couple of patches to add tracepoints to fscache and cachefiles: (8) Add tracepoints for cookie operators, object state machine execution, cachefiles object management and cachefiles VFS operations. (9) Add tracepoints for fscache operation management and page wrangling. And then three development patches: (10) Attach the index key and auxiliary data to the cookie, pass this information through various fscache-netfs API functions and get rid of the callbacks to the netfs to get it. This means that the cache can get at this information, even if the netfs goes away. It also means that the cache can be lazy in updating the coherency data. (11) Pass the object data size through various fscache-netfs API rather than calling back to the netfs for it, and store the value in the object. This makes it easier to correctly resize the object, as the size is updated on writes to the cache, rather than calling back out to the netfs. (12) Maintain a catalogue of allocated cookies. This makes it possible to catch cookie collision up front rather than down in the bowels of the cache being run from a service thread from the object state machine. This will also make it possible in the future to reconnect to a cookie that's not gone dead yet because it's waiting for finalisation of the storage and also make it possible to bring cookies online if the cache is added after the cookie has been obtained" * tag 'fscache-next-20180406' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: fscache: Maintain a catalogue of allocated cookies fscache: Pass object size in rather than calling back for it fscache: Attach the index key and aux data to the cookie fscache: Add more tracepoints fscache: Add tracepoints fscache: Fix hanging wait on page discarded by writeback fscache: Detect multiple relinquishment of a cookie fscache: Pass the correct cancelled indications to fscache_op_complete() fscache, cachefiles: Fix checker warnings afs: Be more aggressive in retiring cached vnodes afs: Use the vnode ID uniquifier in the cache key not the aux data afs: Invalidate cache on server data change	2018-04-07 09:08:24 -07:00
Linus Torvalds	9022ca6b11	Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc vfs updates from Al Viro: "Assorted stuff, including Christoph's I_DIRTY patches" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: move I_DIRTY_INODE to fs.h ubifs: fix bogus __mark_inode_dirty(I_DIRTY_SYNC \| I_DIRTY_DATASYNC) call ntfs: fix bogus __mark_inode_dirty(I_DIRTY_SYNC \| I_DIRTY_DATASYNC) call gfs2: fix bogus __mark_inode_dirty(I_DIRTY_SYNC \| I_DIRTY_DATASYNC) calls fs: fold open_check_o_direct into do_dentry_open vfs: Replace stray non-ASCII homoglyph characters with their ASCII equivalents vfs: make sure struct filename->iname is word-aligned get rid of pointless includes of fs_struct.h [poll] annotate SAA6588_CMD_POLL users	2018-04-06 11:07:08 -07:00
David Howells	ee1235a9a0	fscache: Pass object size in rather than calling back for it Pass the object size in to fscache_acquire_cookie() and fscache_write_page() rather than the netfs providing a callback by which it can be received. This makes it easier to update the size of the object when a new page is written that extends the object. The current object size is also passed by fscache to the check_aux function, obviating the need to store it in the aux data. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Anna Schumaker <anna.schumaker@netapp.com> Tested-by: Steve Dickson <steved@redhat.com>	2018-04-06 14:05:14 +01:00
David Howells	402cb8dda9	fscache: Attach the index key and aux data to the cookie Attach copies of the index key and auxiliary data to the fscache cookie so that: (1) The callbacks to the netfs for this stuff can be eliminated. This can simplify things in the cache as the information is still available, even after the cache has relinquished the cookie. (2) Simplifies the locking requirements of accessing the information as we don't have to worry about the netfs object going away on us. (3) The cache can do lazy updating of the coherency information on disk. As long as the cache is flushed before reboot/poweroff, there's no need to update the coherency info on disk every time it changes. (4) Cookies can be hashed or put in a tree as the index key is easily available. This allows: (a) Checks for duplicate cookies can be made at the top fscache layer rather than down in the bowels of the cache backend. (b) Caching can be added to a netfs object that has a cookie if the cache is brought online after the netfs object is allocated. A certain amount of space is made in the cookie for inline copies of the data, but if it won't fit there, extra memory will be allocated for it. The downside of this is that live cache operation requires more memory. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Anna Schumaker <anna.schumaker@netapp.com> Tested-by: Steve Dickson <steved@redhat.com>	2018-04-04 13:41:28 +01:00
Linus Torvalds	ce6eba3dba	Merge branch 'sched-wait-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull wait_var_event updates from Ingo Molnar: "This introduces the new wait_var_event() API, which is a more flexible waiting primitive than wait_on_atomic_t(). All wait_on_atomic_t() users are migrated over to the new API and wait_on_atomic_t() is removed. The migration fixes one bug and should result in no functional changes for the other usecases" * 'sched-wait-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/wait: Improve __var_waitqueue() code generation sched/wait: Remove the wait_on_atomic_t() API sched/wait, arch/mips: Fix and convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, fs/ocfs2: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, fs/nfs: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, fs/fscache: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, fs/btrfs: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, fs/afs: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, drivers/media: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, drivers/drm: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait: Introduce wait_var_event()	2018-04-02 16:50:39 -07:00
Peter Zijlstra	723c921e7d	sched/wait, fs/nfs: Convert wait_on_atomic_t() usage to the new wait_var_event() API The old wait_on_atomic_t() is going to get removed, use the more flexible wait_var_event() API instead. No change in functionality. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Anna Schumaker <anna.schumaker@netapp.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-03-20 08:23:21 +01:00
Linus Torvalds	df09348f78	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: - backport-friendly part of lock_parent() race fix - a fix for an assumption in the heurisic used by path_connected() that is not true on NFS - livelock fixes for d_alloc_parallel() * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: Teach path_connected to handle nfs filesystems with multiple roots. fs: dcache: Use READ_ONCE when accessing i_dir_seq fs: dcache: Avoid livelock between d_alloc_parallel and __d_add lock_parent() needs to recheck if dentry got __dentry_kill'ed under it	2018-03-15 18:57:14 -07:00
Eric W. Biederman	95dd77580c	fs: Teach path_connected to handle nfs filesystems with multiple roots. On nfsv2 and nfsv3 the nfs server can export subsets of the same filesystem and report the same filesystem identifier, so that the nfs client can know they are the same filesystem. The subsets can be from disjoint directory trees. The nfsv2 and nfsv3 filesystems provides no way to find the common root of all directory trees exported form the server with the same filesystem identifier. The practical result is that in struct super s_root for nfs s_root is not necessarily the root of the filesystem. The nfs mount code sets s_root to the root of the first subset of the nfs filesystem that the kernel mounts. This effects the dcache invalidation code in generic_shutdown_super currently called shrunk_dcache_for_umount and that code for years has gone through an additional list of dentries that might be dentry trees that need to be freed to accomodate nfs. When I wrote path_connected I did not realize nfs was so special, and it's hueristic for avoiding calling is_subdir can fail. The practical case where this fails is when there is a move of a directory from the subtree exposed by one nfs mount to the subtree exposed by another nfs mount. This move can happen either locally or remotely. With the remote case requiring that the move directory be cached before the move and that after the move someone walks the path to where the move directory now exists and in so doing causes the already cached directory to be moved in the dcache through the magic of d_splice_alias. If someone whose working directory is in the move directory or a subdirectory and now starts calling .. from the initial mount of nfs (where s_root == mnt_root), then path_connected as a heuristic will not bother with the is_subdir check. As s_root really is not the root of the nfs filesystem this heuristic is wrong, and the path may actually not be connected and path_connected can fail. The is_subdir function might be cheap enough that we can call it unconditionally. Verifying that will take some benchmarking and the result may not be the same on all kernels this fix needs to be backported to. So I am avoiding that for now. Filesystems with snapshots such as nilfs and btrfs do something similar. But as the directory tree of the snapshots are disjoint from one another and from the main directory tree rename won't move things between them and this problem will not occur. Cc: stable@vger.kernel.org Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Fixes: `397d425dc2` ("vfs: Test for and handle paths that are unreachable from their mnt_root") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-03-15 18:48:38 -04:00
Trond Myklebust	c4f24df942	NFS: Fix unstable write completion We do want to respect the FLUSH_SYNC argument to nfs_commit_inode() to ensure that all outstanding COMMIT requests to the inode in question are complete. Currently we may exit early from both nfs_commit_inode() and nfs_write_inode() even if there are COMMIT requests in flight, or unstable writes on the commit list. In order to get the right semantics w.r.t. sync_inode(), we don't need to have nfs_commit_inode() reset the inode dirty flags when called from nfs_wb_page() and/or nfs_wb_all(). We just need to ensure that nfs_write_inode() leaves them in the right state if there are outstanding commits, or stable pages. Reported-by: Scott Mayhew <smayhew@redhat.com> Fixes: `dc4fd9ab01` ("nfs: don't wait on commit in nfs_commit_inode()...") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-03-08 12:56:32 -05:00
Trond Myklebust	9c6376ebdd	pNFS: Prevent the layout header refcount going to zero in pnfs_roc() Ensure that we hold a reference to the layout header when processing the pNFS return-on-close so that the refcount value does not inadvertently go to zero. Reported-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org # v4.10+ Tested-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>	2018-03-08 12:56:31 -05:00
Trond Myklebust	d9ee65539d	NFS: Fix an incorrect type in struct nfs_direct_req The start offset needs to be of type loff_t. Fixed: `5fadeb47dc` ("nfs: count DIO good bytes correctly with mirroring") Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-03-08 12:56:31 -05:00
Al Viro	304ec482f5	get rid of pointless includes of fs_struct.h Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-02-22 14:28:50 -05:00
Colin Ian King	1b72040645	NFS: make struct nlmclnt_fl_close_lock_ops static The structure nlmclnt_fl_close_lock_ops s local to the source and does not need to be in global scope, so make it static. Cleans up sparse warning: fs/nfs/nfs3proc.c:876:33: warning: symbol 'nlmclnt_fl_close_lock_ops' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-02-22 12:23:01 -05:00
Bill.Baker@oracle.com	ad86f605c5	nfs: system crashes after NFS4ERR_MOVED recovery nfs4_update_server unconditionally releases the nfs_client for the source server. If migration fails, this can cause the source server's nfs_client struct to be left with a low reference count, resulting in use-after-free. Also, adjust reference count handling for ELOOP. NFS: state manager: migration failed on NFSv4 server nfsvmu10 with error 6 WARNING: CPU: 16 PID: 17960 at fs/nfs/client.c:281 nfs_put_client+0xfa/0x110 [nfs]() nfs_put_client+0xfa/0x110 [nfs] nfs4_run_state_manager+0x30/0x40 [nfsv4] kthread+0xd8/0xf0 BUG: unable to handle kernel NULL pointer dereference at 00000000000002a8 nfs4_xdr_enc_write+0x6b/0x160 [nfsv4] rpcauth_wrap_req+0xac/0xf0 [sunrpc] call_transmit+0x18c/0x2c0 [sunrpc] __rpc_execute+0xa6/0x490 [sunrpc] rpc_async_schedule+0x15/0x20 [sunrpc] process_one_work+0x160/0x470 worker_thread+0x112/0x540 ? rescuer_thread+0x3f0/0x3f0 kthread+0xd8/0xf0 This bug was introduced by `32e62b7c` ("NFS: Add nfs4_update_server"), but the fix applies cleanly to `52442f9b` ("NFS4: Avoid migration loops") Reported-by: Helen Chao <helen.chao@oracle.com> Fixes: `52442f9b11` ("NFS4: Avoid migration loops") Signed-off-by: Bill Baker <bill.baker@oracle.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-02-22 12:17:42 -05:00
Trond Myklebust	6d243a2356	NFSv4: Fix broken cast in nfs4_callback_recallany() Passing a pointer to a unsigned integer to test_bit() is broken. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-02-21 16:35:50 -05:00
Goffredo Baroncelli	c472c07bfe	iversion: Rename make inode_cmp_iversion{+raw} to inode_eq_iversion{+raw} The function inode_cmp_iversion{+raw} is counter-intuitive, because it returns true when the counters are different and false when these are equal. Rename it to inode_eq_iversion{+raw}, which will returns true when the counters are equal and false otherwise. Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it> Signed-off-by: Jeff Layton <jlayton@redhat.com>	2018-02-01 08:15:25 -05:00
Linus Torvalds	19e7b5f994	Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc vfs updates from Al Viro: "All kinds of misc stuff, without any unifying topic, from various people. Neil's d_anon patch, several bugfixes, introduction of kvmalloc analogue of kmemdup_user(), extending bitfield.h to deal with fixed-endians, assorted cleanups all over the place..." * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (28 commits) alpha: osf_sys.c: use timespec64 where appropriate alpha: osf_sys.c: fix put_tv32 regression jffs2: Fix use-after-free bug in jffs2_iget()'s error handling path dcache: delete unused d_hash_mask dcache: subtract d_hash_shift from 32 in advance fs/buffer.c: fold init_buffer() into init_page_buffers() fs: fold __inode_permission() into inode_permission() fs: add RWF_APPEND sctp: use vmemdup_user() rather than badly open-coding memdup_user() snd_ctl_elem_init_enum_names(): switch to vmemdup_user() replace_user_tlv(): switch to vmemdup_user() new primitive: vmemdup_user() memdup_user(): switch to GFP_USER eventfd: fold eventfd_ctx_get() into eventfd_ctx_fileget() eventfd: fold eventfd_ctx_read() into eventfd_read() eventfd: convert to use anon_inode_getfd() nfs4file: get rid of pointless include of btrfs.h uvc_v4l2: clean copyin/copyout up vme_user: don't use __copy_..._user() usx2y: don't bother with memdup_user() for 16-byte structure ...	2018-01-31 09:25:20 -08:00
Linus Torvalds	efd52b5d36	NFS client updates for Linux 4.16 Highlights include: Stable bugfixes: - Fix breakages in the nfsstat utility due to the inclusion of the NFSv4 LOOKUPP operation. - Fix a NULL pointer dereference in nfs_idmap_prepare_pipe_upcall() due to nfs_idmap_legacy_upcall() being called without an 'aux' parameter. - Fix a refcount leak in the standard O_DIRECT error path. - Fix a refcount leak in the pNFS O_DIRECT fallback to MDS path. - Fix CPU latency issues with nfs_commit_release_pages() - Fix the LAYOUTUNAVAILABLE error case in the file layout type. - NFS: Fix a race between mmap() and O_DIRECT Features: - Support the statx() mask and query flags to enable optimisations when the user is requesting only attributes that are already up to date in the inode cache, or is specifying the AT_STATX_DONT_SYNC flag. - Add a module alias for the SCSI pNFS layout type. Bugfixes: - Automounting when resolving a NFSv4 referral should preserve the RDMA transport protocol settings. - Various other RDMA bugfixes from Chuck. - pNFS block layout fixes. - Always set NFS_LOCK_LOST when a lock is lost. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJacHcyAAoJEGcL54qWCgDy5WcP/Aw7GIAVZ+n5B+EIFVvEaWlC C++eA8vej433ezKRj8IExeCX1C8OKrQZ4iQ3nqIg0mVVQS/Qk+469OhYP9jDagA+ tDZeOs3lbl1EUhUWT+GNxw8bZqKGn9fYzcWFjiYeFTjrcfLYfZTW50V0tofjgmlR 3nQmpPx/56rXE9ZO/EW66HRZWauw7a0hg3/5Ft+F5csqIb3yQOlW8Osp3WClzGBF to4lS8/IwHvCn3qWAMuivRaMJDxeKrmoJNQh9Kw1Mw3+vurGAjmKo1a153qKPz4N 7wjeP+o3ujc/P7WsJLCIgQRimzSm9FZXMqEVmz07+cIhGbERt2yy0RbHev8bpa+U 3IMj70K9ciPuMZwrAtRAeZL+o9gxlUGUXvTaDUgo4DFgBw9Q5CnMnFn6a725l4h0 nSZsE+bR8d4l/yEjf77SbTrk7atMLfUG1XnKH20i1CUjtd4CaLLzjn81TlbQrfuI XaFdJUUt63dPTIbhPEk7wHFcITkGZiyXhcepgbaXLiDH/3gyZmqTYzJ2EH14sOC5 NaTueE3ASTiFChvG7jvc89HJN5SN5W11PyzI+GHezx9VkFnPZM3/q2V7oEX2aSld tkRlDMO4dVmpAA4LVAAarDr0a8ZsJQOdb+yLn21pbmKNAs1vol7tMfJe57ykEV6v WNAgtKJLtZE0Lh1UyEq0 =5Vv2 -----END PGP SIGNATURE----- Merge tag 'nfs-for-4.16-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client updates from Trond Myklebust: "Highlights include: Stable bugfixes: - Fix breakages in the nfsstat utility due to the inclusion of the NFSv4 LOOKUPP operation - Fix a NULL pointer dereference in nfs_idmap_prepare_pipe_upcall() due to nfs_idmap_legacy_upcall() being called without an 'aux' parameter - Fix a refcount leak in the standard O_DIRECT error path - Fix a refcount leak in the pNFS O_DIRECT fallback to MDS path - Fix CPU latency issues with nfs_commit_release_pages() - Fix the LAYOUTUNAVAILABLE error case in the file layout type - NFS: Fix a race between mmap() and O_DIRECT Features: - Support the statx() mask and query flags to enable optimisations when the user is requesting only attributes that are already up to date in the inode cache, or is specifying the AT_STATX_DONT_SYNC flag - Add a module alias for the SCSI pNFS layout type Bugfixes: - Automounting when resolving a NFSv4 referral should preserve the RDMA transport protocol settings - Various other RDMA bugfixes from Chuck - pNFS block layout fixes - Always set NFS_LOCK_LOST when a lock is lost" * tag 'nfs-for-4.16-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (69 commits) NFS: Fix a race between mmap() and O_DIRECT NFS: Remove a redundant call to unmap_mapping_range() pnfs/blocklayout: Ensure disk address in block device map pnfs/blocklayout: pnfs_block_dev_map uses bytes, not sectors lockd: Fix server refcounting SUNRPC: Fix null rpc_clnt dereference in rpc_task_queued tracepoint SUNRPC: Micro-optimize __rpc_execute SUNRPC: task_run_action should display tk_callback sunrpc: Format RPC events consistently for display SUNRPC: Trace xprt_timer events xprtrdma: Correct some documenting comments xprtrdma: Fix "bytes registered" accounting xprtrdma: Instrument allocation/release of rpcrdma_req/rep objects xprtrdma: Add trace points to instrument QP and CQ access upcalls xprtrdma: Add trace points in the client-side backchannel code paths xprtrdma: Add trace points for connect events xprtrdma: Add trace points to instrument MR allocation and recovery xprtrdma: Add trace points to instrument memory invalidation xprtrdma: Add trace points in reply decoder path xprtrdma: Add trace points to instrument memory registration ..	2018-01-30 19:03:48 -08:00
Linus Torvalds	a4b7fd7d34	inode->i_version rework for v4.16 -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJabwjlAAoJEAAOaEEZVoIVeEEP/R84kZJjlZV/vNmFFvY46jM+ 0hpMHXRNym+nW1Du1CKNkesEUAY8ACAQIyzJh63Q72341QTDdz3+asHwPYRNOqdC PgryidPieojkNKQg+h7dmoKYlYh1xiCicvn66Q5PFb9B0lH36twekOK4X1qqJj8Z breRmRoFLka9looMSuYgwbErts023fmASalvGum6T0ZM/7F9hUj4O3OsQtKTLUNM VQ+gLJTQrUqrgzvWUwq3WTMa9YAaKP4oad8nsglNSpiVLG7WtURr5HokW9hAziqL k99Y+K2ni1wZJlNGJAyV7PyEG2ieI5Xn+LzM2RM+SndD1QHF2QXACmSTDYfL51k5 G2RsKeTZvQPtX4qx9+vnCp/4oV6JduvCaq2Mt8SQb9nYZxKjs85TNLrARJv+85eQ zP0OTxlH1Gfu3j36n3cny4XemyMYYF4hCFYfRPqTGst37fgLBtfIfUSQ6jedoCK2 Xcyb6ukGXMh6If/A7DSy91hvSSPrWSH7TPPsbfLy6o+wUOtpAGR4eXVlEuAiXrzc gnoAz85oIMUQae66LrdrPk1NyE59qOb24g/yU5gyRBSpi2+/aoboNCKaD73tgs/C XIMwGXLYmqkcud7IBQF0tHHiM+jsEkbSM4LUqRXSnqMdwNnS18Z4Q+JKqpdP0cii eRdenDvUfu8Gu1Y9vWBv =iihN -----END PGP SIGNATURE----- Merge tag 'iversion-v4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux Pull inode->i_version rework from Jeff Layton: "This pile of patches is a rework of the inode->i_version field. We have traditionally incremented that field on every inode data or metadata change. Typically this increment needs to be logged on disk even when nothing else has changed, which is rather expensive. It turns out though that none of the consumers of that field actually require this behavior. The only real requirement for all of them is that it be different iff the inode has changed since the last time the field was checked. Given that, we can optimize away most of the i_version increments and avoid dirtying inode metadata when the only change is to the i_version and no one is querying it. Queries of the i_version field are rather rare, so we can help write performance under many common workloads. This patch series converts existing accesses of the i_version field to a new API, and then converts all of the in-kernel filesystems to use it. The last patch in the series then converts the backend implementation to a scheme that optimizes away a large portion of the metadata updates when no one is looking at it. In my own testing this series significantly helps performance with small I/O sizes. I also got this email for Christmas this year from the kernel test robot (a 244% r/w bandwidth improvement with XFS over DAX, with 4k writes): https://lkml.org/lkml/2017/12/25/8 A few of the earlier patches in this pile are also flowing to you via other trees (mm, integrity, and nfsd trees in particular)". * tag 'iversion-v4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux: (22 commits) fs: handle inode->i_version more efficiently btrfs: only dirty the inode in btrfs_update_time if something was changed xfs: avoid setting XFS_ILOG_CORE if i_version doesn't need incrementing fs: only set S_VERSION when updating times if necessary IMA: switch IMA over to new i_version API xfs: convert to new i_version API ufs: use new i_version API ocfs2: convert to new i_version API nfsd: convert to new i_version API nfs: convert to new i_version API ext4: convert to new i_version API ext2: convert to new i_version API exofs: switch to new i_version API btrfs: convert to new i_version API afs: convert to new i_version API affs: convert to new i_version API fat: convert to new i_version API fs: don't take the i_lock in inode_inc_iversion fs: new API for handling inode->i_version ntfs: remove i_version handling ...	2018-01-29 13:33:53 -08:00
Jeff Layton	1eb5d98f16	nfs: convert to new i_version API For NFS, we just use the "raw" API since the i_version is mostly managed by the server. The exception there is when the client holds a write delegation, but we only need to bump it once there anyway to handle CB_GETATTR. Tested-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Jeff Layton <jlayton@redhat.com>	2018-01-29 06:42:21 -05:00
Trond Myklebust	e231c6879c	NFS: Fix a race between mmap() and O_DIRECT When locking the file in order to do O_DIRECT on it, we must unmap any mmapped ranges on the pagecache so that we can flush out the dirty data. Fixes: `a5864c999d` ("NFS: Do not serialise O_DIRECT reads and writes") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org # v4.8+	2018-01-28 22:00:15 -05:00
Trond Myklebust	128159f292	NFS: Remove a redundant call to unmap_mapping_range() We don't need to call unmap_mapping_range() prior to calling nfs_sync_mapping(). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-28 09:35:54 -05:00
Benjamin Coddington	f34462c3c8	pnfs/blocklayout: Ensure disk address in block device map It's possible that the device map is smaller than the offset into the device for the I/O we're adding. Add a check for it and bail out, otherwise we risk botching the bio calculations that follow. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trondmy@gmail.com>	2018-01-25 16:42:35 -05:00
Benjamin Coddington	b39604755c	pnfs/blocklayout: pnfs_block_dev_map uses bytes, not sectors Fixup the field types to match their use. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trondmy@gmail.com>	2018-01-25 16:42:35 -05:00
Eric Biggers	49686cbbb3	NFS: reject request for id_legacy key without auxdata nfs_idmap_legacy_upcall() is supposed to be called with 'aux' pointing to a 'struct idmap', via the call to request_key_with_auxdata() in nfs_idmap_request_key(). However it can also be reached via the request_key() system call in which case 'aux' will be NULL, causing a NULL pointer dereference in nfs_idmap_prepare_pipe_upcall(), assuming that the key description is valid enough to get that far. Fix this by making nfs_idmap_legacy_upcall() negate the key if no auxdata is provided. As usual, this bug was found by syzkaller. A simple reproducer using the command-line keyctl program is: keyctl request2 id_legacy uid:0 '' @s Fixes: `57e62324e4` ("NFS: Store the legacy idmapper result in the keyring") Reported-by: syzbot+5dfdbcf7b3eb5912abbb@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> # v3.4+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Trond Myklebust <trondmy@gmail.com>	2018-01-22 10:05:11 -05:00
Jan Chochol	cbebc6ef4f	nfs: Do not convert nfs_idmap_cache_timeout to jiffies Since commit `57e62324e4` ("NFS: Store the legacy idmapper result in the keyring") nfs_idmap_cache_timeout changed units from jiffies to seconds. Unfortunately sysctl interface was not updated accordingly. As a effect updating /proc/sys/fs/nfs/idmap_cache_timeout with some value will incorrectly multiply this value by HZ. Also reading /proc/sys/fs/nfs/idmap_cache_timeout will show real value divided by HZ. Fixes: `57e62324e4` ("NFS: Store the legacy idmapper result in the keyring") Signed-off-by: Jan Chochol <jan@chochol.info> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-18 15:10:47 -05:00
Chuck Lever	06e1902456	nfs: Use proper enum definitions for nfs_show_stable Commit `8224b2734a` ("NFS: Add static NFS I/O tracepoints") had a hack to work around some odd behavior observed with __print_symbolic. I couldn't ever get it to display NFS_FILE_SYNC when using TRACE_DEFINE_ENUM macros to set up the enum values. I tracked down the actual bug that forced me to add the workaround. That issue will be addressed soon, so replace the hack with a proper implementation. Fixes: `8224b2734a` ("NFS: Add static NFS I/O tracepoints") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-18 15:01:22 -05:00
Tigran Mkrtchyan	7ff4cff637	nfs41: do not return ENOMEM on LAYOUTUNAVAILABLE A pNFS server may return LAYOUTUNAVAILABLE error on LAYOUTGET for files which don't have any layout. In this situation pnfs_update_layout currently returns NULL. As this NULL is converted into ENOMEM, IO requests fails instead of falling back to MDS. Do not return ENOMEM on LAYOUTUNAVAILABLE and let client retry through MDS. Fixes `8d40b0f148`. I will suggest to backport this fix to affected stable branches. Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de> [trondmy: Use IS_ERR_OR_NULL()] Fixes: `8d40b0f148` ("NFS filelayout:call GETDEVICEINFO after...") Cc: stable@vger.kernel.org # v4.11+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-18 12:51:31 -05:00
J. Bruce Fields	1b8d97b0a8	NFS: commit direct writes even if they fail partially If some of the WRITE calls making up an O_DIRECT write syscall fail, we neglect to commit, even if some of the WRITEs succeed. We also depend on the commit code to free the reference count on the nfs_page taken in the "if (request_commit)" case at the end of nfs_direct_write_completion(). The problem was originally noticed because ENOSPC's encountered partway through a write would result in a closed file being sillyrenamed when it should have been unlinked. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-16 10:13:23 -05:00
Arnd Bergmann	f96adf1ea0	nfs: remove unused label in nfs_encode_fh() The only reference to the label got removed, so we now get a harmless compiler warning: fs/nfs/export.c: In function 'nfs_encode_fh': fs/nfs/export.c:58:1: error: label 'out' defined but not used [-Werror=unused-label] Fixes: `aaa1500894` ("nfs: remove dead code from nfs_encode_fh()") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-16 10:12:49 -05:00
Chuck Lever	801b564309	nfs: Update server port after referral or migration After traversing a referral or recovering from a migration event, ensure that the server port reported in /proc/mounts is updated to the correct port setting for the new submount. Reported-by: Helen Chao <helen.chao@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-14 23:06:30 -05:00
Chuck Lever	530ea42192	nfs: Referrals should use the same proto setting as their parent Helen Chao <helen.chao@oracle.com> noticed that when a user traverses a referral on an NFS/RDMA mount, the resulting submount always uses TCP. This behavior does not match the vers= setting when traversing a referral (vers=4.1 is preserved). It also does not match the behavior of crossing from the pseudofs into a real filesystem (proto=rdma is preserved in that case). The Linux NFS client does not currently support the fs_locations_info attribute. The situation is similar for all NFSv4 servers I know of. Therefore until the community has broad support for fs_locations_info, when following a referral: - First try to connect with RPC-over-RDMA. This will fail quickly if the client has no RDMA-capable interfaces. - If connecting with RPC-over-RDMA fails, or the RPC-over-RDMA transport is not available, use TCP. Reported-by: Helen Chao <helen.chao@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-14 23:06:30 -05:00
Scott Mayhew	ba4a76f703	nfs/pnfs: fix nfs_direct_req ref leak when i/o falls back to the mds Currently when falling back to doing I/O through the MDS (via pnfs_{read\|write}_through_mds), the client frees the nfs_pgio_header without releasing the reference taken on the dreq via pnfs_generic_pg_{read\|write}pages -> nfs_pgheader_init -> nfs_direct_pgio_init. It then takes another reference on the dreq via nfs_generic_pg_pgios -> nfs_pgheader_init -> nfs_direct_pgio_init and as a result the requester will become stuck in inode_dio_wait. Once that happens, other processes accessing the inode will become stuck as well. Ensure that pnfs_read_through_mds() and pnfs_write_through_mds() clean up correctly by calling hdr->completion_ops->completion() instead of calling hdr->release() directly. This can be reproduced (sometimes) by performing "storage failover takeover" commands on NetApp filer while doing direct I/O from a client. This can also be reproduced using SystemTap to simulate a failure while doing direct I/O from a client (from Dave Wysochanski <dwysocha@redhat.com>): stap -v -g -e 'probe module("nfs_layout_nfsv41_files").function("nfs4_fl_prepare_ds").return { $return=NULL; exit(); }' Suggested-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Scott Mayhew <smayhew@redhat.com> Fixes: `1ca018d28d` ("pNFS: Fix a memory leak when attempted pnfs fails") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-14 23:06:29 -05:00
Benjamin Coddington	b3dce6a2f0	pnfs/blocklayout: handle transient devices PNFS block/SCSI layouts should gracefully handle cases where block devices are not available when a layout is retrieved, or the block devices are removed while the client holds a layout. While setting up a layout segment, keep a record of an unavailable or un-parsable block device in cache with a flag so that subsequent layouts do not spam the server with GETDEVINFO. We can reuse the current NFS_DEVICEID_UNAVAILABLE handling with one variation: instead of reusing the device, we will discard it and send a fresh GETDEVINFO after the timeout, since the lookup and validation of the device occurs within the GETDEVINFO response handling. A lookup of a layout segment that references an unavailable device will return a segment with the NFS_LSEG_UNAVAILABLE flag set. This will allow the pgio layer to mark the layout with the appropriate fail bit, which forces subsequent IO to the MDS, and prevents spamming the server with LAYOUTGET, LAYOUTRETURN. Finally, when IO to a block device fails, look up the block device(s) referenced by the pgio header, and mark them as unavailable. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-14 23:06:29 -05:00
Benjamin Coddington	d78471d32b	pnfs/blocklayout: set PNFS_LAYOUTRETURN_ON_ERROR If there's an error doing I/O to block device, and the client resends the I/O to the MDS, the MDS must recall the layout from the client before processing the I/O. Let's preempt that exchange by returning the layout before falling back to the MDS when there's an error. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-14 23:06:29 -05:00
Benjamin Coddington	ad6b0241c9	pnfs/blocklayout: Add module alias for LAYOUT4_SCSI The blocklayout module contains the client support for both block and SCSI layouts. Add a module alias for the SCSI layout type so that the module will be loaded for SCSI layouts. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-14 23:06:29 -05:00
Benjamin Coddington	e545735a32	NFS: remove unused offset arg in nfs_pgio_rpcsetup nfs_pgio_rpcsetup() is always called with an offset of 0, so we should be able to drop the arguement altogether. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-14 23:06:29 -05:00
NeilBrown	dce2630c7d	NFSv4: always set NFS_LOCK_LOST when a lock is lost. There are 2 comments in the NFSv4 code which suggest that SIGLOST should possibly be sent to a process. In these cases a lock has been lost. The current practice is to set NFS_LOCK_LOST so that read/write returns EIO when a lock is lost. So change these comments to code when sets NFS_LOCK_LOST. One case is when lock recovery after apparent server restart fails with NFS4ERR_DENIED, NFS4ERR_RECLAIM_BAD, or NFS4ERRO_RECLAIM_CONFLICT. The other case is when a lock attempt as part of lease recovery fails with NFS4ERR_DENIED. In an ideal world, these should not happen. However I have a packet trace showing an NFSv4.1 session getting NFS4ERR_BADSESSION after an extended network parition. The NFSv4.1 client treats this like server reboot until/unless it get NFS4ERR_NO_GRACE, in which case it switches over to "nograce" recovery mode. In this network trace, the client attempts to recover a lock and the server (incorrectly) reports NFS4ERR_DENIED rather than NFS4ERR_NO_GRACE. This leads to the ineffective comment and the client then continues to write using the OPEN stateid. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-14 23:06:29 -05:00
NeilBrown	aaa1500894	nfs: remove dead code from nfs_encode_fh() This code can never be used as the IS_AUTOMOUNT(inode) case has already been handled. So remove it to avoid confusion. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-14 23:06:29 -05:00
Trond Myklebust	9ccee940bd	Support statx() mask and query flags parameters Support the query flags AT_STATX_FORCE_SYNC by forcing an attribute revalidation, and AT_STATX_DONT_SYNC by returning cached attributes only. Use the mask to optimise away server revalidation for attributes that are not being requested by the user. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-14 23:06:29 -05:00
Trond Myklebust	8634ef5e05	NFS: Fix nfsstat breakage due to LOOKUPP The LOOKUPP operation was inserted into the nfs4_procedures array rather than being appended, which put /proc/net/rpc/nfs out of whack, and broke the nfsstat utility. Fix by moving the LOOKUPP operation to the end of the array, and by ensuring that it keeps the same length whether or not NFSV4.1 and NFSv4.2 are compiled in. Fixes: `5b5faaf6df` ("nfs4: add NFSv4 LOOKUPP handlers") Cc: stable@vger.kernel.org # v4.13+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-14 23:06:29 -05:00
Trond Myklebust	82571552a0	NFSv4: Convert LOCKU to use nfs4_async_handle_exception() Convert CLOSE so that it specifies the correct stateid and inode for the error handling. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-14 23:06:29 -05:00
Trond Myklebust	e0dba0128a	NFSv4: Convert DELEGRETURN to use nfs4_handle_exception() Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-14 23:06:29 -05:00
Trond Myklebust	b8b8d22109	NFSv4: Convert CLOSE to use nfs4_async_handle_exception() Convert CLOSE so that it specifies the correct stateid, state and inode for the error handling. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-14 23:06:28 -05:00
Trond Myklebust	7f1bda447c	NFS: Add a cond_resched() to nfs_commit_release_pages() The commit list can get very large, and so we need a cond_resched() in nfs_commit_release_pages() in order to ensure we don't hog the CPU for excessive periods of time. Reported-by: Mike Galbraith <efault@gmx.de> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2018-01-14 23:06:28 -05:00
Al Viro	6db620012f	nfs4file: get rid of pointless include of btrfs.h should've been killed by "vfs: pull btrfs clone API to vfs layer"... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-12-30 00:03:39 -05:00
Linus Torvalds	d025fbf1a2	NFS client fixes for Linux 4.15-rc4 Stable bugfixes: - NFS: Avoid a BUG_ON() in nfs_commit_inode() by not waiting for a commit in the case that there were no commit requests. - SUNRPC: Fix a race in the receive code path Other fixes: - NFS: Fix a deadlock in nfs client initialization - xprtrdma: Fix a performance regression for small IOs -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAlo0PdMACgkQ18tUv7Cl QOvlUg/+KoXWXNwItHIyyegYgRXcAPpaCtdnCjjOP6R9HEJ+clnLcaqDxdDKVWQ/ oDvEcQcsBpywbUi7vVrvdar4mofwuyjXPpbcZPlDP1Ru4yyAlyylftwIuQW/nzdd vX2tZaVf+B9y1XvSD5NI+2EKWmp7MVrPdNhYxAB39TQZnAAvYDFHhywtZ0UR7vJt 7YVcZoPtKUhg15jhCOr73eaCT0884/tlgedfd6DkDGR6bCtSQC2PySfqq9Lnnl/1 ruDzzcgTARzSEzvta/uyBRspOLBHeeBhTdQUp79lMfekC4+68Tx6DFWnydIUttuE G7LphN6hfbJLF20U/ENb2H8v10WZsKvGEuxM+fp5PXGcIMSlX4qoJUe/egJFiiSL IaikgibvfiKmYSJvwdxTlOcr793X2Ej19HNciNjJQp4pviDOdZixgtGvVVHJBmh6 LYzE5q9jgbW9wQXwTTeWHp/nyqL80NslX0UARYnS2Ua0B96GRCESXqCUFtxK6tKR wbYiHzKc4dOfSxpNlKI+FlX63m5oSAmTEii3ODsWZjObbwYHNX2Zqj2cVFiSLCpv ZXgmpNL+tL2zBWxPvn6rzYhpaXo++PqlHK7vv2QVBI6XM2J8ztpj5Wr5zneRoJaE ejk8nw/mR43bfdQuUGZRKh/Z+FTqL0/2WbDgJMXl09c+zRz7J2c= =XhEC -----END PGP SIGNATURE----- Merge tag 'nfs-for-4.15-3' of git://git.linux-nfs.org/projects/anna/linux-nfs Pull NFS client fixes from Anna Schumaker: "This has two stable bugfixes, one to fix a BUG_ON() when nfs_commit_inode() is called with no outstanding commit requests and another to fix a race in the SUNRPC receive codepath. Additionally, there are also fixes for an NFS client deadlock and an xprtrdma performance regression. Summary: Stable bugfixes: - NFS: Avoid a BUG_ON() in nfs_commit_inode() by not waiting for a commit in the case that there were no commit requests. - SUNRPC: Fix a race in the receive code path Other fixes: - NFS: Fix a deadlock in nfs client initialization - xprtrdma: Fix a performance regression for small IOs" * tag 'nfs-for-4.15-3' of git://git.linux-nfs.org/projects/anna/linux-nfs: SUNRPC: Fix a race in the receive code path nfs: don't wait on commit in nfs_commit_inode() if there were no commit requests xprtrdma: Spread reply processing over more CPUs nfs: fix a deadlock in nfs client initialization	2017-12-16 13:12:53 -08:00

1 2 3 4 5 ...

5125 Commits