Commit Graph

319 Commits

Author SHA1 Message Date
NeilBrown ef8d98f20d NFS: avoid spurious warning of lost lock that is being unlocked.
When the NFSv4 state manager recovers state after a server restart, it
reports that locks have been lost if it finds any lock state for which
recovery hasn't been successful.  i.e. any for which
NFS_LOCK_INITIALIZED is not set.

However it only tries to recover locks that are still linked to
inode->i_flctx.  So if a lock has been removed from inode->i_flctx, but
the state for that lock has not yet been destroyed, then a spurious
warning results.

nfs4_proc_unlck() calls locks_lock_inode_wait() - which removes the lock
from ->i_flctx - before sending the unlock request to the server and
before the final nfs4_put_lock_state() is called.  This allows a window
in which a spurious warning can be produced.

So add a new flag NFS_LOCK_UNLOCKING which is set once the decision has
been made to unlock the lock.  This will prevent it from triggering any
warning.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-12-06 10:45:11 -05:00
Kees Cook cf0d7e7f45 NFS: Avoid memcpy() run-time warning for struct sockaddr overflows
The 'nfs_server' and 'mount_server' structures include a union of
'struct sockaddr' (with the older 16 bytes max address size) and
'struct sockaddr_storage' which is large enough to hold all the
supported sa_family types (128 bytes max size). The runtime memcpy()
buffer overflow checker is seeing attempts to write beyond the 16
bytes as an overflow, but the actual expected size is that of 'struct
sockaddr_storage'. Plumb the use of 'struct sockaddr_storage' more
completely through-out NFS, which results in adjusting the memcpy()
buffers to the correct union members. Avoids this false positive run-time
warning under CONFIG_FORTIFY_SOURCE:

  memcpy: detected field-spanning write (size 28) of single field "&ctx->nfs_server.address" at fs/nfs/namespace.c:178 (size 16)

Reported-by: kernel test robot <yujie.liu@intel.com>
Link: https://lore.kernel.org/all/202210110948.26b43120-yujie.liu@intel.com
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Anna Schumaker <anna@kernel.org>
Cc: linux-nfs@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-27 15:52:10 -04:00
Gaosheng Cui 8aa7cf8524 NFSv4: remove nfs4_renewd_prepare_shutdown() declaration
nfs4_renewd_prepare_shutdown() has been removed since
commit 3050141bae ("NFSv4: Kill nfs4_renewd_prepare_shutdown()"),
so remove it.

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-05 15:29:23 -04:00
NeilBrown 4dc73c6791 NFSv4: keep state manager thread active if swap is enabled
If we are swapping over NFSv4, we may not be able to allocate memory to
start the state-manager thread at the time when we need it.
So keep it always running when swap is enabled, and just signal it to
start.

This requires updating and testing the cl_swapper count on the root
rpc_clnt after following all ->cl_parent links.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2022-03-13 12:59:35 -04:00
Olga Kornievskaia a8d54baba7 NFSv4 handle port presence in fs_location server string
An fs_location attribute returns a string that can be ipv4, ipv6,
or DNS name. An ip location can have a port appended to it and if
no port is present a default port needs to be set. If rpc_pton()
fails to parse, try calling rpc_uaddr2socaddr() that can convert
an universal address.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-01-13 09:36:58 -05:00
Olga Kornievskaia f5b27cc676 NFSv4 expose nfs_parse_server_name function
Make nfs_parse_server_name available outside of nfs4namespace.c.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-01-13 09:36:58 -05:00
Olga Kornievskaia 1976b2b314 NFSv4.1 query for fs_location attr on a new file system
Query the server for other possible trunkable locations for a given
file system on a 4.1+ mount.

v2:
-- added missing static to nfs4_discover_trunking,
reported by the kernel test robot

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-01-13 09:30:48 -05:00
Trond Myklebust 85847280b1 NFSv4: Allow writebacks to request 'blocks used'
When doing a non-pNFS write, allow the writeback code to specify that it
also needs to update 'blocks used'.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-01-06 14:00:21 -05:00
Anna Schumaker 2ef61e0eaa NFS: Remove the nfs4_label from the nfs4_getattr_res
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-11-05 14:54:39 -04:00
Anna Schumaker 76baa2b29c NFS: Remove the f_label from the nfs4_opendata and nfs_openres
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-11-05 14:54:39 -04:00
Trond Myklebust e93a5e9306 NFSv4: Add support for application leases underpinned by a delegation
If the NFSv4 client already holds a delegation for a file, then we can
support application leases (i.e. fcntl(fd, F_SETLEASE,...)) because the
underlying delegation guarantees that the file is not being modified on
the server by another client in a way that might conflict with the lease
guarantees.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-06-13 19:36:28 -04:00
Trond Myklebust be20037725 NFSv4: Fix delegation return in cases where we have to retry
If we're unable to immediately recover all locks because the server is
unable to immediately service our reclaim calls, then we want to retry
after we've finished servicing all the other asynchronous delegation
returns on our queue.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-06-13 19:36:27 -04:00
Trond Myklebust dfe1fe75e0 NFSv4: Fix deadlock between nfs4_evict_inode() and nfs4_opendata_get_inode()
If the inode is being evicted, but has to return a delegation first,
then it can cause a deadlock in the corner case where the server reboots
before the delegreturn completes, but while the call to iget5_locked() in
nfs4_opendata_get_inode() is waiting for the inode free to complete.
Since the open call still holds a session slot, the reboot recovery
cannot proceed.

In order to break the logjam, we can turn the delegation return into a
privileged operation for the case where we're evicting the inode. We
know that in that case, there can be no other state recovery operation
that conflicts.

Reported-by: zhangxiaoxu (A) <zhangxiaoxu5@huawei.com>
Fixes: 5fcdfacc01 ("NFSv4: Return delegations synchronously in evict_inode")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-06-03 10:14:42 -04:00
Benjamin Coddington b4868b44c5 NFSv4: Wait for stateid updates after CLOSE/OPEN_DOWNGRADE
Since commit 0e0cb35b41 ("NFSv4: Handle NFS4ERR_OLD_STATEID in
CLOSE/OPEN_DOWNGRADE") the following livelock may occur if a CLOSE races
with the update of the nfs_state:

Process 1           Process 2           Server
=========           =========           ========
 OPEN file
                    OPEN file
                                        Reply OPEN (1)
                                        Reply OPEN (2)
 Update state (1)
 CLOSE file (1)
                                        Reply OLD_STATEID (1)
 CLOSE file (2)
                                        Reply CLOSE (-1)
                    Update state (2)
                    wait for state change
 OPEN file
                    wake
 CLOSE file
 OPEN file
                    wake
 CLOSE file
 ...
                    ...

We can avoid this situation by not issuing an immediate retry with a bumped
seqid when CLOSE/OPEN_DOWNGRADE receives NFS4ERR_OLD_STATEID.  Instead,
take the same approach used by OPEN and wait at least 5 seconds for
outstanding stateid updates to complete if we can detect that we're out of
sequence.

Note that after this change it is still possible (though unlikely) that
CLOSE waits a full 5 seconds, bumps the seqid, and retries -- and that
attempt races with another OPEN at the same time.  In order to avoid this
race (which would result in the livelock), update
nfs_need_update_open_stateid() to handle the case where:
 - the state is NFS_OPEN_STATE, and
 - the stateid doesn't match the current open stateid

Finally, nfs_need_update_open_stateid() is modified to be idempotent and
renamed to better suit the purpose of signaling that the stateid passed
is the next stateid in sequence.

Fixes: 0e0cb35b41 ("NFSv4: Handle NFS4ERR_OLD_STATEID in CLOSE/OPEN_DOWNGRADE")
Cc: stable@vger.kernel.org # v5.4+
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-10-02 08:43:09 -04:00
Linus Torvalds 37711e5e23 NFS client updates for Linux 5.9
Highlights include:
 
 Stable fixes:
 - pNFS: Don't return layout segments that are being used for I/O
 - pNFS: Don't move layout segments off the active list when being used for I/O
 
 Features:
 - NFS: Add support for user xattrs through the NFSv4.2 protocol
 - NFS: Allow applications to speed up readdir+statx() using AT_STATX_DONT_SYNC
 - NFSv4.0 allow nconnect for v4.0
 
 Bugfixes and cleanups:
 - nfs: ensure correct writeback errors are returned on close()
 - nfs: nfs_file_write() should check for writeback errors
 - nfs: Fix getxattr kernel panic and memory overflow
 - NFS: Fix the pNFS/flexfiles mirrored read failover code
 - SUNRPC: dont update timeout value on connection reset
 - freezer: Add unsafe versions of freezable_schedule_timeout_interruptible for NFS
 - sunrpc: destroy rpc_inode_cachep after unregister_filesystem
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESQctxSBg8JpV8KqEZwvnipYKAPIFAl83CYEACgkQZwvnipYK
 APLx4g/9EQNTG5VkUToElRHs4b3vFxbmh9Odnk/JwPHxY5GQ8/AyGqwWHBgMZc7/
 2AV/C83pk7pJsDNsKbVAaFCT1cwjmItHM63vKJYGBYbE8LhkZ/1sYkdtPBYwHoVl
 7CWfpVY/4NjYw5GJfrVA5Y0m7lrQInRtIMzfaENw2tpw+/cKUpadxgEJltzFNvpa
 Ploinr1ZRBl1tvfeHNRP5ZMPk2AfgGWtQKQ/b2UWUk5tXALoQm2Eu+/oku39uqhy
 hW5tCbU2BzR91gg5JwF9n7VowkCHXfe7lMzDBTVfwZOELOmoyys/1wKv550FWcWi
 yymljWiPGZOnXGT1vfKptPESQjdtElMfanvEZ0BzS+yNR0HZGnIupaxGlYlG9ZGU
 2sXHQPp3mk2Q+L1IgbTSCnSju0YlZo32JQpYCZiROjIXnPWPQ50YNhr8GL18M1FW
 hTeShg2avWH+59GB6moEBmsuvui7Dy1jkimblToLEoGJ4kbvEl72FYSqTCkAXXbB
 rVUzhmJFgfk/EOS4d+QKJoBqNzw3aw79wyT7PLkoCYBqPZBHQexlmI+ktMbgUEdw
 c/fM7l5/Vcb9weIHKzul2Jbk5q6bFME/xPnIkr3v/oKIFlLFwQ04BX6R/42AMJHw
 V5Q9Wp81Vy6RXQMn8P0ZMeY0WQC/rhpijOEVMUC+Ni+spz44AdM=
 =k4pE
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-5.9-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
 "Stable fixes:
   - pNFS: Don't return layout segments that are being used for I/O
   - pNFS: Don't move layout segments off the active list when being used for I/O

  Features:
   - NFS: Add support for user xattrs through the NFSv4.2 protocol
   - NFS: Allow applications to speed up readdir+statx() using AT_STATX_DONT_SYNC
   - NFSv4.0 allow nconnect for v4.0

  Bugfixes and cleanups:
   - nfs: ensure correct writeback errors are returned on close()
   - nfs: nfs_file_write() should check for writeback errors
   - nfs: Fix getxattr kernel panic and memory overflow
   - NFS: Fix the pNFS/flexfiles mirrored read failover code
   - SUNRPC: dont update timeout value on connection reset
   - freezer: Add unsafe versions of freezable_schedule_timeout_interruptible for NFS
   - sunrpc: destroy rpc_inode_cachep after unregister_filesystem"

* tag 'nfs-for-5.9-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (32 commits)
  NFS: Fix flexfiles read failover
  fs: nfs: delete repeated words in comments
  rpc_pipefs: convert comma to semicolon
  nfs: Fix getxattr kernel panic and memory overflow
  NFS: Don't return layout segments that are in use
  NFS: Don't move layouts to plh_return_segs list while in use
  NFS: Add layout segment info to pnfs read/write/commit tracepoints
  NFS: Add tracepoints for layouterror and layoutstats.
  NFS: Report the stateid + status in trace_nfs4_layoutreturn_on_close()
  SUNRPC dont update timeout value on connection reset
  nfs: nfs_file_write() should check for writeback errors
  nfs: ensure correct writeback errors are returned on close()
  NFSv4.2: xattr cache: get rid of cache discard work queue
  NFS: remove redundant initialization of variable result
  NFSv4.0 allow nconnect for v4.0
  freezer: Add unsafe versions of freezable_schedule_timeout_interruptible for NFS
  sunrpc: destroy rpc_inode_cachep after unregister_filesystem
  NFSv4.2: add client side xattr caching.
  NFSv4.2: hook in the user extended attribute handlers
  NFSv4.2: add the extended attribute proc functions.
  ...
2020-08-15 08:26:55 -07:00
Ahmed S. Darwish 76246c9219 NFSv4: Use sequence counter with associated spinlock
A sequence counter write side critical section must be protected by some
form of locking to serialize writers. A plain seqcount_t does not
contain the information of which lock must be held when entering a write
side critical section.

Use the new seqcount_spinlock_t data type, which allows to associate a
spinlock with the sequence counter. This enables lockdep to verify that
the spinlock used for writer serialization is held when the write side
critical section is entered.

If lockdep is disabled this lock association is compiled out and has
neither storage size nor runtime overhead.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200720155530.1173732-22-a.darwish@linutronix.de
2020-07-29 16:14:28 +02:00
Frank van der Linden 95ad37f90c NFSv4.2: add client side xattr caching.
Implement client side caching for NFSv4.2 extended attributes. The cache
is a per-inode hashtable, with name/value entries. There is one special
entry for the listxattr cache.

NFS inodes have a pointer to a cache structure. The cache structure is
allocated on demand, freed when the cache is invalidated.

Memory shrinkers keep the size in check. Large entries (> PAGE_SIZE)
are collected by a separate shrinker, and freed more aggressively
than others.

Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-07-13 17:52:46 -04:00
Frank van der Linden ccde1e9c01 nfs: make the buf_to_pages_noslab function available to the nfs code
Make the buf_to_pages_noslab function available to the rest of the NFS
code. Rename it to nfs4_buf_to_pages_noslab to be consistent.

This will be used later in the NFSv4.2 xattr code.

Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-07-13 17:52:45 -04:00
Frank van der Linden 1b523ca972 nfs: modify update_changeattr to deal with regular files
Until now, change attributes in change_info form were only returned by
directory operations. However, they are also used for the RFC 8276
extended attribute operations, which work on both directories
and regular files.  Modify update_changeattr to deal:

* Rename it to nfs4_update_changeattr and make it non-static.
* Don't always use INO_INVALID_DATA, this isn't needed for a
  directory that only had its extended attributes changed by us.
* Existing callers now always pass in INO_INVALID_DATA.

For the current callers of this function, behavior is unchanged.

Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-07-13 17:52:45 -04:00
Frank van der Linden 04a5da690e NFSv4.2: define limits and sizes for user xattr handling
Set limits for extended attributes (attribute value size and listxattr
buffer size), based on the fs-independent limits (XATTR_*_MAX).

Define the maximum XDR sizes for the RFC 8276 XATTR operations.
In the case of operations that carry a larger payload (SETXATTR,
GETXATTR, LISTXATTR), these exclude that payload, which is added
as separate pages, like other operations do.

Define, much like for read and write operations, the maximum overhead
sizes for get/set/listxattr, and use them to limit the maximum payload
size for those operations, in combination with the channel attributes.

Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-07-13 17:52:45 -04:00
Trond Myklebust b5fdf8418c NFSv4: Add support for CB_RECALL_ANY for flexfiles layouts
When we receive a CB_RECALL_ANY that asks us to return flexfiles
layouts, we iterate through all the layouts and look at whether or
not there are active open file descriptors that might need them
for I/O. If there are no such descriptors, we return the layouts.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-03-16 08:34:30 -04:00
Robert Milkowski 7dc2993a9e NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewals
Currently, each time nfs4_do_fsinfo() is called it will do an implicit
NFS4 lease renewal, which is not compliant with the NFS4 specification.
This can result in a lease being expired by an NFS server.

Commit 83ca7f5ab3 ("NFS: Avoid PUTROOTFH when managing leases")
introduced implicit client lease renewal in nfs4_do_fsinfo(),
which can result in the NFSv4.0 lease to expire on a server side,
and servers returning NFS4ERR_EXPIRED or NFS4ERR_STALE_CLIENTID.

This can easily be reproduced by frequently unmounting a sub-mount,
then stat'ing it to get it mounted again, which will delay or even
completely prevent client from sending RENEW operations if no other
NFS operations are issued. Eventually nfs server will expire client's
lease and return an error on file access or next RENEW.

This can also happen when a sub-mount is automatically unmounted
due to inactivity (after nfs_mountpoint_expiry_timeout), then it is
mounted again via stat(). This can result in a short window during
which client's lease will expire on a server but not on a client.
This specific case was observed on production systems.

This patch removes the implicit lease renewal from nfs4_do_fsinfo().

Fixes: 83ca7f5ab3 ("NFS: Avoid PUTROOTFH when managing leases")
Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-02-04 12:27:55 -05:00
Trond Myklebust f7b37b8b13 NFS: Add softreval behaviour to nfs_lookup_revalidate()
If the server is unavaliable, we want to allow the revalidating
lookup to time out, and to default to validating the cached dentry
if the 'softreval' mount option is set.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-24 16:51:13 -05:00
David Howells f2aedb713c NFS: Add fs_context support.
Add filesystem context support to NFS, parsing the options in advance and
attaching the information to struct nfs_fs_context.  The highlights are:

 (*) Merge nfs_mount_info and nfs_clone_mount into nfs_fs_context.  This
     structure represents NFS's superblock config.

 (*) Make use of the VFS's parsing support to split comma-separated lists

 (*) Pin the NFS protocol module in the nfs_fs_context.

 (*) Attach supplementary error information to fs_context.  This has the
     downside that these strings must be static and can't be formatted.

 (*) Remove the auxiliary file_system_type structs since the information
     necessary can be conveyed in the nfs_fs_context struct instead.

 (*) Root mounts are made by duplicating the config for the requested mount
     so as to have the same parameters.  Submounts pick up their parameters
     from the parent superblock.

[AV -- retrans is u32, not string]
[SM -- Renamed cfg to ctx in a few functions in an earlier patch]
[SM -- Moved fs_context mount option parsing to an earlier patch]
[SM -- Moved fs_context error logging to a later patch]
[SM -- Fixed printks in nfs4_try_get_tree() and nfs4_get_referral_tree()]
[SM -- Added is_remount_fc() helper]
[SM -- Deferred some refactoring to a later patch]
[SM -- Fixed referral mounts, which were broken in the original patch]
[SM -- Fixed leak of nfs_fattr when fs_context is freed]

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:17 -05:00
Al Viro a55d3297be nfs: don't bother passing nfs_subversion to ->try_mount() and nfs_fs_mount_common()
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15 10:15:16 -05:00
Chuck Lever 21f86d2d63 NFS4: Trace lock reclaims
One of the most frustrating messages our sustaining team sees is
the "Lock reclaim failed!" message. Add some observability in the
client's lock reclaim logic so we can capture better data the
first time a problem occurs.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-18 11:04:32 +01:00
Ben Dooks d49dd11753 NFSv4: add declaration of current_stateid
The current_stateid is exported from nfs4state.c but not
declared in any of the headers. Add to nfs4_fs.h to
remove the following warning:

fs/nfs/nfs4state.c:80:20: warning: symbol 'current_stateid' was not declared. Should it be static?

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-18 10:36:45 +01:00
Trond Myklebust 807ce06c24 Merge branch 'linux-ssc-for-5.5' 2019-11-06 08:55:23 -05:00
Trond Myklebust 42c304c34e NFS: nfs_inode_find_state_and_recover() fix stateid matching
In nfs_inode_find_state_and_recover() we want to mark for recovery
only those stateids that match or are older than the supplied
stateid parameter.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-11-03 21:28:46 -05:00
Olga Kornievskaia 0e65a32c8a NFS: handle source server reboot
When the source server reboots after a server-to-server copy was
issued, we need to retry the copy from COPY_NOTIFY. We need to
detect that the source server rebooted and there is a copy waiting
on a destination server and wake it up.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
2019-10-09 12:06:19 -04:00
Olga Kornievskaia 0b9018b9ca NFS: skip recovery of copy open on dest server
Mark the open created for the source file on the destination
server. Then if this open is going thru a recovery, then fail
the recovery as we don't need to be recoving a "fake" open.
We need to fail the ongoing READs and vfs_copy_file_range().

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
2019-10-09 12:05:56 -04:00
Olga Kornievskaia ec4b092508 NFS: inter ssc open
NFSv4.2 inter server to server copy requires the destination server to
READ the data from the source server using the provided stateid and
file handle.

Given an NFSv4 stateid and filehandle from the COPY operaion, provide the
destination server with an NFS client function to create a struct file
suitable for the destiniation server to READ the data to be copied.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
2019-10-09 12:05:52 -04:00
Olga Kornievskaia 0491567b51 NFS: add COPY_NOTIFY operation
Try using the delegation stateid, then the open stateid.

Only NL4_NETATTR, No support for NL4_NAME and NL4_URL.
Allow only one source server address to be returned for now.

To distinguish between same server copy offload ("intra") and
a copy between different server ("inter"), do a check of server
owner identity and also make sure server is capable of doing
a copy offload.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
2019-10-09 12:05:45 -04:00
Trond Myklebust 0e0cb35b41 NFSv4: Handle NFS4ERR_OLD_STATEID in CLOSE/OPEN_DOWNGRADE
If a CLOSE or OPEN_DOWNGRADE operation receives a NFS4ERR_OLD_STATEID
then bump the seqid before resending. Ensure we only bump the seqid
by 1.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-09-20 15:56:19 -04:00
Trond Myklebust 9228395709 NFSv4: Add a helper to increment stateid seqids
Add a helper function to increment stateid seqids according to the
rules specified in RFC5661 Section 8.2.2.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-09-20 15:44:45 -04:00
Trond Myklebust c77e22834a NFSv4: Fix a potential sleep while atomic in nfs4_do_reclaim()
John Hubbard reports seeing the following stack trace:

nfs4_do_reclaim
   rcu_read_lock /* we are now in_atomic() and must not sleep */
       nfs4_purge_state_owners
           nfs4_free_state_owner
               nfs4_destroy_seqid_counter
                   rpc_destroy_wait_queue
                       cancel_delayed_work_sync
                           __cancel_work_timer
                               __flush_work
                                   start_flush_work
                                       might_sleep:
                                        (kernel/workqueue.c:2975: BUG)

The solution is to separate out the freeing of the state owners
from nfs4_purge_state_owners(), and perform that outside the atomic
context.

Reported-by: John Hubbard <jhubbard@nvidia.com>
Fixes: 0aaaf5c424 ("NFS: Cache state owners after files are closed")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-04 22:35:40 -04:00
Donald Buczek 0efb01b2ac nfs4: Make nfs4_proc_get_lease_time available for nfs4.0
Compile nfs4_proc_get_lease_time, enc_get_lease_time and
dec_get_lease_time for nfs4.0. Use nfs4_sequence_done instead of
nfs41_sequence_done in nfs4_proc_get_lease_time,

Signed-off-by: Donald Buczek <buczek@molgen.mpg.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-07-13 11:48:41 -04:00
Trond Myklebust 0688e64bc6 NFS: Allow signal interruption of NFS4ERR_DELAYed operations
If the server is unable to immediately execute an RPC call, and returns
an NFS4ERR_DELAY then we can assume it is safe to interrupt the operation
in order to handle ordinary signals. This allows the application to
service timer interrupts that would otherwise have to wait until the
server is again able to respond.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-04-25 14:18:14 -04:00
Santosh kumar pradhan 10e037d1e0 sunrpc: Add xprt after nfs4_test_session_trunk()
Multipathing: In case of NFSv3, rpc_clnt_test_and_add_xprt() adds
the xprt to xprt switch (i.e. xps) if rpc_call_null_helper() returns
success. But in case of NFSv4.1, it needs to do EXCHANGEID to verify
the path along with check for session trunking.

Add the xprt in nfs4_test_session_trunk() only when
nfs4_detect_session_trunking() returns success. Also release refcount
hold by rpc_clnt_setup_test_and_add_xprt().

Signed-off-by: Santosh kumar pradhan <santoshkumar.pradhan@wdc.com>
Tested-by: Suresh Jayaraman <suresh.jayaraman@wdc.com>
Reported-by: Aditya Agnihotri <aditya.agnihotri@wdc.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-01-02 12:05:19 -05:00
NeilBrown a52458b48a NFS/NFSD/SUNRPC: replace generic creds with 'struct cred'.
SUNRPC has two sorts of credentials, both of which appear as
"struct rpc_cred".
There are "generic credentials" which are supplied by clients
such as NFS and passed in 'struct rpc_message' to indicate
which user should be used to authorize the request, and there
are low-level credentials such as AUTH_NULL, AUTH_UNIX, AUTH_GSS
which describe the credential to be sent over the wires.

This patch replaces all the generic credentials by 'struct cred'
pointers - the credential structure used throughout Linux.

For machine credentials, there is a special 'struct cred *' pointer
which is statically allocated and recognized where needed as
having a special meaning.  A look-up of a low-level cred will
map this to a machine credential.

Signed-off-by: NeilBrown <neilb@suse.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19 13:52:46 -05:00
NeilBrown 5e16923b43 NFS/SUNRPC: don't lookup machine credential until rpcauth_bindcred().
When NFS creates a machine credential, it is a "generic" credential,
not tied to any auth protocol, and is really just a container for
the princpal name.
This doesn't get linked to a genuine credential until rpcauth_bindcred()
is called.
The lookup always succeeds, so various places that test if the machine
credential is NULL, are pointless.

As a step towards getting rid of generic credentials, this patch gets
rid of generic machine credentials.  The nfs_client and rpc_client
just hold a pointer to a constant principal name.
When a machine credential is wanted, a special static 'struct rpc_cred'
pointer is used. rpcauth_bindcred() recognizes this, finds the
principal from the client, and binds the correct credential.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19 13:52:45 -05:00
NeilBrown f15e1e8bc6 NFSv4: don't require lock for get_renew_cred or get_machine_cred
This lock is no longer necessary.

If nfs4_get_renew_cred() needs to hunt through the open-state
creds for a user cred, it still takes the lock to stablize
the rbtree, but otherwise there are no races.

Note that this completely removes the lock from nfs4_renew_state().
It appears that the original need for the locking here was removed
long ago, and there is no longer anything to protect.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-19 13:52:45 -05:00
Trond Myklebust aeabb3c961 NFSv4: Fix a NFSv4 state manager deadlock
Fix a deadlock whereby the NFSv4 state manager can get stuck in the
delegation return code, waiting for a layout return to complete in
another thread. If the server reboots before that other thread
completes, then we need to be able to start a second state
manager thread in order to perform recovery.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-11-19 20:11:45 -05:00
Trond Myklebust ace9fad43a NFSv4: Convert struct nfs4_state to use refcount_t
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30 15:35:17 -04:00
Trond Myklebust 9ae075fdd1 NFSv4: Convert open state lookup to use RCU
Further reduce contention on the inode->i_lock.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30 15:35:17 -04:00
Linus Torvalds 53a01c9a5f NFS client updates for Linux 4.19
Stable bufixes:
 - v3.17+: Fix an off-by-one in bl_map_stripe()
 - v4.9+: NFSv4 client live hangs after live data migration recovery
 - v4.18+: xprtrdma: Fix disconnect regression
 - v4.14+: Fix locking in pnfs_generic_recover_commit_reqs
 - v4.9+: Fix a sleep in atomic context in nfs4_callback_sequence()
 
 Features:
 - Add support for asynchronous server-side COPY operations
 
 Other bugfixes and cleanups:
 - Optitmizations and fixes involving NFS v4.1 / pNFS layout handling
 - Optimize lseek(fd, SEEK_CUR, 0) on directories to avoid locking
 - Immediately reschedule writeback when the server replies with an error
 - Fix excessive attribute revalidation in nfs_execute_ok()
 - Add error checking to nfs_idmap_prepare_message()
 - Use new vm_fault_t return type
 - Return a delegation when reclaiming one that the server has recalled
 - Referrals should inherit proto setting from parents
 - Make rpc_auth_create_args a const
 - Improvements to rpc_iostats tracking
 - Fix a potential reference leak when there is an error processing a callback
 - Fix rmdir / mkdir / rename nlink accounting
 - Fix updating inode change attribute
 - Fix error handling in nfsn4_sp4_select_mode()
 - Use an appropriate work queue for direct-write completion
 - Don't busy wait if NFSv4 session draining is interrupted
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAlt/CYIACgkQ18tUv7Cl
 QOu8gBAA0xQWmgRoG6oIdYUxvgYqhuJmMqC4SU1E6mCJ93xEuUSvEFw51X+84KCt
 r6UPkp/bKiVe3EIinKTplIzuxgggXNG0EQmO46FYNTl7nqpN85ffLsQoWsiD23fp
 j8afqKPFR2zfhHXLKQC7k1oiOpwGqJ+EJWgIW4llE80pSNaErEoEaDqSPds5thMN
 dHEjjLr8ef6cbBux6sSPjwWGNbE82uoSu3MDuV2+e62hpGkgvuEYo1vyE6ujeZW5
 MUsmw+AHZkwro0msTtNBOHcPZAS0q/2UMPzl1tsDeCWNl2mugqZ6szQLSS2AThKq
 Zr6iK9Q5dWjJfrQHcjRMnYJB+SCX1SfPA7ASuU34opwcWPjecbS9Q92BNTByQYwN
 o9ngs2K0mZfqpYESMAmf7Il134cCBrtEp3skGko2KopJcYcE5YUFhdKihi1yQQjU
 UbOOubMpQk8vY9DpDCAwGbICKwUZwGvq27uuUWL20kFVDb1+jvfHwcV4KjRAJo/E
 J9aFtU+qOh4rMPMnYlEVZcAZBGfenlv/DmBl1upRpjzBkteUpUJsAbCmGyAk4616
 3RECasehgsjNCQpFIhv3FpUkWzP5jt0T3gRr1NeY6WKJZwYnHEJr9PtapS+EIsCT
 tB5DvvaJqFtuHFOxzn+KlGaxdSodHF7klOq7NM3AC0cX8AkWqaU=
 =8+9t
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-4.19-1' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client updates from Anna Schumaker:
 "These patches include adding async support for the v4.2 COPY
  operation. I think Bruce is planning to send the server patches for
  the next release, but I figured we could get the client side out of
  the way now since it's been in my tree for a while. This shouldn't
  cause any problems, since the server will still respond with
  synchronous copies even if the client requests async.

  Features:
   - Add support for asynchronous server-side COPY operations

  Stable bufixes:
   - Fix an off-by-one in bl_map_stripe() (v3.17+)
   - NFSv4 client live hangs after live data migration recovery (v4.9+)
   - xprtrdma: Fix disconnect regression (v4.18+)
   - Fix locking in pnfs_generic_recover_commit_reqs (v4.14+)
   - Fix a sleep in atomic context in nfs4_callback_sequence() (v4.9+)

  Other bugfixes and cleanups:
   - Optimizations and fixes involving NFS v4.1 / pNFS layout handling
   - Optimize lseek(fd, SEEK_CUR, 0) on directories to avoid locking
   - Immediately reschedule writeback when the server replies with an
     error
   - Fix excessive attribute revalidation in nfs_execute_ok()
   - Add error checking to nfs_idmap_prepare_message()
   - Use new vm_fault_t return type
   - Return a delegation when reclaiming one that the server has
     recalled
   - Referrals should inherit proto setting from parents
   - Make rpc_auth_create_args a const
   - Improvements to rpc_iostats tracking
   - Fix a potential reference leak when there is an error processing a
     callback
   - Fix rmdir / mkdir / rename nlink accounting
   - Fix updating inode change attribute
   - Fix error handling in nfsn4_sp4_select_mode()
   - Use an appropriate work queue for direct-write completion
   - Don't busy wait if NFSv4 session draining is interrupted"

* tag 'nfs-for-4.19-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (54 commits)
  pNFS: Remove unwanted optimisation of layoutget
  pNFS/flexfiles: ff_layout_pg_init_read should exit on error
  pNFS: Treat RECALLCONFLICT like DELAY...
  pNFS: When updating the stateid in layoutreturn, also update the recall range
  NFSv4: Fix a sleep in atomic context in nfs4_callback_sequence()
  NFSv4: Fix locking in pnfs_generic_recover_commit_reqs
  NFSv4: Fix a typo in nfs4_init_channel_attrs()
  NFSv4: Don't busy wait if NFSv4 session draining is interrupted
  NFS recover from destination server reboot for copies
  NFS add a simple sync nfs4_proc_commit after async COPY
  NFS handle COPY ERR_OFFLOAD_NO_REQS
  NFS send OFFLOAD_CANCEL when COPY killed
  NFS export nfs4_async_handle_error
  NFS handle COPY reply CB_OFFLOAD call race
  NFS add support for asynchronous COPY
  NFS COPY xdr handle async reply
  NFS OFFLOAD_CANCEL xdr
  NFS CB_OFFLOAD xdr
  NFS: Use an appropriate work queue for direct-write completion
  NFSv4: Fix error handling in nfs4_sp4_select_mode()
  ...
2018-08-23 16:03:58 -07:00
Olga Kornievskaia e4648aa4f9 NFS recover from destination server reboot for copies
Mark the destination state to indicate a server-side copy is
happening. On detecting a reboot and recovering open state check
if any state is engaged in a server-side copy, if so, find the
copy and mark it and then signal the waiting thread. Upon wakeup,
if copy was marked then propage EAGAIN to the nfsd_copy_file_range
and restart the copy from scratch.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-08-13 17:04:23 -04:00
Olga Kornievskaia 6b8d84e2f1 NFS add a simple sync nfs4_proc_commit after async COPY
A COPY with unstable write data needs a simple sync commit.
Filehandle value is gotten as a part of the inner loop so in
case of a reboot retry it should get the new value.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-08-09 12:56:39 -04:00
Olga Kornievskaia 0f913a57d7 NFS export nfs4_async_handle_error
Make this function available to nfs42proc.c

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-08-09 12:56:39 -04:00
Al Viro 44907d7900 get rid of 'opened' argument of ->atomic_open() - part 3
now it can be done...

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-07-12 10:04:20 -04:00