NFSv4.1 LAYOUTRETURN implementation
Currently, does not support layout-type payload encoding.
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Zhang Jingwang <zhangjingwang@nrchpc.ac.cn>
[call pnfs_return_layout right before pnfs_destroy_layout]
[remove assert_spin_locked from pnfs_clear_lseg_list]
[remove wait parameter from the layoutreturn path.]
[remove return_type field from nfs4_layoutreturn_args]
[remove range from nfs4_layoutreturn_args]
[no need to send layoutcommit from _pnfs_return_layout]
[don't wait on sync layoutreturn]
[fix layout stateid in layoutreturn args]
[fixed NULL deref in _pnfs_return_layout]
[removed recaim member of nfs4_layoutreturn_args]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Non-rpc layout driver such as for objects and blocks
implement their own I/O path and error handling logic.
Therefore bypass NFS-based error handling for these layout drivers.
[fix lseg ref-count bugs, and null de-refs]
[Fall out from: non-rpc layout drivers]
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
[get rid of PNFS_USE_RPC_CODE]
[get rid of __nfs4_write_done_cb]
[revert useless change in nfs4_write_done_cb]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Currently, writebacks may end up recursing back into the filesystem due to
GFP_KERNEL direct reclaims in the pnfs subsystem.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
When readdir() returns a directory entry for the root of a mounted
filesystem, Linux follows the old convention of returning the inode
number of the covered directory (despite newer versions of POSIX declaring
that this is a bug).
To ensure this continues to work, the NFSv4 readdir implementation requests
the 'mounted-on-fileid' from the server.
However, readdirplus also needs to instantiate an inode for this entry, and
for that, we also need to request the real fileid as per this patch.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
changes LAYOUTGET and GETDEVICEINFO XDR parsing to:
- not use vmap, which doesn't work on incoherent archs
- use xdr_stream parsing for all xdr
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
A submount may use different security than the parent
mount does. We should figure out what sec flavor the
submount uses at mount time.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
A later patch will need to perform a lookup using an
alternate client with a different security flavor.
This patch adds support for doing that on NFS v4.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The filelayout driver sends LAYOUTCOMMIT only when COMMIT goes to
the data server (as opposed to the MDS) and the data server WRITE
is not NFS_FILE_SYNC.
Only whole file layout support means that there is only one IOMODE_RW layout
segment.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Mingyang Guo <guomingyang@nrchpc.ac.cn>
Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn>
Signed-off-by: Zhang Jingwang <zhangjingwang@nrchpc.ac.cn>
Tested-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Implement all the hooks created in the previous patches.
This requires exporting quite a few functions and adding a few
structure fields.
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Commit 45a52a0207 (NFS move nfs_client
initialization into nfs_get_client) introduces a new warning in
fs/nfs/idmap.c:
‘struct rpc_timeout’ declared inside parameter list
Fix it by adding a forward declaration for the struct rpc_timeout
in include/linux/nfs_xdr.h
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Allows the pnfs filelayout driver to write to the data servers.
Note that COMMIT to data servers will be implemented in a future
patch. To avoid improper behavior, for the moment any WRITE to a data
server that would also require a COMMIT to the data server is sent
NFS_FILE_SYNC.
Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Mingyang Guo <guomingyang@nrchpc.ac.cn>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We grab the lseg sent in from the doio function and attach it to
each struct nfs_write_data created. This is how the lseg will be
sent to the layout driver.
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Add callback that pnfs layout driver can use to do its own handling
of data server WRITE response.
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Use our own async error handler.
Mark the layout as failed and retry i/o through the MDS on specified errors.
Update the mds_offset in nfs_readpage_retry so that a failed short-read retry
to a DS gets correctly resent through the MDS.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Attempt a pNFS file layout read by setting up the nfs_read_data struct and
calling nfs_initiate_read with the data server rpc client and the
filelayout rpc call ops.
Error handling is implemented in a subsequent patch.
Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Mingyang Guo <guomingyang@nrchpc.ac.cn>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Tested-by: Guo Mingyang <guomingyang@nrchpc.ac.cn>
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Introduce a data server set_client and init session following the
nfs4_set_client and nfs4_init_session convention.
Once a new nfs_client is on the nfs_client_list, the nfs_client cl_cons_state
serializes access to creating an nfs_client struct with matching properties.
Use the new nfs_get_client() that initializes new clients.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Separate the rpc run portion of nfs_read_rpcsetup into a new function
nfs_initiate_read that is called for normal NFS I/O.
Add a pNFS read_pagelist function that is called instead of nfs_intitate_read
for pNFS reads.
Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Mike Sager <sager@netapp.com>
Signed-off-by: Mingyang Guo <guomingyang@nrchpc.ac.cn>
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn>
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Move the pnfs_update_layout call location to nfs_pageio_do_add_request().
Grab the lseg sent in the doio function to nfs_read_rpcsetup and attach
it to each nfs_read_data so it can be sent to the layout driver.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Now nfs_get_client returns an nfs_client ready to be used no matter if it was
found or created.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cleanup nfs_read_data. We also won't use CONFIG_NFS_V4_1 for additional
NFSv4.1 fields in subsequent patches.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
In order to enable migration support, we will want to move some of the
structures that are subject to migration into the struct nfs_server.
In particular, if we are to move the state_owner and state_owner_id to
being a per-filesystem structure, then we should label the resulting
open/lock owners with a per-filesytem label to ensure global uniqueness.
This patch does so by adding the super block s_dev to the open/lock owner
name.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We shouldn't send a LAYOUTGET(openstateid) unless all outstanding RPCs
using the previous stateid are completed. This requires choosing the
stateid to encode earlier, so we can abort if one is not available (we
want to use the open stateid, but a LAYOUTGET is already out using
it), and adding a count of the number of outstanding rpc calls using
layout state (which for now consist solely of LAYOUTGETs).
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up.
The pointer returned by ->decode_dirent() is no longer used as a
pointer. The only call site (xdr_decode() in fs/nfs/dir.c) simply
extracts the errno value encoded in the pointer. Replace the
returned pointer with a standard integer errno return value.
Also, pass the "server" argument as part of the nfs_entry instead of
as a separate parameter. It's faster to derive "server" in
nfs_readdir_xdr_to_array() since we already have the directory's inode
handy. "server" ought to be invariant for a set of entries in the
same directory, right?
The legacy versions of decode_dirent() don't use "server" anyway, so
it's wasted work for them to derive and pass "server" for each entry.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Store the dirent->d_type in the struct nfs_cache_array_entry so that we
can use it in getdents() calls.
This fixes a regression with the new readdir code.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Add the ability to actually send LAYOUTGET and GETDEVICEINFO. This also adds
in the machinery to handle layout state and the deviceid cache. Note that
GETDEVICEINFO is not called directly by the generic layer. Instead it
is called by the drivers while parsing the LAYOUTGET opaque data in response
to an unknown device id embedded therein. RFC 5661 only encodes
device ids within the driver-specific opaque data.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Dean Hildebrand <dhildebz@umich.edu>
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Mike Sager <sager@netapp.com>
Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com>
Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This information will be used to determine which layout driver,
if any, to use for subsequent IO on this filesystem. Each driver
is assigned an integer id, with 0 reserved to indicate no driver.
The server can in theory return multiple ids. However, our current
client implementation only notes the first entry and ignores the
rest.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Instead of blindly zapping the caches, attempt to revalidate them if
the server has indicated that it uses high resolution timestamps.
NFSv4 should be able to always revalidate the cache since the
protocol requires the update of the change attribute on modification of
the data. In reality, there are servers (the Linux NFS server
for example) that do not obey this requirement and use ctime as the
basis for change attribute. Long term, the server needs to be fixed.
At this time, and to be on the safe side, continue zapping caches if
the server indicates that it does not have a high resolution timestamp.
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We can use vmapped pages to read more information from the network at once.
This will reduce the number of calls needed to complete a readdir.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
[trondmy: Added #include for linux/vmalloc.h> in fs/nfs/dir.c]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Convert nfs*xdr.c to use an xdr stream in decode_dirent. This will prevent a
kernel oops that has been occuring when reading a vmapped page.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Having to explicitly initialize sr_slotid to NFS4_MAX_SLOT_TABLE
resulted in numerous bugs. Keeping the current slot as a pointer
to the slot table is more straight forward and robust as it's
implicitly set up to NULL wherever the seq_res member is initialized
to zeroes.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
A synchronous rename can be interrupted by a SIGKILL. If that happens
during a sillyrename operation, it's possible for the rename call to
be sent to the server, but the task exits before processing the
reply. If this happens, the sillyrenamed file won't get cleaned up
during nfs_dentry_iput and the server is left with a dangling .nfs* file
hanging around.
Fix this problem by turning sillyrename into an asynchronous operation
and have the task doing the sillyrename just wait on the reply. If the
task is killed before the sillyrename completes, it'll still proceed
to completion.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Right now, v3 and v4 have their own variants. Create a standard struct
that will work for v3 and v4. v2 doesn't get anything but a simple error
and so isn't affected by this.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Each NFS version has its own version of the rename args container.
Standardize them on a common one that's identical to the one NFSv4
uses.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Remove all remaining references to the struct nameidata from the low level
NFS layers. Again pass down a partially initialised struct nfs_open_context
when we want to do atomic open+create.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This is needed by NFSv4.0 servers in order to keep the number of locking
stateids at a manageable level.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* 'for-2.6.33' of git://linux-nfs.org/~bfields/linux: (42 commits)
nfsd: remove pointless paths in file headers
nfsd: move most of nfsfh.h to fs/nfsd
nfsd: remove unused field rq_reffh
nfsd: enable V4ROOT exports
nfsd: make V4ROOT exports read-only
nfsd: restrict filehandles accepted in V4ROOT case
nfsd: allow exports of symlinks
nfsd: filter readdir results in V4ROOT case
nfsd: filter lookup results in V4ROOT case
nfsd4: don't continue "under" mounts in V4ROOT case
nfsd: introduce export flag for v4 pseudoroot
nfsd: let "insecure" flag vary by pseudoflavor
nfsd: new interface to advertise export features
nfsd: Move private headers to source directory
vfs: nfsctl.c un-used nfsd #includes
lockd: Remove un-used nfsd headers #includes
s390: remove un-used nfsd #includes
sparc: remove un-used nfsd #includes
parsic: remove un-used nfsd #includes
compat.c: Remove dependence on nfsd private headers
...
An header should be compilation independent, .i.e pull in
any header who's declarations are directly used by this header.
And not let users re-include all it's dependencies all over
again.
[At the end of the day what's the use of a header if it does
not have more then one user?]
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
XDR encoding and decoding for RECLAIM_COMPLETE. Implements the necessary
encoding to indicate reclaim complete for the entire client. In the future,
it can be extended to provide reclaim complete functionality for a single
file system after migration.
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
the server can indicate a number of error conditions by setting the
appropriate bits in the SEQUENCE operation. The client re-establishes
state with the server when it receives one of those, with the action
depending on the specific case.
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
reorder nfs4_sequence_args to remove 8 bytes of padding on 64 bit
builds.
The size of this structure drops to 24 bytes from 32 and reduces the
text size of nfs.ko.
On my x86_64 size reports
text data bss
2.6.32-rc5 200996 8512 432 209940 33414 nfs.ko
+patch 200884 8512 432 209828 333a4 nfs.ko
Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>