If the argument and reply together exceed the maximum payload size, then
a reply with a read-like operation can overlow the rq_pages array.
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Lease time is a part of NFSv4 state engine, which is constructed per network
namespace.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The spec requires badname, not inval, in these cases.
Some callers want us to return enoent, but I can see no justification
for that.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Our server rejects compounds containing more than one write operation.
It's unclear whether this is really permitted by the spec; with 4.0,
it's possibly OK, with 4.1 (which has clearer limits on compound
parameters), it's probably not OK. No client that we're aware of has
ever done this, but in theory it could be useful.
The source of the limitation: we need an array of iovecs to pass to the
write operation. In the worst case that array of iovecs could have
hundreds of elements (the maximum rwsize divided by the page size), so
it's too big to put on the stack, or in each compound op. So we instead
keep a single such array in the compound argument.
We fill in that array at the time we decode the xdr operation.
But we decode every op in the compound before executing any of them. So
once we've used that array we can't decode another write.
If we instead delay filling in that array till the time we actually
perform the write, we can reuse it.
Another option might be to switch to decoding compound ops one at a
time. I considered doing that, but it has a number of other side
effects, and I'd rather fix just this one problem for now.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The comment here is totally bogus:
- OP_WRITE + 1 is RELEASE_LOCKOWNER. Maybe there was some older
version of the spec in which that served as a sort of
OP_ILLEGAL? No idea, but it's clearly wrong now.
- In any case, I can't see that the spec says anything about
what to do if the client sends us less ops than promised.
It's clearly nutty client behavior, and we should do
whatever's easiest: returning an xdr error (even though it
won't be consistent with the error on the last op returned)
seems fine to me.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
For now this only adds support for AUTH_NULL. (Previously we assumed
AUTH_UNIX.) We'll also need AUTH_GSS, which is trickier.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Actually, xfs and jfs can optionally be case insensitive; we'll handle
that case in later patches.
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Pull the rest of the nfsd commits from Bruce Fields:
"... and then I cherry-picked the remainder of the patches from the
head of my previous branch"
This is the rest of the original nfsd branch, rebased without the
delegation stuff that I thought really needed to be redone.
I don't like rebasing things like this in general, but in this situation
this was the lesser of two evils.
* 'for-3.5' of git://linux-nfs.org/~bfields/linux: (50 commits)
nfsd4: fix, consolidate client_has_state
nfsd4: don't remove rebooted client record until confirmation
nfsd4: remove some dprintk's and a comment
nfsd4: return "real" sequence id in confirmed case
nfsd4: fix exchange_id to return confirm flag
nfsd4: clarify that renewing expired client is a bug
nfsd4: simpler ordering of setclientid_confirm checks
nfsd4: setclientid: remove pointless assignment
nfsd4: fix error return in non-matching-creds case
nfsd4: fix setclientid_confirm same_cred check
nfsd4: merge 3 setclientid cases to 2
nfsd4: pull out common code from setclientid cases
nfsd4: merge last two setclientid cases
nfsd4: setclientid/confirm comment cleanup
nfsd4: setclientid remove unnecessary terms from a logical expression
nfsd4: move rq_flavor into svc_cred
nfsd4: stricter cred comparison for setclientid/exchange_id
nfsd4: move principal name into svc_cred
nfsd4: allow removing clients not holding state
nfsd4: rearrange exchange_id logic to simplify
...
The fs_location->hosts list is split on colons, but this doesn't work when
IPv6 addresses are used (they contain colons).
This patch adds the function nfsd4_encode_components_esc() to
allow the caller to specify escape characters when splitting on 'sep'.
In order to fix referrals, this patch must be used with the mountd patch
that similarly fixes IPv6 [] escaping.
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Though actually this doesn't matter much, as NFSv4.0 clients are
required to treat the change attribute as opaque.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
->ts_id_status gets nfs errno, i.e. it's already big-endian; no need
to apply htonl() to it. Broken by commit 174568 (NFSD: Added TEST_STATEID
operation) last year...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Clean up due to code review.
The nfs4_verifier's data field is not guaranteed to be u32-aligned.
Casting an array of chars to a u32 * is considered generally
hazardous.
We can fix most of this by using a __be32 array to generate the
verifier's contents and then byte-copying it into the verifier field.
However, there is one spot where there is a backwards compatibility
constraint: the do_nfsd_create() call expects a verifier which is
32-bit aligned. Fix this spot by forcing the alignment of the create
verifier in the nfsd4_open args structure.
Also, sizeof(nfs4_verifer) is the size of the in-core verifier data
structure, but NFS4_VERIFIER_SIZE is the number of octets in an XDR'd
verifier. The two are not interchangeable, even if they happen to
have the same value.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Respect client request for not getting a delegation in NFSv4.1
Appropriately return delegation "type" NFS4_OPEN_DELEGATE_NONE_EXT
and WND4_NOT_WANTED reason.
[nfsd41: add missing break when encoding op_why_no_deleg]
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
When I initially wrote it, I didn't understand how lists worked so I
wrote something that didn't use them. I think making a list of stateids
to test is a more straightforward implementation, especially compared to
especially compared to decoding stateids while simultaneously encoding
a reply to the client.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Combine two booleans into a single flag field, move the smaller fields
to the end.
(In practice this doesn't make the struct any smaller. But we'll be
adding another flag here soon.)
Remove some debugging code that doesn't look useful, while we're in the
neighborhood.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The semantic patch that makes this change is available
in scripts/coccinelle/api/memdup.cocci.
Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
According to rfc5661 18.50, implement DESTROY_CLIENTID operation.
Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
RFC5661 says:
The client may set one or both of
OPEN4_SHARE_ACCESS_WANT_SIGNAL_DELEG_WHEN_RESRC_AVAIL and
OPEN4_SHARE_ACCESS_WANT_PUSH_DELEG_WHEN_UNCONTENDED.
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This should be a bitwise negate here. It silences a Sparse warning:
fs/nfsd/nfs4xdr.c:693:16: warning: dubious: x & !y
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
I'd rather put more of these sorts of checks into standardized xdr
decoders for the various types rather than have them cluttering up the
core logic in nfs4proc.c and nfs4state.c.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Use a separate stateid idr per client, and lookup a stateid by first
finding the client, then looking up the stateid relative to that client.
Also some minor refactoring.
This allows us to improve error returns: we can return expired when the
clientid is not found and bad_stateid when the clientid is found but not
the stateid, as opposed to returning expired for both cases.
I hope this will also help to replace the state lock mostly by a
per-client lock, but that hasn't been done yet.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Test_stateid is 4.1-only and only allowed after a sequence operation, so
this check is unnecessary.
Cc: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Keep around an unhashed copy of the final stateid after the last close
using an openowner, and when identifying a replay, match against that
stateid instead of just against the open owner id. Free it the next
time the seqid is bumped or the stateowner is destroyed.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
For checking the size of reply before calling a operation,
we need try to get maxsize of the operation's reply.
v3: using new method as Bruce said,
"we could handle operations in two different ways:
- For operations that actually change something (write, rename,
open, close, ...), do it the way we're doing it now: be
very careful to estimate the size of the response before even
processing the operation.
- For operations that don't change anything (read, getattr, ...)
just go ahead and do the operation. If you realize after the
fact that the response is too large, then return the error at
that point.
So we'd add another flag to op_flags: say, OP_MODIFIES_SOMETHING. And for
operations with OP_MODIFIES_SOMETHING set, we'd do the first thing. For
operations without it set, we'd do the second."
Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
[bfields@redhat.com: crash, don't attempt to handle, undefined op_rsize_bop]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The current code is sort of hackish in that it assumes a referral is always
matched to an export. When we add support for junctions that may not be the
case.
We can replace nfsd4_path() with a function that encodes the components
directly from the dentries. Since nfsd4_path is currently the only user of
the 'ex_pathname' field in struct svc_export, this has the added benefit
of allowing us to get rid of that.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The stateowner has some fields that only make sense for openowners, and
some that only make sense for lockowners, and I find it a lot clearer if
those are separated out.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Now that the replay owner is in the cstate we can remove it from a lot
of other individual operations and further simplify
nfs4_preprocess_seqid_op().
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Set the stateowner associated with a replay in one spot in
nfs4_preprocess_seqid_op() and keep it in cstate. This allows removing
a few lines of boilerplate from all the nfs4_preprocess_seqid_op()
callers.
Also turn ENCODE_SEQID_OP_TAIL into a function while we're here.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The values here represent highest slotid numbers. Since slotid's are
numbered starting from zero, the highest should be one less than the
number of slots.
Reported-by: Rick Macklem <rmacklem@uoguelph.ca>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The nfsd4 code has a bunch of special exceptions for error returns which
map nfserr_symlink to other errors.
In fact, the spec makes it clear that nfserr_symlink is to be preferred
over less specific errors where possible.
The patch that introduced it back in 2.6.4 is "kNFSd: correct symlink
related error returns.", which claims that these special exceptions are
represent an NFSv4 break from v2/v3 tradition--when in fact the symlink
error was introduced with v4.
I suspect what happened was pynfs tests were written that were overly
faithful to the (known-incomplete) rfc3530 error return lists, and then
code was fixed up mindlessly to make the tests pass.
Delete these unnecessary exceptions.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>