Commit Graph

52 Commits

Author SHA1 Message Date
Linus Torvalds 8f03cf50bc NFS client updates for Linux 4.11
Stable bugfixes:
 - NFSv4: Fix memory and state leak in _nfs4_open_and_get_state
 - xprtrdma: Fix Read chunk padding
 - xprtrdma: Per-connection pad optimization
 - xprtrdma: Disable pad optimization by default
 - xprtrdma: Reduce required number of send SGEs
 - nlm: Ensure callback code also checks that the files match
 - pNFS/flexfiles: If the layout is invalid, it must be updated before retrying
 - NFSv4: Fix reboot recovery in copy offload
 - Revert "NFSv4.1: Handle NFS4ERR_BADSESSION/NFS4ERR_DEADSESSION replies to OP_SEQUENCE"
 - NFSv4: fix getacl head length estimation
 - NFSv4: fix getacl ERANGE for sum ACL buffer sizes
 
 Features:
 - Add and use dprintk_cont macros
 - Various cleanups to NFS v4.x to reduce code duplication and complexity
 - Remove unused cr_magic related code
 - Improvements to sunrpc "read from buffer" code
 - Clean up sunrpc timeout code and allow changing TCP timeout parameters
 - Remove duplicate mw_list management code in xprtrdma
 - Add generic functions for encoding and decoding xdr streams
 
 Bugfixes:
 - Clean up nfs_show_mountd_netid
 - Make layoutreturn_ops static and use NULL instead of 0 to fix sparse warnings
 - Properly handle -ERESTARTSYS in nfs_rename()
 - Check if register_shrinker() failed during rpcauth_init()
 - Properly clean up procfs/pipefs entries
 - Various NFS over RDMA related fixes
 - Silence unititialized variable warning in sunrpc
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAli3F7YACgkQ18tUv7Cl
 QOvzrQ//dL+nnBaqsm9bA2wwuVJSQ2R1zdkwHOCWghEWROZrQHzpi0VHu0ZKBLzr
 YsYFhHvIPax9Q8USY4B/QFQ3eUuZILEVn+xDruRxZaJPnsA4Zmr16VJwGF2F68Lh
 CGekA5qybqy8lAG6v96Gyjbi+JqjHNCmelYWRv7SX9IZcDjNJpsEbrSI4LkabTWh
 70WtCl3LBzVMRYRxe8+f0mcx4g4XCQ8pDaQRgRnfKtNeQk/+PgWz66xSNinDakVb
 A8AkaiUadPRgUTpap6HfBSicpRvtLQeLhARC0E4YE5pXp2H/kUt2MFe5szblfSCv
 zf2nrPUbNEHjBypFhERzCZZk6EonY6FeOojyW0g2C+rmPdK7WLlKbwTQFxdRGvsx
 78fIiPRdlDHDp9CXzD8V4xxRBJX/KkicA1Vp8CoyQtmpzpu2fjwT0kr9HeD+aEe6
 293+72QUfk05re2HYWF9MCGGVVLdnLLjrKCgwwRQ0HX5WF6GNQxX/yVgBVlqFeV3
 xc8m7ltKco5N9JxIqwlIpySq2e114EQOqsmHYz3gxd7ID9J1NJz+9H2z2EvgAKZ7
 wIPSLoZrdBdnoXG8ZDDTAvPKeB8l6egi6wjrvGKxewVlMbjzogdARsMKWoifnCfG
 HMkH+IEvLGvFc1pPeLbscJGEdVWXVn0thO+8fkS9F9sE/zMX9PA=
 =01DU
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-4.11-1' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client updates from Anna Schumaker:
 "Highlights include:

  Stable bugfixes:
   - NFSv4: Fix memory and state leak in _nfs4_open_and_get_state
   - xprtrdma: Fix Read chunk padding
   - xprtrdma: Per-connection pad optimization
   - xprtrdma: Disable pad optimization by default
   - xprtrdma: Reduce required number of send SGEs
   - nlm: Ensure callback code also checks that the files match
   - pNFS/flexfiles: If the layout is invalid, it must be updated before
     retrying
   - NFSv4: Fix reboot recovery in copy offload
   - Revert "NFSv4.1: Handle NFS4ERR_BADSESSION/NFS4ERR_DEADSESSION
     replies to OP_SEQUENCE"
   - NFSv4: fix getacl head length estimation
   - NFSv4: fix getacl ERANGE for sum ACL buffer sizes

  Features:
   - Add and use dprintk_cont macros
   - Various cleanups to NFS v4.x to reduce code duplication and
     complexity
   - Remove unused cr_magic related code
   - Improvements to sunrpc "read from buffer" code
   - Clean up sunrpc timeout code and allow changing TCP timeout
     parameters
   - Remove duplicate mw_list management code in xprtrdma
   - Add generic functions for encoding and decoding xdr streams

  Bugfixes:
   - Clean up nfs_show_mountd_netid
   - Make layoutreturn_ops static and use NULL instead of 0 to fix
     sparse warnings
   - Properly handle -ERESTARTSYS in nfs_rename()
   - Check if register_shrinker() failed during rpcauth_init()
   - Properly clean up procfs/pipefs entries
   - Various NFS over RDMA related fixes
   - Silence unititialized variable warning in sunrpc"

* tag 'nfs-for-4.11-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (64 commits)
  NFSv4: fix getacl ERANGE for some ACL buffer sizes
  NFSv4: fix getacl head length estimation
  Revert "NFSv4.1: Handle NFS4ERR_BADSESSION/NFS4ERR_DEADSESSION replies to OP_SEQUENCE"
  NFSv4: Fix reboot recovery in copy offload
  pNFS/flexfiles: If the layout is invalid, it must be updated before retrying
  NFSv4: Clean up owner/group attribute decode
  SUNRPC: Add a helper function xdr_stream_decode_string_dup()
  NFSv4: Remove bogus "struct nfs_client" argument from decode_ace()
  NFSv4: Fix the underestimation of delegation XDR space reservation
  NFSv4: Replace callback string decode function with a generic
  NFSv4: Replace the open coded decode_opaque_inline() with the new generic
  NFSv4: Replace ad-hoc xdr encode/decode helpers with xdr_stream_* generics
  SUNRPC: Add generic helpers for xdr_stream encode/decode
  sunrpc: silence uninitialized variable warning
  nlm: Ensure callback code also checks that the files match
  sunrpc: Allow xprt->ops->timer method to sleep
  xprtrdma: Refactor management of mw_list field
  xprtrdma: Handle stale connection rejection
  xprtrdma: Properly recover FRWRs with in-flight FASTREG WRs
  xprtrdma: Shrink send SGEs array
  ...
2017-03-01 16:10:30 -08:00
Linus Torvalds 8313064c2e The nfsd update this round is mainly a lot of miscellaneous cleanups and
bugfixes.
 
 A couple changes could theoretically break working setups on upgrade.  I
 don't expect complaints in practice, but they seem worth calling out
 just in case:
 
 	- NFS security labels are now off by default; a new
 	  security_label export flag reenables it per export.  But,
 	  having them on by default is a disaster, as it generally only
 	  makes sense if all your clients and servers have similar
 	  enough selinux policies.  Thanks to Jason Tibbitts for
 	  pointing this out.
 
 	- NFSv4/UDP support is off.  It was never really supported, and
 	  the spec explicitly forbids it.  We only ever left it on out
 	  of laziness; thanks to Jeff Layton for finally fixing that.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJYtejbAAoJECebzXlCjuG+JhEQAK3YTYYNrPY26Pfiu0FghLFV
 4qOHK4DOkJzrWIom5uWyBo7yOwH6WnQtTe/gCx/voOEW3lsJO7F3IfTnTVp+Smp6
 GJeVtsr1vI9EBnwhMlyoJ5hZ2Ju5kX3MBVnew6+momt6620ZO7a+EtT+74ePaY8Y
 jxLzWVA1UqbWYoMabNQpqgKypKvNrhwst72iYyBhNuL/qtGeBDQWwcrA+TFeE9tv
 Ad7qB53xL1mr0Wn1CNIOR/IzVAj4o2H0vdjqrPjAdvbfwf8YYLNpJXt5k591wx/j
 1TpiWIPqnwLjMT3X5NkQN3agZKeD+2ZWPrClr35TgRRe62CK6JblK9/Wwc5BRSzV
 paMP3hOm6/dQOBA5C+mqPaHdEI8VqcHyZpxU4VC/ttsVEGTgaLhGwHGzSn+5lYiM
 Qx9Sh50yFV3oiBW/sb/y8lBDwYm/Cq0OyqAU277idbdzjcFerMg1qt06tjEQhYMY
 K2V7rS8NuADUF6F1BwOONZzvg7Rr7iWHmLh+iSM9TeQoEmz2jIHSaLIFaYOET5Jr
 PIZS3rOYoa0FaKOYVYnZMC74n/LqP/Aou8B+1rRcLy5YEdUIIIVpFvqpg1nGv6PI
 sA3zx/f13IRte3g0CuQiY0/2cx7uXk/gXJ7s5+ejEzljF/aYWiomx3mr6HqPQETn
 CWEtXlfyJCyX+A8hbO+U
 =iLLz
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-4.11' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
 "The nfsd update this round is mainly a lot of miscellaneous cleanups
  and bugfixes.

  A couple changes could theoretically break working setups on upgrade.
  I don't expect complaints in practice, but they seem worth calling out
  just in case:

   - NFS security labels are now off by default; a new security_label
     export flag reenables it per export. But, having them on by default
     is a disaster, as it generally only makes sense if all your clients
     and servers have similar enough selinux policies. Thanks to Jason
     Tibbitts for pointing this out.

   - NFSv4/UDP support is off. It was never really supported, and the
     spec explicitly forbids it. We only ever left it on out of
     laziness; thanks to Jeff Layton for finally fixing that"

* tag 'nfsd-4.11' of git://linux-nfs.org/~bfields/linux: (34 commits)
  nfsd: Fix display of the version string
  nfsd: fix configuration of supported minor versions
  sunrpc: don't register UDP port with rpcbind when version needs congestion control
  nfs/nfsd/sunrpc: enforce transport requirements for NFSv4
  sunrpc: flag transports as having congestion control
  sunrpc: turn bitfield flags in svc_version into bools
  nfsd: remove superfluous KERN_INFO
  nfsd: special case truncates some more
  nfsd: minor nfsd_setattr cleanup
  NFSD: Reserve adequate space for LOCKT operation
  NFSD: Get response size before operation for all RPCs
  nfsd/callback: Drop a useless data copy when comparing sessionid
  nfsd/callback: skip the callback tag
  nfsd/callback: Cleanup callback cred on shutdown
  nfsd/idmap: return nfserr_inval for 0-length names
  SUNRPC/Cache: Always treat the invalid cache as unexpired
  SUNRPC: Drop all entries from cache_detail when cache_purge()
  svcrdma: Poll CQs in "workqueue" mode
  svcrdma: Combine list fields in struct svc_rdma_op_ctxt
  svcrdma: Remove unused sc_dto_q field
  ...
2017-02-28 15:39:09 -08:00
Kinglong Mee 863d7d9c2e sunrpc/nfs: cleanup procfs/pipefs entry in cache_detail
Record flush/channel/content entries is useless, remove them.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-02-08 17:02:45 -05:00
Kinglong Mee d6fc8821c2 SUNRPC/Cache: Always treat the invalid cache as unexpired
When the first time pynfs runs after rpc/nfsd startup, always get the warning,

"Got error: Connection closed"

I found the problem is caused by,
1. A new startup of nfsd, rpc.mountd, etc,
2. A rpc request from client (pynfs test, or normal mounting),
3. An ip_map cache is created but invalid, so upcall to rpc.mountd,
4. rpc.mountd process the ip_map upcall, before write the valid data to nfsd,
   do auth_reload(), and check_useipaddr(),
5. For the first time, old_use_ipaddr = -1, it causes rpc.mountd do write_flush    that doing cache_clean,
6. The ip_map cache will be treat as expired and clean,
7. When rpc.mountd write the valid data to nfsd, a new ip_map is created
   and updated, the cache_check of old ip_map(doing the upcall) will
   return -ETIMEDOUT.
8. RPC layer return SVC_CLOSE and close the xprt after commit 4d712ef1db
   "svcauth_gss: Close connection when dropping an incoming message"

NeilBrown suggest in another email,

"If CACHE_VALID is not set, then there is no data in the cache item,
 so there is nothing to expire. So it would be nice if cache items that
 don't have CACHE_VALID are never treated as expired."

v3, change the order of the two patches
v2, change the checking of CACHE_PENDING to CACHE_VALID

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-02-08 16:49:53 -05:00
Neil Brown 2b477c00f3 svcrpc: free contexts immediately on PROC_DESTROY
We currently handle a client PROC_DESTROY request by turning it
CACHE_NEGATIVE, setting the expired time to now, and then waiting for
cache_clean to clean it up later.  Since we forgot to set the cache's
nextcheck value, that could take up to 30 minutes.  Also, though there's
probably no real bug in this case, setting CACHE_NEGATIVE directly like
this probably isn't a great idea in general.

So let's just remove the entry from the cache directly, and move this
bit of cache manipulation to a helper function.

Signed-off-by: Neil Brown <neilb@suse.com>
Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-01-31 12:31:53 -05:00
Peter Zijlstra 2c935bc572 locking/atomic, kref: Add kref_read()
Since we need to change the implementation, stop exposing internals.

Provide kref_read() to read the current reference count; typically
used for debug messages.

Kills two anti-patterns:

	atomic_read(&kref->refcount)
	kref->refcount.counter

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-14 11:37:18 +01:00
NeilBrown d8d29138b1 sunrpc: remove 'inuse' flag from struct cache_detail.
This field is not currently in use.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-07-13 15:32:47 -04:00
Neil Brown 778620364e sunrpc/cache: make cache flushing more reliable.
The caches used to store sunrpc authentication information can be
flushed by writing a timestamp to a file in /proc.

This timestamp has a one-second resolution and any entry in cache that
was last_refreshed *before* that time is treated as expired.

This is problematic as it is not possible to reliably flush the cache
without interrupting NFS service.
If the current time is written to the "flush" file, any entry that was
added since the current second started will still be treated as valid.
If one second beyond than the current time is written to the file
then no entries can be valid until the second ticks over.  This will
mean that no NFS request will be handled for up to 1 second.

To resolve this issue we make two changes:

1/ treat an entry as expired if the timestamp when it was last_refreshed
  is before *or the same as* the expiry time.  This means that current
  code which writes out the current time will now flush the cache
  reliably.

2/ when a new entry in added to the cache -  set the last_refresh timestamp
  to 1 second *beyond* the current flush time, when that not in the
  past.
  This ensures that newly added entries will always be valid.

Now that we have a very reliable way to flush the cache, and also
since we are using "since-boot" timestamps which are monotonic,
change cache_purge() to set the smallest future flush_time which
will work, and leave it there: don't revert to '1'.

Also disable the setting of the 'flush_time' far into the future.
That has never been useful and is now awkward as it would cause
last_refresh times to be strange.
Finally: if a request is made to set the 'flush_time' to the current
second, assume the intent is to flush the cache and advance it, if
necessary, to 1 second beyond the current 'flush_time' so that all
active entries will be deemed to be expired.

As part of this we need to add a 'cache_detail' arg to cache_init()
and cache_fresh_locked() so they can find the current ->flush_time.

Signed-off-by: NeilBrown <neilb@suse.com>
Reported-by: Olaf Kirch <okir@suse.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-10-23 15:57:30 -04:00
Kinglong Mee 129e5824cd sunrpc: Switch to using hash list instead single list
Switch using list_head for cache_head in cache_detail,
it is useful of remove an cache_head entry directly from cache_detail.

v8, using hash list, not head list

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-13 08:59:03 -04:00
Kinglong Mee c8c081b70c sunrpc/nfsd: Remove redundant code by exports seq_operations functions
Nfsd has implement a site of seq_operations functions as sunrpc's cache.
Just exports sunrpc's codes, and remove nfsd's redundant codes.

v8, same as v6

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-13 08:59:02 -04:00
Harshula Jayasuriya 2f74f972d4 sunrpc: prepare NFS for 2038
1) The kernel sunrpc code needs to handle seconds since epoch
greater than 2147483647. This means functions that parse time
as an int need to handle it as time_t.

2) The kernel changes must be accompanied by userspace changes
in nfs-utils.

Signed-off-by: Harshula Jayasuriya <harshula@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-08-19 09:55:01 -04:00
NeilBrown 7715cde868 net/sunrpc: xpt_auth_cache should be ignored when expired.
commit d202cce896
    sunrpc: never return expired entries in sunrpc_cache_lookup

moved the 'entry is expired' test from cache_check to
sunrpc_cache_lookup, so that it happened early and some races could
safely be ignored.

However the ip_map (in svcauth_unix.c) has a separate single-item
cache which allows quick lookup without locking.  An entry in this
case would not be subject to the expiry test and so could be used
well after it has expired.

This is not normally a big problem because the first time it is used
after it is expired an up-call will be scheduled to refresh the entry
(if it hasn't been scheduled already) and the old entry will then
be invalidated.  So on the second attempt to use it after it has
expired, ip_map_cached_get will discard it.

However that is subtle and not ideal, so replace the "!cache_valid"
test with "cache_is_expired".
In doing this we drop the test on the "CACHE_VALID" bit.  This is
unnecessary as the bit is never cleared, and an entry will only
be cached if the bit is set.

Reported-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:53:28 -04:00
NeilBrown 013920eb5d sunrpc/cache: ensure items removed from cache do not have pending upcalls.
It is possible for a race to set CACHE_PENDING after cache_clean()
has removed a cache entry from the cache.
If CACHE_PENDING is still set when the entry is finally 'put',
the cache_dequeue() will never happen and we can leak memory.

So set a new flag 'CACHE_CLEANED' when we remove something from
the cache, and don't queue any upcall if it is set.

If CACHE_PENDING is set before CACHE_CLEANED, the call that
cache_clean() makes to cache_fresh_unlocked() will free memory
as needed.  If CACHE_PENDING is set after CACHE_CLEANED, the
test in sunrpc_cache_pipe_upcall will ensure that the memory
is not allocated.

Reported-by: <bstroesser@ts.fujitsu.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:53:28 -04:00
Stanislav Kinsbursky 21cd1254d3 SUNRPC: remove "cache_request" argument in sunrpc_cache_pipe_upcall() function
Passing this pointer is redundant since it's stored on cache_detail structure,
which is also passed to sunrpc_cache_pipe_upcall () function.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-15 10:43:47 -05:00
Stanislav Kinsbursky 73fb847a44 SUNRPC: introduce cache_detail->cache_request callback
This callback will allow to simplify upcalls in further patches in this
series.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-15 10:43:45 -05:00
J. Bruce Fields 621eb19ce1 svcrpc: Revert "sunrpc/cache.h: replace simple_strtoul"
Commit bbf43dc888 "sunrpc/cache.h: replace
simple_strtoul" introduced new range-checking which could cause get_int
to fail on unsigned integers too large to be represented as an int.

We could parse them as unsigned instead--but it turns out svcgssd is
actually passing down "-1" in some cases.  Which is perhaps stupid, but
there's nothing we can do about it now.

So just revert back to the previous "sloppy" behavior that accepts
either representation.

Cc: stable@vger.kernel.org
Reported-by: Sven Geggus <lists@fuchsschwanzdomain.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15 07:40:32 -05:00
J. Bruce Fields a007c4c3e9 nfsd: add get_uint for u32's
I don't think there's a practical difference for the range of values
these interfaces should see, but it would be safer to be unambiguous.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-07-25 09:18:27 -04:00
Eldad Zack bbf43dc888 sunrpc/cache.h: replace simple_strtoul
This patch replaces the usage of simple_strtoul with kstrtoint in
get_int(), since the simple_str* family doesn't account for overflow
and is deprecated.
Also, in this specific case, the long from strtol is silently converted
to an int by the caller.

As Joe Perches <joe@perches.com> suggested, this patch also removes
the redundant temporary variable rv, since kstrtoint() will not write to
anint unless it's successful.

Cc: Joe Perches <joe@perches.com>
Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-07-11 16:08:00 -04:00
Eldad Zack d9c2ede63c sunrpc/cache.h: fix coding style
Neaten code style in get_int().
Also use sizeof() instead of hard coded number as suggested by
Joe Perches <joe@perches.com>.

Cc: Joe Perches <joe@perches.com>
Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-07-11 16:07:59 -04:00
Stanislav Kinsbursky 2c5f846747 SUNRPC: generic cache register routines removed
All cache users now uses network-namespace-aware routines, so generic ones
are obsolete.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>
2012-01-31 19:28:16 -05:00
Stanislav Kinsbursky 0a402d5a65 SUNRPC: cache creation and destruction routines introduced
This patch prepares infrastructure for network namespace aware cache detail
allocation.
One note about adding network namespace link to cache structure. It's going to
be used later in NFS DNS cache parsing routine (nfs_dns_parse for rpc_pton()
call).

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>
2012-01-31 19:28:14 -05:00
Stanislav Kinsbursky 820f9442e7 SUNRPC: split cache creation and PipeFS registration
This precursor patch splits SUNRPC cache creation and PipeFS registartion.
It's required for latter split of NFS DNS resolver cache creation per network
namespace context and PipeFS registration/unregistration on MOUNT/UMOUNT
events.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 18:20:26 -05:00
Al Viro 64f1426f3c sunrpc: propagate umode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:55:04 -05:00
Arun Sharma 60063497a9 atomic: use <linux/atomic.h>
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:47 -07:00
NeilBrown 49b28684fd nfsd: Remove deprecated nfsctl system call and related code.
As promised in feature-removal-schedule.txt it is time to
remove the nfsctl system call.

Userspace has perferred to not use this call throughout 2.6 and it has been
excluded in the default configuration since 2.6.36 (9 months ago).

So this patch removes all the code that was being compiled out.

There are still references to sys_nfsctl in various arch systemcall tables
and related code.  These should be cleaned out too, probably in the next
merge window.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-07-15 18:58:42 -04:00
Lucas De Marchi 25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Linus Torvalds 18bce371ae Merge branch 'for-2.6.38' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.38' of git://linux-nfs.org/~bfields/linux: (62 commits)
  nfsd4: fix callback restarting
  nfsd: break lease on unlink, link, and rename
  nfsd4: break lease on nfsd setattr
  nfsd: don't support msnfs export option
  nfsd4: initialize cb_per_client
  nfsd4: allow restarting callbacks
  nfsd4: simplify nfsd4_cb_prepare
  nfsd4: give out delegations more quickly in 4.1 case
  nfsd4: add helper function to run callbacks
  nfsd4: make sure sequence flags are set after destroy_session
  nfsd4: re-probe callback on connection loss
  nfsd4: set sequence flag when backchannel is down
  nfsd4: keep finer-grained callback status
  rpc: allow xprt_class->setup to return a preexisting xprt
  rpc: keep backchannel xprt as long as server connection
  rpc: move sk_bc_xprt to svc_xprt
  nfsd4: allow backchannel recovery
  nfsd4: support BIND_CONN_TO_SESSION
  nfsd4: modify session list under cl_lock
  Documentation: fl_mylease no longer exists
  ...

Fix up conflicts in fs/nfsd/vfs.c with the vfs-scale work.  The
vfs-scale work touched some msnfs cases, and this merge removes support
for that entirely, so the conflict was trivial to resolve.
2011-01-14 13:17:26 -08:00
Alexey Dobriyan 57cc7215b7 headers: kobject.h redux
Remove kobject.h from files which don't need it, notably,
sched.h and fs.h.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-10 08:51:44 -08:00
J. Bruce Fields bdd5f05d91 SUNRPC: Remove more code when NFSD_DEPRECATED is not configured
Signed-off-by: NeilBrown <neilb@suse.de>
[bfields@redhat.com: moved svcauth_unix_purge outside ifdef's.]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-01-04 16:48:02 -05:00
Pavel Emelyanov 593ce16b94 sunrpc: Add routines that allow registering per-net caches
Existing calls do the same, but for the init_net.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-27 10:16:11 -04:00
NeilBrown 1117449276 sunrpc/cache: change deferred-request hash table to use hlist.
Being a hash table, hlist is the best option.

There is currently some ugliness were we treat "->next == NULL" as
a special case to avoid having to initialise the whole array.
This change nicely gets rid of that case.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-21 21:51:11 -04:00
NeilBrown f16b6e8d83 sunrpc/cache: allow threads to block while waiting for cache update.
The current practice of waiting for cache updates by queueing the
whole request to be retried has (at least) two problems.

1/ With NFSv4, requests can be quite complex and re-trying a whole
  request when a later part fails should only be a last-resort, not a
  normal practice.

2/ Large requests, and in particular any 'write' request, will not be
  queued by the current code and doing so would be undesirable.

In many cases only a very sort wait is needed before the cache gets
valid data.

So, providing the underlying transport permits it by setting
 ->thread_wait,
arrange to wait briefly for an upcall to be completed (as reflected in
the clearing of CACHE_PENDING).
If the short wait was not long enough and CACHE_PENDING is still set,
fall back on the old approach.

The 'thread_wait' value is set to 5 seconds when there are spare
threads, and 1 second when there are no spare threads.

These values are probably much higher than needed, but will ensure
some forward progress.

Note that as we only request an update for a non-valid item, and as
non-valid items are updated in place it is extremely unlikely that
cache_check will return -ETIMEDOUT.  Normally cache_defer_req will
sleep for a short while and then find that the item is_valid.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-07 19:22:07 -04:00
NeilBrown c5b29f885a sunrpc: use seconds since boot in expiry cache
This protects us from confusion when the wallclock time changes.

We convert to and from wallclock when  setting or reading expiry
times.

Also use seconds since boot for last_clost time.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-07 19:21:20 -04:00
NeilBrown 17cebf658e sunrpc: extract some common sunrpc_cache code from nfsd
Rather can duplicating this idiom twice, put it in an inline function.
This reduces the usage of 'expiry_time' out side the sunrpc/cache.c
code and thus the impact of a change that is about to be made to that
field.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-07 19:21:19 -04:00
Artem Bityutskiy 8eab945c56 sunrpc: make the cache cleaner workqueue deferrable
This patch makes the cache_cleaner workqueue deferrable, to prevent
unnecessary system wake-ups, which is very important for embedded
battery-powered devices.

do_cache_clean() is called every 30 seconds at the moment, and often
makes the system wake up from its power-save sleep state. With this
change, when the workqueue uses a deferrable timer, the
do_cache_clean() invocation will be delayed and combined with the
closest "real" wake-up. This improves the power consumption situation.

Note, I tried to create a DECLARE_DELAYED_WORK_DEFERRABLE() helper
macro, similar to DECLARE_DELAYED_WORK(), but failed because of the
way the timer wheel core stores the deferrable flag (it is the
LSBit in the time->base pointer). My attempt to define a static
variable with this bit set ended up with the "initializer element is
not constant" error.

Thus, I have to use run-time initialization, so I created a new
cache_initialize() function which is called once when sunrpc is
being initialized.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-07-06 12:27:48 -04:00
Trond Myklebust 8854e82d9a SUNRPC: Add an rpc_pipefs front end for the sunrpc cache code
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09 15:14:30 -04:00
Trond Myklebust 173912a6ad SUNRPC: Move procfs-specific stuff out of the generic sunrpc cache code
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09 15:14:29 -04:00
Trond Myklebust bc74b4f5e6 SUNRPC: Allow the cache_detail to specify alternative upcall mechanisms
For events that are rare, such as referral DNS lookups, it makes limited
sense to have a daemon constantly listening for upcalls on a channel. An
alternative in those cases might simply be to run the app that fills the
cache using call_usermodehelper_exec() and friends.

The following patch allows the cache_detail to specify alternative upcall
mechanisms for these particular cases.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09 15:14:29 -04:00
Trond Myklebust 2da8ca26c6 NFSD: Clean up the idmapper warning...
What part of 'internal use' is so hard to understand?

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09 15:14:26 -04:00
J. Bruce Fields 67eb6ff610 svcrpc: move unused field from cache_deferred_req
This field is set once and never used; probably some artifact of an
earlier implementation idea.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-04-23 16:13:37 -04:00
J. Bruce Fields dbf847ecb6 knfsd: allow cache_register to return error on failure
Newer server features such as nfsv4 and gss depend on proc to work, so a
failure to initialize the proc files they need should be treated as
fatal.

Thanks to Andrew Morton for style fix and compile fix in case where
CONFIG_NFSD_V4 is undefined.

Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:05 -05:00
J. Bruce Fields df95a9d4fb knfsd: cache unregistration needn't return error
There's really nothing much the caller can do if cache unregistration
fails.  And indeed, all any caller does in this case is print an error
and continue.  So just return void and move the printk's inside
cache_unregister.

Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:04 -05:00
J. Bruce Fields dca1dd30ce nfsd: remove unused cache_for_each macro
This macro is unused.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Acked-by:  Neil Brown <neilb@suse.de>
2007-10-09 18:31:55 -04:00
Greg Banks 7b2b1fee30 [PATCH] knfsd: knfsd: cache ipmap per TCP socket
Speed up high call-rate workloads by caching the struct ip_map for the peer on
the connected struct svc_sock instead of looking it up in the ip_map cache
hashtable on every call.  This helps workloads using AUTH_SYS authentication
over TCP.

Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients, each with 16
synthetic client threads simulating an rsync (i.e.  recursive directory
listing) workload reading from an i386 RH9 install image (161480 regular files
in 10841 directories) on the server.  That tree is small enough to fill in the
server's RAM so no disk traffic was involved.  This setup gives a sustained
call rate in excess of 60000 calls/sec before being CPU-bound on the server.

Profiling showed strcmp(), called from ip_map_match(), was taking 4.8% of each
CPU, and ip_map_lookup() was taking 2.9%.  This patch drops both contribution
into the profile noise.

Note that the above result overstates this value of this patch for most
workloads.  The synthetic clients are all using separate IP addresses, so
there are 64 entries in the ip_map cache hash.  Because the kernel measured
contained the bug fixed in commit

commit 1f1e030bf7

and was running on 64bit little-endian machine, probably all of those 64
entries were on a single chain, thus increasing the cost of ip_map_lookup().

With a modern kernel you would need more clients to see the same amount of
performance improvement.  This patch has helped to scale knfsd to handle a
deployment with 2000 NFS clients.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:16 -07:00
Adrian Bunk 74cae61ab4 [PATCH] fs/nfsd/export.c,net/sunrpc/cache.c: make needlessly global code static
We can now make some code static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:43 -08:00
NeilBrown baab935ff3 [PATCH] knfsd: Convert sunrpc_cache to use krefs
.. it makes some of the code nicer.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:43 -08:00
NeilBrown ebd0cb1af3 [PATCH] knfsd: Unexport cache_fresh and fix a small race
Cache_fresh is now only used in cache.c, so unexport it.

Part of cache_fresh (setting CACHE_VALID) should really be done under the
lock, while part (calling cache_revisit_request etc) must be done outside the
lock.  So we split it up appropriately.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:43 -08:00
NeilBrown 4d90452cb2 [PATCH] knfsd: Remove DefineCacheLookup
This has been replaced by more traditional code.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:42 -08:00
NeilBrown 15a5f6bd23 [PATCH] knfsd: Create cache_lookup function instead of using a macro to declare one
The C++-like 'template' approach proves to be too ugly and hard to work with.

The old 'template' won't go away until all users are updated.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:41 -08:00
NeilBrown 7d317f2c9f [PATCH] knfsd: Get rid of 'inplace' sunrpc caches
These were an unnecessary wart.  Also only have one 'DefineSimpleCache..'
instead of two.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:41 -08:00