Commit Graph

213 Commits

Author SHA1 Message Date
Vasily Averin 9ac312888e sunrpc: fix debug message in svc_create_xprt()
_svc_create_xprt() returns positive port number
so its non-zero return value is not an error

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-12-27 21:01:41 -05:00
Vasily Averin d4b09acf92 sunrpc: use-after-free in svc_process_common()
if node have NFSv41+ mounts inside several net namespaces
it can lead to use-after-free in svc_process_common()

svc_process_common()
        /* Setup reply header */
        rqstp->rq_xprt->xpt_ops->xpo_prep_reply_hdr(rqstp); <<< HERE

svc_process_common() can use incorrect rqstp->rq_xprt,
its caller function bc_svc_process() takes it from serv->sv_bc_xprt.
The problem is that serv is global structure but sv_bc_xprt
is assigned per-netnamespace.

According to Trond, the whole "let's set up rqstp->rq_xprt
for the back channel" is nothing but a giant hack in order
to work around the fact that svc_process_common() uses it
to find the xpt_ops, and perform a couple of (meaningless
for the back channel) tests of xpt_flags.

All we really need in svc_process_common() is to be able to run
rqstp->rq_xprt->xpt_ops->xpo_prep_reply_hdr()

Bruce J Fields points that this xpo_prep_reply_hdr() call
is an awfully roundabout way just to do "svc_putnl(resv, 0);"
in the tcp case.

This patch does not initialiuze rqstp->rq_xprt in bc_svc_process(),
now it calls svc_process_common() with rqstp->rq_xprt = NULL.

To adjust reply header svc_process_common() just check
rqstp->rq_prot and calls svc_tcp_prep_reply_hdr() for tcp case.

To handle rqstp->rq_xprt = NULL case in functions called from
svc_process_common() patch intruduces net namespace pointer
svc_rqst->rq_bc_net and adjust SVC_NET() definition.
Some other function was also adopted to properly handle described case.

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Cc: stable@vger.kernel.org
Fixes: 23c20ecd44 ("NFS: callback up - users counting cleanup")
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-12-27 21:00:58 -05:00
Linus Torvalds 310c7585e8 Olga added support for the NFSv4.2 asynchronous copy protocol. We
already supported COPY, by copying a limited amount of data and then
 returning a short result, letting the client resend.  The asynchronous
 protocol should offer better performance at the expense of some
 complexity.
 
 The other highlight is Trond's work to convert the duplicate reply cache
 to a red-black tree, and to move it and some other server caches to RCU.
 (Previously these have meant taking global spinlocks on every RPC.)
 
 Otherwise, some RDMA work and miscellaneous bugfixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJb2KWzAAoJECebzXlCjuG+gcQP/3DldB86CFxgSFx0t+h+s+TV
 CdYJDPyLyRkEMiD+4dCPPuhueve+j5BPHVsDbn98FTWrEn131NMIs6uhU/VGTtAU
 6a8f/ExtZ5U7s39MJCzlk2ozVElBc3QPp7p3p9NKn0Wi0PXbVgjuIqR5o2vwa8Si
 KOVdLm6ylfav/HTH8DO6zFPJRsTgTwcJOivXXshjpglMKAcw8AuqSsGgBrDeGpgU
 u91Vi0EM1vt96+CA6a01mTgC/sFX7EqGvxUUHOrKWf5cIjnpT3FDvouYPxi+GH8Z
 SIDlaMQyXF5m4m6MhELNTP4v97XAHyPJtvLkEe5lggTyABPiA2heo9e8onysWkzV
 1v8OZHCVFa1UL34mDlnFxbFCYVr7FFKMGjTBR/ntinobPfAbWRCO1Hdd+bBGPDD4
 byf7ctDVp7KQ2bSatIdlYavikuGDHWFDZHzPHlqkD3gpIZSNvhe26sV3NZqIFlXO
 cMUega2Y5mXmULauHhxAcNGtDK7dF5hHoMWKJy0DNxiyDiDLylwDOIfwt1De3Q7V
 ycd/wUytUS2LkAhyS2mvoDK6eXTBAeQwzmXAqveh6rewwO83HC/t9mtKBBDomvKG
 xRpRPmmbj9ijbwkilEBmijjR47wrihmEVIFahznEerZ+//QOfVVOB0MNtzIyU9/k
 CnP1ZNvOs3LR1pxxwFa8
 =TTo0
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-4.20' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
 "Olga added support for the NFSv4.2 asynchronous copy protocol. We
  already supported COPY, by copying a limited amount of data and then
  returning a short result, letting the client resend. The asynchronous
  protocol should offer better performance at the expense of some
  complexity.

  The other highlight is Trond's work to convert the duplicate reply
  cache to a red-black tree, and to move it and some other server caches
  to RCU. (Previously these have meant taking global spinlocks on every
  RPC)

  Otherwise, some RDMA work and miscellaneous bugfixes"

* tag 'nfsd-4.20' of git://linux-nfs.org/~bfields/linux: (30 commits)
  lockd: fix access beyond unterminated strings in prints
  nfsd: Fix an Oops in free_session()
  nfsd: correctly decrement odstate refcount in error path
  svcrdma: Increase the default connection credit limit
  svcrdma: Remove try_module_get from backchannel
  svcrdma: Remove ->release_rqst call in bc reply handler
  svcrdma: Reduce max_send_sges
  nfsd: fix fall-through annotations
  knfsd: Improve lookup performance in the duplicate reply cache using an rbtree
  knfsd: Further simplify the cache lookup
  knfsd: Simplify NFS duplicate replay cache
  knfsd: Remove dead code from nfsd_cache_lookup
  SUNRPC: Simplify TCP receive code
  SUNRPC: Replace the cache_detail->hash_lock with a regular spinlock
  SUNRPC: Remove non-RCU protected lookup
  NFS: Fix up a typo in nfs_dns_ent_put
  NFS: Lockless DNS lookups
  knfsd: Lockless lookup of NFSv4 identities.
  SUNRPC: Lockless server RPCSEC_GSS context lookup
  knfsd: Allow lockless lookups of the exports
  ...
2018-10-30 13:03:29 -07:00
Trond Myklebust bb6ad5572c nfsd: Fix an Oops in free_session()
In call_xpt_users(), we delete the entry from the list, but we
do not reinitialise it. This triggers the list poisoning when
we later call unregister_xpt_user() in nfsd4_del_conns().

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-10-29 16:58:04 -04:00
Trond Myklebust c544577dad SUNRPC: Clean up transport write space handling
Treat socket write space handling in the same way we now treat transport
congestion: by denying the XPRT_LOCK until the transport signals that it
has free buffer space.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30 15:35:15 -04:00
Chuck Lever 55f5088c22 svc: Report xprt dequeue latency
Record the time between when a rqstp is enqueued on a transport
and when it is dequeued. This includes how long the rqstp waits on
the queue and how long it takes the kernel scheduler to wake a
nfsd thread to service it.

The svc_xprt_dequeue trace point is altered to include the number
of microseconds between xprt_enqueue and xprt_dequeue.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:13 -04:00
Chuck Lever aaba72cd4e sunrpc: Report per-RPC execution stats
Introduce a mechanism to report the server-side execution latency of
each RPC. The goal is to enable user space to filter the trace
record for latency outliers, build histograms, etc.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:12 -04:00
Chuck Lever ece200ddd5 sunrpc: Save remote presentation address in svc_xprt for trace events
TP_printk defines a format string that is passed to user space for
converting raw trace event records to something human-readable.

My user space's printf (Oracle Linux 7), however, does not have a
%pI format specifier. The result is that what is supposed to be an
IP address in the output of "trace-cmd report" is just a string that
says the field couldn't be displayed.

To fix this, adopt the same approach as the client: maintain a pre-
formated presentation address for occasions when %pI is not
available.

The location of the trace_svc_send trace point is adjusted so that
rqst->rq_xprt is not NULL when the trace event is recorded.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:11 -04:00
Chuck Lever 41f306d0c2 sunrpc: Simplify trace_svc_recv
There doesn't seem to be a lot of value in calling trace_svc_recv
in the failing case.

1. There are two very common cases: one is the transport is not
ready, and the other is shutdown. Neither is terribly interesting.

2. The trace record for the failing case contains nothing but
the status code.

Therefore the trace point call site in the error exit is removed.
Since the trace point is now recording a length instead of a
status, rename the status field and remove the case that records a
zero XID.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:11 -04:00
Chuck Lever 7dbb53baed sunrpc: Simplify do_enqueue tracing
There are three cases where svc_xprt_do_enqueue() returns without
waking an nfsd thread:

1. There is no work to do

2. The transport is already busy

3. There are no available nfsd threads

Only 3. is truly interesting. Move the trace point so it records
that there was work to do and either an nfsd thread was awoken, or
a free one could not found.

As an additional clean up, remove a redundant comment and a couple
of dprintk call sites.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:11 -04:00
Chuck Lever caa3e106dc sunrpc: Move trace_svc_xprt_dequeue()
Reduce the amount of noise generated by trace_svc_xprt_dequeue by
moving it to the end of svc_get_next_xprt. This generates exactly
one trace event when a ready xprt is found, rather than spurious
events when there is no work to do. The empty events contain no
information that can't be obtained simply by tracing function calls
to svc_xprt_dequeue.

A small additional benefit is simplification of the svc_xprt_event
trace class, which no longer has to handle the case when the @xprt
parameter is NULL.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:10 -04:00
Chuck Lever 989f881ebf svc: Simplify ->xpo_secure_port
Clean up: Instead of returning a value that is used to set or clear
a bit, just make ->xpo_secure_port mangle that bit, and return void.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:09 -04:00
Chuck Lever 63a1b15693 sunrpc: Remove unneeded pointer dereference
Clean up: Noticed during code inspection that there is already a
local automatic variable "xprt" so dereferencing rqst->rq_xprt
again is unnecessary.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:09 -04:00
Kees Cook 841b86f328 treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts
With all callbacks converted, and the timer callback prototype
switched over, the TIMER_FUNC_TYPE cast is no longer needed,
so remove it. Conversion was done with the following scripts:

    perl -pi -e 's|\(TIMER_FUNC_TYPE\)||g' \
        $(git grep TIMER_FUNC_TYPE | cut -d: -f1 | sort -u)

    perl -pi -e 's|\(TIMER_DATA_TYPE\)||g' \
        $(git grep TIMER_DATA_TYPE | cut -d: -f1 | sort -u)

The now unused macros are also dropped from include/linux/timer.h.

Signed-off-by: Kees Cook <keescook@chromium.org>
2017-11-21 16:35:54 -08:00
Linus Torvalds 4dd3c2e5a4 Lots of good bugfixes, including:
- fix a number of races in the NFSv4+ state code.
 	- fix some shutdown crashes in multiple-network-namespace cases.
 	- relax our 4.1 session limits; if you've an artificially low limit
 	  to the number of 4.1 clients that can mount simultaneously, try
 	  upgrading.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJaEH3oAAoJECebzXlCjuG++t0P/2t7RvRUunQa4pngCmg5QbOA
 rldfEd1HM1F6+4fXzN0wcxWjphUNxs19VjEaWNjThYoGGTEdSOuFhBHgK18xmHjp
 Cjz5IYJ0yS7PClCxMTmz5u3gfyExPR83whmNaNK69CGvn5xu97gDntOv/06Llw4Y
 nCUJrEmVcMAOHek3tOD0Rlv8eYFyfLhF6zacp+qWFIlymU118iK1Or83M7pi6j51
 yVVOvxktDLzkyDq5gQD/Py3rKHikOWFMCoseOPfMnOiGF/Bp7YDzWt6HT17mwyU4
 xDeICbnfqve2SwT9NChpJOYtUAPuZDiQR6G2ZtnI8/JN7ob/wls/4CbDVlzYFN4r
 dLsRlEC5spQmg34j6dscOKkt1vRK9vKXTC46wEMfXZLtiDLA/uZ/J0gNh3EXqpbt
 LQQZI4B2MomYPcp64i4UHHO8BqSIX+lC5otVlAW105TQvZflJ8Mhtawmpu1O3nXZ
 DSUhkZrImlBmb7/ulhjyXpmNAxQLXsqb0lP5tUYR5Re+A2lyea/pMJmtBLu3fv6h
 tzHqq2JL13kblqJY+Frc1zqQGI5AAyKmdTTjmljBIGHxbVwAMzk1qO+VOI/f+J21
 MWNmFkEqw+Tnvwy6sIm1eUGtTWIGc6ejvMxXguAfa+QjT4iHAL3F4PkpSihzIZnm
 bzHDeJ87HRWWj/ICPQ1j
 =PBs+
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-4.15' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
 "Lots of good bugfixes, including:

   -  fix a number of races in the NFSv4+ state code

   -  fix some shutdown crashes in multiple-network-namespace cases

   -  relax our 4.1 session limits; if you've an artificially low limit
      to the number of 4.1 clients that can mount simultaneously, try
      upgrading"

* tag 'nfsd-4.15' of git://linux-nfs.org/~bfields/linux: (22 commits)
  SUNRPC: Improve ordering of transport processing
  nfsd: deal with revoked delegations appropriately
  svcrdma: Enqueue after setting XPT_CLOSE in completion handlers
  nfsd: use nfs->ns.inum as net ID
  rpc: remove some BUG()s
  svcrdma: Preserve CB send buffer across retransmits
  nfds: avoid gettimeofday for nfssvc_boot time
  fs, nfsd: convert nfs4_file.fi_ref from atomic_t to refcount_t
  fs, nfsd: convert nfs4_cntl_odstate.co_odcount from atomic_t to refcount_t
  fs, nfsd: convert nfs4_stid.sc_count from atomic_t to refcount_t
  lockd: double unregister of inetaddr notifiers
  nfsd4: catch some false session retries
  nfsd4: fix cached replies to solo SEQUENCE compounds
  sunrcp: make function _svc_create_xprt static
  SUNRPC: Fix tracepoint storage issues with svc_recv and svc_rqst_status
  nfsd: use ARRAY_SIZE
  nfsd: give out fewer session slots as limit approaches
  nfsd: increase DRC cache limit
  nfsd: remove unnecessary nofilehandle checks
  nfs_common: convert int to bool
  ...
2017-11-18 11:22:04 -08:00
Trond Myklebust 22700f3c6d SUNRPC: Improve ordering of transport processing
Since it can take a while before a specific thread gets scheduled, it
is better to just implement a first come first served queue mechanism.
That way, if a thread is already scheduled and is idle, it can pick up
the work to do from the queue.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-11-07 16:44:03 -05:00
Colin Ian King da36e6dbf4 sunrcp: make function _svc_create_xprt static
The function _svc_create_xprt is local to the source and
does not need to be in global scope, so make it static.

Cleans up sparse warning:
symbol '_svc_create_xprt' was not declared. Should it be static?

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Jeff Layton <jlayton@poochiereds.net>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-11-07 16:43:57 -05:00
Kees Cook ff861c4d64 sunrpc: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: Anna Schumaker <anna.schumaker@netapp.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-nfs@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:40:27 +01:00
Chuck Lever 8c6ae4980e sunrpc: Allocate up to RPCSVC_MAXPAGES per svc_rqst
svcrdma needs 259 pages allocated to receive 1MB NFSv4.0 WRITE requests:

 - 1 page for the transport header and head iovec
 - 256 pages for the data payload
 - 1 page for the trailing GETATTR request (since NFSD XDR decoding
   does not look for a tail iovec, the GETATTR is stuck at the end
   of the rqstp->rq_arg.pages list)
 - 1 page for building the reply xdr_buf

But RPCSVC_MAXPAGES is already 259 (on x86_64). The problem is that
svc_alloc_arg never allocates that many pages. To address this:

1. The final element of rq_pages always points to NULL. To
   accommodate up to 259 pages in rq_pages, add an extra element
   to rq_pages for the array termination sentinel.

2. Adjust the calculation of "pages" to match how RPCSVC_MAXPAGES
   is calculated, so it can go up to 259. Bruce noted that the
   calculation assumes sv_max_mesg is a multiple of PAGE_SIZE,
   which might not always be true. I didn't change this assumption.

3. Change the loop boundaries to allow 259 pages to be allocated.

Additional clean-up: WARN_ON_ONCE adds an extra conditional branch,
which is basically never taken. And there's no need to dump the
stack here because svc_alloc_arg has only one caller.

Keeping that NULL "array termination sentinel"; there doesn't appear to
be any code that depends on it, only code in nfsd_splice_actor() which
needs the 259th element to be initialized to *something*.  So it's
possible we could just keep the array at 259 elements and drop that
final NULL, but we're being conservative for now.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-07-12 15:54:55 -04:00
Linus Torvalds 42e1b14b6e Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "The main changes in this cycle were:

   - Implement wraparound-safe refcount_t and kref_t types based on
     generic atomic primitives (Peter Zijlstra)

   - Improve and fix the ww_mutex code (Nicolai Hähnle)

   - Add self-tests to the ww_mutex code (Chris Wilson)

   - Optimize percpu-rwsems with the 'rcuwait' mechanism (Davidlohr
     Bueso)

   - Micro-optimize the current-task logic all around the core kernel
     (Davidlohr Bueso)

   - Tidy up after recent optimizations: remove stale code and APIs,
     clean up the code (Waiman Long)

   - ... plus misc fixes, updates and cleanups"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
  fork: Fix task_struct alignment
  locking/spinlock/debug: Remove spinlock lockup detection code
  lockdep: Fix incorrect condition to print bug msgs for MAX_LOCKDEP_CHAIN_HLOCKS
  lkdtm: Convert to refcount_t testing
  kref: Implement 'struct kref' using refcount_t
  refcount_t: Introduce a special purpose refcount type
  sched/wake_q: Clarify queue reinit comment
  sched/wait, rcuwait: Fix typo in comment
  locking/mutex: Fix lockdep_assert_held() fail
  locking/rtmutex: Flip unlikely() branch to likely() in __rt_mutex_slowlock()
  locking/rwsem: Reinit wake_q after use
  locking/rwsem: Remove unnecessary atomic_long_t casts
  jump_labels: Move header guard #endif down where it belongs
  locking/atomic, kref: Implement kref_put_lock()
  locking/ww_mutex: Turn off __must_check for now
  locking/atomic, kref: Avoid more abuse
  locking/atomic, kref: Use kref_get_unless_zero() more
  locking/atomic, kref: Kill kref_sub()
  locking/atomic, kref: Add kref_read()
  locking/atomic, kref: Add KREF_INIT()
  ...
2017-02-20 13:23:30 -08:00
Peter Zijlstra 2c935bc572 locking/atomic, kref: Add kref_read()
Since we need to change the implementation, stop exposing internals.

Provide kref_read() to read the current reference count; typically
used for debug messages.

Kills two anti-patterns:

	atomic_read(&kref->refcount)
	kref->refcount.counter

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-14 11:37:18 +01:00
Scott Mayhew 546125d161 sunrpc: don't call sleeping functions from the notifier block callbacks
The inet6addr_chain is an atomic notifier chain, so we can't call
anything that might sleep (like lock_sock)... instead of closing the
socket from svc_age_temp_xprts_now (which is called by the notifier
function), just have the rpc service threads do it instead.

Cc: stable@vger.kernel.org
Fixes: c3d4879e01 "sunrpc: Add a function to close..."
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-01-12 15:56:40 -05:00
Scott Mayhew ea08e39230 sunrpc: svc_age_temp_xprts_now should not call setsockopt non-tcp transports
This fixes the following panic that can occur with NFSoRDMA.

general protection fault: 0000 [#1] SMP
Modules linked in: rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi
scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp
scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm
mlx5_ib ib_core intel_powerclamp coretemp kvm_intel kvm sg ioatdma
ipmi_devintf ipmi_ssif dcdbas iTCO_wdt iTCO_vendor_support pcspkr
irqbypass sb_edac shpchp dca crc32_pclmul ghash_clmulni_intel edac_core
lpc_ich aesni_intel lrw gf128mul glue_helper ablk_helper mei_me mei
ipmi_si cryptd wmi ipmi_msghandler acpi_pad acpi_power_meter nfsd
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper
syscopyarea sysfillrect sysimgblt ahci fb_sys_fops ttm libahci mlx5_core
tg3 crct10dif_pclmul drm crct10dif_common
ptp i2c_core libata crc32c_intel pps_core fjes dm_mirror dm_region_hash
dm_log dm_mod
CPU: 1 PID: 120 Comm: kworker/1:1 Not tainted 3.10.0-514.el7.x86_64 #1
Hardware name: Dell Inc. PowerEdge R320/0KM5PX, BIOS 2.4.2 01/29/2015
Workqueue: events check_lifetime
task: ffff88031f506dd0 ti: ffff88031f584000 task.ti: ffff88031f584000
RIP: 0010:[<ffffffff8168d847>]  [<ffffffff8168d847>]
_raw_spin_lock_bh+0x17/0x50
RSP: 0018:ffff88031f587ba8  EFLAGS: 00010206
RAX: 0000000000020000 RBX: 20041fac02080072 RCX: ffff88031f587fd8
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 20041fac02080072
RBP: ffff88031f587bb0 R08: 0000000000000008 R09: ffffffff8155be77
R10: ffff880322a59b00 R11: ffffea000bf39f00 R12: 20041fac02080072
R13: 000000000000000d R14: ffff8800c4fbd800 R15: 0000000000000001
FS:  0000000000000000(0000) GS:ffff880322a40000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f3c52d4547e CR3: 00000000019ba000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
20041fac02080002 ffff88031f587bd0 ffffffff81557830 20041fac02080002
ffff88031f587c78 ffff88031f587c40 ffffffff8155ae08 000000010157df32
0000000800000001 ffff88031f587c20 ffffffff81096acb ffffffff81aa37d0
Call Trace:
[<ffffffff81557830>] lock_sock_nested+0x20/0x50
[<ffffffff8155ae08>] sock_setsockopt+0x78/0x940
[<ffffffff81096acb>] ? lock_timer_base.isra.33+0x2b/0x50
[<ffffffff8155397d>] kernel_setsockopt+0x4d/0x50
[<ffffffffa0386284>] svc_age_temp_xprts_now+0x174/0x1e0 [sunrpc]
[<ffffffffa03b681d>] nfsd_inetaddr_event+0x9d/0xd0 [nfsd]
[<ffffffff81691ebc>] notifier_call_chain+0x4c/0x70
[<ffffffff810b687d>] __blocking_notifier_call_chain+0x4d/0x70
[<ffffffff810b68b6>] blocking_notifier_call_chain+0x16/0x20
[<ffffffff815e8538>] __inet_del_ifa+0x168/0x2d0
[<ffffffff815e8cef>] check_lifetime+0x25f/0x270
[<ffffffff810a7f3b>] process_one_work+0x17b/0x470
[<ffffffff810a8d76>] worker_thread+0x126/0x410
[<ffffffff810a8c50>] ? rescuer_thread+0x460/0x460
[<ffffffff810b052f>] kthread+0xcf/0xe0
[<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
[<ffffffff81696418>] ret_from_fork+0x58/0x90
[<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
Code: ca 75 f1 5d c3 0f 1f 80 00 00 00 00 eb d9 66 0f 1f 44 00 00 0f 1f
44 00 00 55 48 89 e5 53 48 89 fb e8 7e 04 a0 ff b8 00 00 02 00 <f0> 0f
c1 03 89 c2 c1 ea 10 66 39 c2 75 03 5b 5d c3 83 e2 fe 0f
RIP  [<ffffffff8168d847>] _raw_spin_lock_bh+0x17/0x50
RSP <ffff88031f587ba8>

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Fixes: c3d4879e ("sunrpc: Add a function to close temporary transports immediately")
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-11-14 10:30:58 -05:00
Trond Myklebust f4a4906e56 SUNRPC: Remove unused callback xpo_adjust_wspace()
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-07-13 15:53:50 -04:00
Trond Myklebust ff3ac5c3dc SUNRPC: Add a server side per-connection limit
Allow the user to limit the number of requests serviced through a single
connection, to help prevent faster clients from starving slower clients.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-07-13 15:53:48 -04:00
Trond Myklebust 104f6351f7 SUNRPC: Add tracepoints for dropped and deferred requests
Dropping and/or deferring requests has an impact on performance. Let's
make sure we can trace those events.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-07-13 15:53:43 -04:00
Trond Myklebust 82ea2d7615 SUNRPC: Add a tracepoint for server socket out-of-space conditions
Add a tracepoint to track when the processing of incoming RPC data gets
deferred due to out-of-space issues on the outgoing transport.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-07-13 15:53:42 -04:00
J. Bruce Fields 39a9beab5a rpc: share one xps between all backchannels
The spec allows backchannels for multiple clients to share the same tcp
connection.  When that happens, we need to use the same xprt for all of
them.  Similarly, we need the same xps.

This fixes list corruption introduced by the multipath code.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: Trond Myklebust <trondmy@primarydata.com>
2016-06-15 10:32:25 -04:00
J. Bruce Fields d96b9c9398 svcrpc: autoload rdma module
This should fix failures like:

	# rpc.nfsd --rdma
	rpc.nfsd: Unable to request RDMA services: Protocol not supported

Reported-by: Steve Dickson <steved@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-05-23 10:55:24 -04:00
Scott Mayhew c3d4879e01 sunrpc: Add a function to close temporary transports immediately
Add a function svc_age_temp_xprts_now() to close temporary transports
whose xpt_local matches the address passed in server_addr immediately
instead of waiting for them to be closed by the timer function.

The function is intended to be used by notifier_blocks that will be
added to nfsd and lockd that will run when an ip address is deleted.

This will eliminate the ACK storms and client hangs that occur in
HA-NFS configurations where nfsd & lockd is left running on the cluster
nodes all the time and the NFS 'service' is migrated back and forth
within a short timeframe.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-12-23 10:08:15 -05:00
Jeff Layton b9e13cdfac nfsd/sunrpc: turn enqueueing a svc_xprt into a svc_serv operation
For now, all services use svc_xprt_do_enqueue, but once we add
workqueue-based service support, we'll need to do something different.

Signed-off-by: Shirley Ma <shirley.ma@oracle.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10 16:05:42 -04:00
Jeff Layton 3c5199143b sunrpc/lockd: fix references to the BKL
The BKL is completely out of the picture in the lockd and sunrpc code
these days. Update the antiquated comments that refer to it.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-01-23 10:29:12 -05:00
Jeff Layton acf06a7fa1 sunrpc: only call test_bit once in svc_xprt_received
...move the WARN_ON_ONCE inside the following if block since they use
the same condition.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-12-09 11:29:14 -05:00
Jeff Layton 83a712e0af sunrpc: add some tracepoints around enqueue and dequeue of svc_xprt
These were useful when I was tracking down a race condition between
svc_xprt_do_enqueue and svc_get_next_xprt.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-12-09 11:29:14 -05:00
Jeff Layton b1691bc03d sunrpc: convert to lockless lookup of queued server threads
Testing has shown that the pool->sp_lock can be a bottleneck on a busy
server. Every time data is received on a socket, the server must take
that lock in order to dequeue a thread from the sp_threads list.

Address this problem by eliminating the sp_threads list (which contains
threads that are currently idle) and replacing it with a RQ_BUSY flag in
svc_rqst. This allows us to walk the sp_all_threads list under the
rcu_read_lock and find a suitable thread for the xprt by doing a
test_and_set_bit.

Note that we do still have a potential atomicity problem however with
this approach.  We don't want svc_xprt_do_enqueue to set the
rqst->rq_xprt pointer unless a test_and_set_bit of RQ_BUSY returned
zero (which indicates that the thread was idle). But, by the time we
check that, the bit could be flipped by a waking thread.

To address this, we acquire a new per-rqst spinlock (rq_lock) and take
that before doing the test_and_set_bit. If that returns false, then we
can set rq_xprt and drop the spinlock. Then, when the thread wakes up,
it must set the bit under the same spinlock and can trust that if it was
already set then the rq_xprt is also properly set.

With this scheme, the case where we have an idle thread no longer needs
to take the highly contended pool->sp_lock at all, and that removes the
bottleneck.

That still leaves one issue: What of the case where we walk the whole
sp_all_threads list and don't find an idle thread? Because the search is
lockess, it's possible for the queueing to race with a thread that is
going to sleep. To address that, we queue the xprt and then search again.

If we find an idle thread at that point, we can't attach the xprt to it
directly since that might race with a different thread waking up and
finding it.  All we can do is wake the idle thread back up and let it
attempt to find the now-queued xprt.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Tested-by: Chris Worley <chris.worley@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-12-09 11:22:22 -05:00
Jeff Layton 403c7b4444 sunrpc: fix potential races in pool_stats collection
In a later patch, we'll be removing some spinlocking around the socket
and thread queueing code in order to fix some contention problems. At
that point, the stats counters will no longer be protected by the
sp_lock.

Change the counters to atomic_long_t fields, except for the
"sockets_queued" counter which will still be manipulated under a
spinlock.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Tested-by: Chris Worley <chris.worley@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-12-09 11:22:22 -05:00
Jeff Layton ceff739c53 sunrpc: have svc_wake_up only deal with pool 0
The way that svc_wake_up works is a bit inefficient. It walks all of the
available pools for a service and either wakes up a task in each one or
sets the SP_TASK_PENDING flag in each one.

When svc_wake_up is called, there is no need to wake up more than one
thread to do this work. In practice, only lockd currently uses this
function and it's single threaded anyway. Thus, this just boils down to
doing a wake up of a thread in pool 0 or setting a single flag.

Eliminate the for loop in this function and change it to just operate on
pool 0. Also update the comments that sit above it and get rid of some
code that has been commented out for years now.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-12-09 11:22:21 -05:00
Jeff Layton 4d5db3f536 sunrpc: convert sp_task_pending flag to use atomic bitops
In a later patch, we'll want to be able to handle this flag without
holding the sp_lock. Change this field to an unsigned long flags
field, and declare a new flag in it that can be managed with atomic
bitops.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-12-09 11:22:21 -05:00
Jeff Layton 78b65eb3fd sunrpc: move rq_dropme flag into rq_flags
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-12-09 11:22:20 -05:00
Jeff Layton 30660e04b0 sunrpc: move rq_usedeferral flag to rq_flags
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-12-09 11:22:20 -05:00
Jeff Layton 4d152e2c9a sunrpc: add a generic rq_flags field to svc_rqst and move rq_secure to it
In a later patch, we're going to need some atomic bit flags. Since that
field will need to be an unsigned long, we mitigate that space
consumption by migrating some other bitflags to the new field. Start
with the rq_secure flag.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-12-09 11:21:20 -05:00
J. Bruce Fields 2941b0e91b NFS client updates for Linux 3.19
Highlights include:
 
 Features:
 - NFSv4.2 client support for hole punching and preallocation.
 - Further RPC/RDMA client improvements.
 - Add more RPC transport debugging tracepoints.
 - Add RPC debugging tools in debugfs.
 
 Bugfixes:
 - Stable fix for layoutget error handling
 - Fix a change in COMMIT behaviour resulting from the recent io code updates
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUhRVTAAoJEGcL54qWCgDyfeUP/RoFo3ImTMbGxfcPJqoELjcO
 lZbQ+27pOE/whFDkWgiOVTwlgGct5a0WRo7GCZmpYJA4q1kmSv4ngTb3nMTCUztt
 xMJ0mBr0BqttVs+ouKiVPm3cejQXedEhttwWcloIXS8lNenlpL29Zlrx2NHdU8UU
 13+souocj0dwIyTYYS/4Lm9KpuCYnpDBpP5ShvQjVaMe/GxJo6GyZu70c7FgwGNz
 Nh9onzZV3mz1elhfizlV38aVA7KWVXtLWIqOFIKlT2fa4nWB8Hc07miR5UeOK0/h
 r+icnF2qCQe83MbjOxYNxIKB6uiA/4xwVc90X4AQ7F0RX8XPWHIQWG5tlkC9jrCQ
 3RGzYshWDc9Ud2mXtLMyVQxHVVYlFAe1WtdP8ZWb1oxDInmhrarnWeNyECz9xGKu
 VzIDZzeq9G8slJXATWGRfPsYr+Ihpzcen4QQw58cakUBcqEJrYEhlEOfLovM71k3
 /S/jSHBAbQqiw4LPMw87bA5A6+ZKcVSsNE0XCtNnhmqFpLc1kKRrl5vaN+QMk5tJ
 v4/zR0fPqH7SGAJWYs4brdfahyejEo0TwgpDs7KHmu1W9zQ0LCVTaYnQuUmQjta6
 WyYwIy3TTibdfR191O0E3NOW82Q/k/NBD6ySvabN9HqQ9eSk6+rzrWAslXCbYohb
 BJfzcQfDdx+lsyhjeTx9
 =wOP3
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.19-1' into nfsd for-3.19 branch

Mainly what I need is 860a0d9e51 "sunrpc: add some tracepoints in
svc_rqst handling functions", which subsequent server rpc patches from
jlayton depend on.  I'm merging this later tag on the assumption that's
more likely to be a tested and stable point.
2014-12-09 11:12:26 -05:00
Jeff Layton 8d65ef760d sunrpc: eliminate the XPT_DETACHED flag
All it does is indicate whether a xprt has already been deleted from
a list or not, which is unnecessary since we use list_del_init and it's
always set and checked under the sv_lock anyway.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-12-01 12:45:26 -07:00
Jeff Layton 860a0d9e51 sunrpc: add some tracepoints in svc_rqst handling functions
...just around svc_send, svc_recv and svc_process for now.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-11-24 12:53:34 -05:00
J. Bruce Fields ae89254da6 SUNRPC: Fix compile on non-x86
current_task appears to be x86-only, oops.

Let's just delete this check entirely:

Any developer that adds a new user without setting rq_task will get a
crash the first time they test it.  I also don't think there are
normally any important locks held here, and I can't see any other reason
why killing a server thread would bring the whole box down.

So the effort to fail gracefully here looks like overkill.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 983c684466 "SUNRPC: get rid of the request wait queue"
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-28 15:51:35 -04:00
Trond Myklebust 0c0746d03e SUNRPC: More optimisations of svc_xprt_enqueue()
Just move the transport locking out of the spin lock protected area
altogether.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-17 12:00:11 -04:00
Trond Myklebust a4aa8054a6 SUNRPC: Fix broken kthread_should_stop test in svc_get_next_xprt
We should definitely not be exiting svc_get_next_xprt() with the
thread enqueued. Fix this by ensuring that we fall through to
the dequeue.
Also move the test itself outside the spin lock protected section.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-17 12:00:11 -04:00
Trond Myklebust 983c684466 SUNRPC: get rid of the request wait queue
We're always _only_ waking up tasks from within the sp_threads list, so
we know that they are enqueued and alive. The rq_wait waitqueue is just
a distraction with extra atomic semantics.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-17 12:00:11 -04:00
Trond Myklebust 106f359cf4 SUNRPC: Do not grab pool->sp_lock unnecessarily in svc_get_next_xprt
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-17 12:00:10 -04:00
Trond Myklebust 9e5b208dc9 SUNRPC: Do not override wspace tests in svc_handle_xprt
We already determined that there was enough wspace when we
called svc_xprt_enqueue.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-17 12:00:10 -04:00
Trond Myklebust 518776800c SUNRPC: Allow svc_reserve() to notify TCP socket that space has been freed
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-07-29 16:10:20 -04:00
Trond Myklebust 0971374e28 SUNRPC: Reduce contention in svc_xprt_enqueue()
Ensure that all calls to svc_xprt_enqueue() except svc_xprt_received()
check the value of XPT_BUSY, before attempting to grab spinlocks etc.
This is to avoid situations such as the following "perf" trace,
which shows heavy contention on the pool spinlock:

    54.15%            nfsd  [kernel.kallsyms]        [k] _raw_spin_lock_bh
                      |
                      --- _raw_spin_lock_bh
                         |
                         |--71.43%-- svc_xprt_enqueue
                         |          |
                         |          |--50.31%-- svc_reserve
                         |          |
                         |          |--31.35%-- svc_xprt_received
                         |          |
                         |          |--18.34%-- svc_tcp_data_ready
...

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-07-29 16:10:15 -04:00
J. Bruce Fields 2825a7f907 nfsd4: allow encoding across page boundaries
After this we can handle for example getattr of very large ACLs.

Read, readdir, readlink are still special cases with their own limits.

Also we can't handle a new operation starting close to the end of a
page.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-30 17:31:54 -04:00
Trond Myklebust c789102c20 SUNRPC: Fix a module reference leak in svc_handle_xprt
If the accept() call fails, we need to put the module reference.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-22 15:57:22 -04:00
Chuck Lever 16e4d93f6d NFSD: Ignore client's source port on RDMA transports
An NFS/RDMA client's source port is meaningless for RDMA transports.
The transport layer typically sets the source port value on the
connection to a random ephemeral port.

Currently, NFS server administrators must specify the "insecure"
export option to enable clients to access exports via RDMA.

But this means NFS clients can access such an export via IP using an
ephemeral port, which may not be desirable.

This patch eliminates the need to specify the "insecure" export
option to allow NFS/RDMA clients access to an export.

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=250
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-22 15:55:48 -04:00
Rashika Kheria e1d83ee673 net: Mark functions as static in net/sunrpc/svc_xprt.c
Mark functions as static in net/sunrpc/svc_xprt.c because they are not
used outside this file.

This eliminates the following warning in net/sunrpc/svc_xprt.c:
net/sunrpc/svc_xprt.c:574:5: warning: no previous prototype for ‘svc_alloc_arg’ [-Wmissing-prototypes]
net/sunrpc/svc_xprt.c:615:18: warning: no previous prototype for ‘svc_get_next_xprt’ [-Wmissing-prototypes]
net/sunrpc/svc_xprt.c:694:6: warning: no previous prototype for ‘svc_add_new_temp_xprt’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-09 17:32:50 -08:00
J. Bruce Fields cc630d9f47 svcrpc: fix rpc server shutdown races
Rewrite server shutdown to remove the assumption that there are no
longer any threads running (no longer true, for example, when shutting
down the service in one network namespace while it's still running in
others).

Do that by doing what we'd do in normal circumstances: just CLOSE each
socket, then enqueue it.

Since there may not be threads to handle the resulting queued xprts,
also run a simplified version of the svc_recv() loop run by a server to
clean up any closed xprts afterwards.

Cc: stable@kernel.org
Tested-by: Jason Tibbitts <tibbs@math.uh.edu>
Tested-by: Paweł Sikora <pawel.sikora@agmk.net>
Acked-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-17 10:53:51 -05:00
J. Bruce Fields e75bafbff2 svcrpc: make svc_age_temp_xprts enqueue under sv_lock
svc_age_temp_xprts expires xprts in a two-step process: first it takes
the sv_lock and moves the xprts to expire off their server-wide list
(sv_tempsocks or sv_permsocks) to a local list.  Then it drops the
sv_lock and enqueues and puts each one.

I see no reason for this: svc_xprt_enqueue() will take sp_lock, but the
sv_lock and sp_lock are not otherwise nested anywhere (and documentation
at the top of this file claims it's correct to nest these with sp_lock
inside.)

Cc: stable@kernel.org
Tested-by: Jason Tibbitts <tibbs@math.uh.edu>
Tested-by: Paweł Sikora <pawel.sikora@agmk.net>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-02-17 10:53:16 -05:00
Andriy Skulysh 35525b7978 sunrpc: Fix lockd sleeping until timeout
There is a race in enqueueing thread to a pool and
waking up a thread.
lockd doesn't wake up on reception of lock granted callback
if svc_wake_up() is called before lockd's thread is added
to a pool.

Signed-off-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-01-23 18:17:39 -05:00
Weston Andros Adamson 0104729807 SUNRPC: remove BUG_ON in svc_delete_xprt
Replace BUG_ON() with WARN_ON_ONCE().

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-11-04 14:43:42 -05:00
Weston Andros Adamson b25cd058f2 SUNRPC: remove BUG_ONs checking RPCSVC_MAXPAGES
Replace two bounds checking BUG_ON() calls with WARN_ON_ONCE() and resetting
the requested size to RPCSVC_MAXPAGES.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-11-04 14:43:41 -05:00
Weston Andros Adamson ff1fdb9b80 SUNRPC: remove BUG_ON in svc_xprt_received
Replace BUG_ON() with a WARN_ON_ONCE() and early return.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-11-04 14:43:41 -05:00
J. Bruce Fields 65b2e6656b svcrpc: split up svc_handle_xprt
Move initialization of newly accepted socket into a helper.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-08-21 17:42:02 -04:00
J. Bruce Fields 6797fa5a01 svcrpc: break up svc_recv
Matter of taste, I suppose, but svc_recv breaks up naturally into:

	allocate pages and setup arg
	dequeue (wait for, if necessary) next socket
	do something with that socket

And I find it easier to read when it doesn't go on for pages and pages.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-08-21 17:42:01 -04:00
J. Bruce Fields 6741019c82 svcrpc: make svc_xprt_received static
Note this isn't used outside svc_xprt.c.

May as well move it so we don't need a declaration while we're here.

Also remove an outdated comment.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-08-21 17:42:01 -04:00
J. Bruce Fields 9f9d2ebe69 svcrpc: make xpo_recvfrom return only >=0
The only errors returned from xpo_recvfrom have been -EAGAIN and
-EAFNOSUPPORT.  The latter was removed by a previous patch.  That leaves
only -EAGAIN, which is treated just like 0 by the caller (svc_recv).

So, just ditch -EAGAIN and return 0 instead.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-08-21 17:41:07 -04:00
J. Bruce Fields 39b5530137 svcrpc: share some setup of listening sockets
There's some duplicate code here.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-08-21 17:07:48 -04:00
J. Bruce Fields c334196694 svcrpc: make svc_create_xprt enqueue on clearing XPT_BUSY
Whenever we clear XPT_BUSY we should call svc_xprt_enqueue().  Without
that we may fail to notice any events (such as new connections) that
arrived while XPT_BUSY was set.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-08-21 17:07:36 -04:00
J. Bruce Fields 719f8bcc88 svcrpc: fix xpt_list traversal locking on shutdown
Server threads are not running at this point, but svc_age_temp_xprts
still may be, so we need this locking.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-08-21 14:08:40 -04:00
J. Bruce Fields d10f27a750 svcrpc: fix svc_xprt_enqueue/svc_recv busy-looping
The rpc server tries to ensure that there will be room to send a reply
before it receives a request.

It does this by tracking, in xpt_reserved, an upper bound on the total
size of the replies that is has already committed to for the socket.

Currently it is adding in the estimate for a new reply *before* it
checks whether there is space available.  If it finds that there is not
space, it then subtracts the estimate back out.

This may lead the subsequent svc_xprt_enqueue to decide that there is
space after all.

The results is a svc_recv() that will repeatedly return -EAGAIN, causing
server threads to loop without doing any actual work.

Cc: stable@vger.kernel.org
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Tested-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-08-20 18:39:19 -04:00
J. Bruce Fields f06f00a24d svcrpc: sends on closed socket should stop immediately
svc_tcp_sendto sets XPT_CLOSE if we fail to transmit the entire reply.
However, the XPT_CLOSE won't be acted on immediately.  Meanwhile other
threads could send further replies before the socket is really shut
down.  This can manifest as data corruption: for example, if a truncated
read reply is followed by another rpc reply, that second reply will look
to the client like further read data.

Symptoms were data corruption preceded by svc_tcp_sendto logging
something like

	kernel: rpc-srv/tcp: nfsd: sent only 963696 when sending 1048708 bytes - shutting down socket

Cc: stable@vger.kernel.org
Reported-by: Malahal Naineni <malahal@us.ibm.com>
Tested-by: Malahal Naineni <malahal@us.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-08-20 18:38:59 -04:00
Linus Torvalds 419f431949 Merge branch 'for-3.5' of git://linux-nfs.org/~bfields/linux
Pull the rest of the nfsd commits from Bruce Fields:
 "... and then I cherry-picked the remainder of the patches from the
  head of my previous branch"

This is the rest of the original nfsd branch, rebased without the
delegation stuff that I thought really needed to be redone.

I don't like rebasing things like this in general, but in this situation
this was the lesser of two evils.

* 'for-3.5' of git://linux-nfs.org/~bfields/linux: (50 commits)
  nfsd4: fix, consolidate client_has_state
  nfsd4: don't remove rebooted client record until confirmation
  nfsd4: remove some dprintk's and a comment
  nfsd4: return "real" sequence id in confirmed case
  nfsd4: fix exchange_id to return confirm flag
  nfsd4: clarify that renewing expired client is a bug
  nfsd4: simpler ordering of setclientid_confirm checks
  nfsd4: setclientid: remove pointless assignment
  nfsd4: fix error return in non-matching-creds case
  nfsd4: fix setclientid_confirm same_cred check
  nfsd4: merge 3 setclientid cases to 2
  nfsd4: pull out common code from setclientid cases
  nfsd4: merge last two setclientid cases
  nfsd4: setclientid/confirm comment cleanup
  nfsd4: setclientid remove unnecessary terms from a logical expression
  nfsd4: move rq_flavor into svc_cred
  nfsd4: stricter cred comparison for setclientid/exchange_id
  nfsd4: move principal name into svc_cred
  nfsd4: allow removing clients not holding state
  nfsd4: rearrange exchange_id logic to simplify
  ...
2012-06-01 08:32:58 -07:00
J. Bruce Fields 3ddbe8794f svcrpc: fix a comment typo
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31 20:29:49 -04:00
Jeff Layton 91c427ac3a sunrpc: do array overrun check in svc_recv before allocating pages
There's little point in waiting until after we allocate all of the pages
to see if we're going to overrun the array. In the event that this
calculation is really off we could end up scribbling over a bunch of
memory and make it tougher to debug.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-05-31 20:29:41 -04:00
Joe Perches e87cc4728f net: Convert net_ratelimit uses to net_<level>_ratelimited
Standardize the net core ratelimited logging functions.

Coalesce formats, align arguments.
Change a printk then vprintk sequence to use printf extension %pV.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-05-15 13:45:03 -04:00
Stanislav Kinsbursky 7b147f1ff2 SUNRPC: service destruction in network namespace context
v2: Added comment to BUG_ON's in svc_destroy() to make code looks clearer.

This patch introduces network namespace filter for service destruction
function.
Nothing special here - just do exactly the same operations, but only for
tranports in passed networks namespace context.
BTW, BUG_ON() checks for empty service transports lists were returned into
svc_destroy() function. This is because of swithing generic svc_close_all() to
networks namespace dependable svc_close_net().

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-15 00:19:45 -05:00
Stanislav Kinsbursky 3a22bf506c SUNRPC: clear svc transports lists helper introduced
This patch moves service transports deletion from service sockets lists to
separated function.
This is a precursor patch, which would be usefull with service shutdown in
network namespace context, introduced later in the series.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-15 00:19:45 -05:00
Stanislav Kinsbursky 6f5133652e SUNRPC: clear svc pools lists helper introduced
This patch moves removing of service transport from it's pools ready lists to
separated function. Also this clear is now done with list_for_each_entry_safe()
helper.
This is a precursor patch, which would be usefull with service shutdown in
network namespace context, introduced later in the series.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-15 00:19:44 -05:00
Stanislav Kinsbursky 4cb54ca206 SUNRPC: search for service transports in network namespace context
Service transports are parametrized by network namespace. And thus lookup of
transport instance have to take network namespace into account.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>
2012-01-31 19:28:19 -05:00
Linus Torvalds 0b48d42235 Merge branch 'for-3.3' of git://linux-nfs.org/~bfields/linux
* 'for-3.3' of git://linux-nfs.org/~bfields/linux: (31 commits)
  nfsd4: nfsd4_create_clid_dir return value is unused
  NFSD: Change name of extended attribute containing junction
  svcrpc: don't revert to SVC_POOL_DEFAULT on nfsd shutdown
  svcrpc: fix double-free on shutdown of nfsd after changing pool mode
  nfsd4: be forgiving in the absence of the recovery directory
  nfsd4: fix spurious 4.1 post-reboot failures
  NFSD: forget_delegations should use list_for_each_entry_safe
  NFSD: Only reinitilize the recall_lru list under the recall lock
  nfsd4: initialize special stateid's at compile time
  NFSd: use network-namespace-aware cache registering routines
  SUNRPC: create svc_xprt in proper network namespace
  svcrpc: update outdated BKL comment
  nfsd41: allow non-reclaim open-by-fh's in 4.1
  svcrpc: avoid memory-corruption on pool shutdown
  svcrpc: destroy server sockets all at once
  svcrpc: make svc_delete_xprt static
  nfsd: Fix oops when parsing a 0 length export
  nfsd4: Use kmemdup rather than duplicating its implementation
  nfsd4: add a separate (lockowner, inode) lookup
  nfsd4: fix CONFIG_NFSD_FAULT_INJECTION compile error
  ...
2012-01-14 12:26:41 -08:00
Eric Dumazet dfd56b8b38 net: use IS_ENABLED(CONFIG_IPV6)
Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-11 18:25:16 -05:00
Stanislav Kinsbursky bd4620ddf6 SUNRPC: create svc_xprt in proper network namespace
This patch makes svc_xprt inherit network namespace link from its socket.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-12-06 16:20:42 -05:00
J. Bruce Fields b4f36f88b3 svcrpc: avoid memory-corruption on pool shutdown
Socket callbacks use svc_xprt_enqueue() to add an xprt to a
pool->sp_sockets list.  In normal operation a server thread will later
come along and take the xprt off that list.  On shutdown, after all the
threads have exited, we instead manually walk the sv_tempsocks and
sv_permsocks lists to find all the xprt's and delete them.

So the sp_sockets lists don't really matter any more.  As a result,
we've mostly just ignored them and hoped they would go away.

Which has gotten us into trouble; witness for example ebc63e531c
"svcrpc: fix list-corrupting race on nfsd shutdown", the result of Ben
Greear noticing that a still-running svc_xprt_enqueue() could re-add an
xprt to an sp_sockets list just before it was deleted.  The fix was to
remove it from the list at the end of svc_delete_xprt().  But that only
made corruption less likely--I can see nothing that prevents a
svc_xprt_enqueue() from adding another xprt to the list at the same
moment that we're removing this xprt from the list.  In fact, despite
the earlier xpo_detach(), I don't even see what guarantees that
svc_xprt_enqueue() couldn't still be running on this xprt.

So, instead, note that svc_xprt_enqueue() essentially does:
	lock sp_lock
		if XPT_BUSY unset
			add to sp_sockets
	unlock sp_lock

So, if we do:

	set XPT_BUSY on every xprt.
	Empty every sp_sockets list, under the sp_socks locks.

Then we're left knowing that the sp_sockets lists are all empty and will
stay that way, since any svc_xprt_enqueue() will check XPT_BUSY under
the sp_lock and see it set.

And *then* we can continue deleting the xprt's.

(Thanks to Jeff Layton for being correctly suspicious of this code....)

Cc: Ben Greear <greearb@candelatech.com>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-12-06 16:18:58 -05:00
J. Bruce Fields 2fefb8a09e svcrpc: destroy server sockets all at once
There's no reason I can see that we need to call sv_shutdown between
closing the two lists of sockets.

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-12-06 16:18:52 -05:00
J. Bruce Fields 7710ec36b6 svcrpc: make svc_delete_xprt static
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-12-06 16:18:51 -05:00
Paul Gortmaker 3a9a231d97 net: Fix files explicitly needing to include module.h
With calls to modular infrastructure, these files really
needs the full module.h header.  Call it out so some of the
cleanups of implicit and unrequired includes elsewhere can be
cleaned up.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:30:28 -04:00
Mi Jinlong 849a1cf13d SUNRPC: Replace svc_addr_u by sockaddr_storage
For IPv6 local address, lockd can not callback to client for
missing scope id when binding address at inet6_bind:

 324       if (addr_type & IPV6_ADDR_LINKLOCAL) {
 325               if (addr_len >= sizeof(struct sockaddr_in6) &&
 326                   addr->sin6_scope_id) {
 327                       /* Override any existing binding, if another one
 328                        * is supplied by user.
 329                        */
 330                       sk->sk_bound_dev_if = addr->sin6_scope_id;
 331               }
 332
 333               /* Binding to link-local address requires an interface */
 334               if (!sk->sk_bound_dev_if) {
 335                       err = -EINVAL;
 336                       goto out_unlock;
 337               }

Replacing svc_addr_u by sockaddr_storage, let rqstp->rq_daddr contains more info
besides address.

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-14 08:21:48 -04:00
J. Bruce Fields ebc63e531c svcrpc: fix list-corrupting race on nfsd shutdown
After commit 3262c816a3 "[PATCH] knfsd:
split svc_serv into pools", svc_delete_xprt (then svc_delete_socket) no
longer removed its xpt_ready (then sk_ready) field from whatever list it
was on, noting that there was no point since the whole list was about to
be destroyed anyway.

That was mostly true, but forgot that a few svc_xprt_enqueue()'s might
still be hanging around playing with the about-to-be-destroyed list, and
could get themselves into trouble writing to freed memory if we left
this xprt on the list after freeing it.

(This is actually functionally identical to a patch made first by Ben
Greear, but with more comments.)

Cc: stable@kernel.org
Cc: gnb@fmeh.org
Reported-by: Ben Greear <greearb@candelatech.com>
Tested-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-07-15 18:58:46 -04:00
J. Bruce Fields 99de8ea962 rpc: keep backchannel xprt as long as server connection
Multiple backchannels can share the same tcp connection; from rfc 5661 section
2.10.3.1:

	A connection's association with a session is not exclusive.  A
	connection associated with the channel(s) of one session may be
	simultaneously associated with the channel(s) of other sessions
	including sessions associated with other client IDs.

However, multiple backchannels share a connection, they must all share
the same xid stream (hence the same rpc_xprt); the only way we have to
match replies with calls at the rpc layer is using the xid.

So, keep the rpc_xprt around as long as the connection lasts, in case
we're asked to use the connection as a backchannel again.

Requests to create new backchannel clients over a given server
connection should results in creating new clients that reuse the
existing rpc_xprt.

But to start, just reject attempts to associate multiple rpc_xprt's with
the same underlying bc_xprt.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-01-11 15:04:10 -05:00
J. Bruce Fields 9e701c6109 svcrpc: simpler request dropping
Currently we use -EAGAIN returns to determine when to drop a deferred
request.  On its own, that is error-prone, as it makes us treat -EAGAIN
returns from other functions specially to prevent inadvertent dropping.

So, use a flag on the request instead.

Returning an error on request deferral is still required, to prevent
further processing, but we no longer need worry that an error return on
its own could result in a drop.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-01-04 16:49:22 -05:00
NeilBrown 7c96aef759 sunrpc: remove xpt_pool
The xpt_pool field is only used for reporting BUGs.
And it isn't used correctly.

In particular, when it is cleared in svc_xprt_received before
XPT_BUSY is cleared, there is no guarantee that either the
compiler or the CPU might not re-order to two assignments, just
setting xpt_pool to NULL after XPT_BUSY is cleared.

If a different cpu were running svc_xprt_enqueue at this moment,
it might see XPT_BUSY clear and then xpt_pool non-NULL, and
so BUG.

This could be fixed by calling
  smp_mb__before_clear_bit()
before the clear_bit.  However as xpt_pool isn't really used,
it seems safest to simply remove xpt_pool.

Another alternate would be to change the clear_bit to
clear_bit_unlock, and the test_and_set_bit to test_and_set_bit_lock.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-12-17 15:48:18 -05:00
J. Bruce Fields ec66ee3797 Merge commit 'v2.6.37-rc6' into for-2.6.38 2010-12-17 13:29:07 -05:00
NeilBrown ed2849d3ec sunrpc: prevent use-after-free on clearing XPT_BUSY
When an xprt is created, it has a refcount of 1, and XPT_BUSY is set.
The refcount is *not* owned by the thread that created the xprt
(as is clear from the fact that creators never put the reference).
Rather, it is owned by the absence of XPT_DEAD.  Once XPT_DEAD is set,
(And XPT_BUSY is clear) that initial reference is dropped and the xprt
can be freed.

So when a creator clears XPT_BUSY it is dropping its only reference and
so must not touch the xprt again.

However svc_recv, after calling ->xpo_accept (and so getting an XPT_BUSY
reference on a new xprt), calls svc_xprt_recieved.  This clears
XPT_BUSY and then svc_xprt_enqueue - this last without owning a reference.
This is dangerous and has been seen to leave svc_xprt_enqueue working
with an xprt containing garbage.

So we need to hold an extra counted reference over that call to
svc_xprt_received.

For safety, any time we clear XPT_BUSY and then use the xprt again, we
first get a reference, and the put it again afterwards.

Note that svc_close_all does not need this extra protection as there are
no threads running, and the final free can only be called asynchronously
from such a thread.

Signed-off-by: NeilBrown <neilb@suse.de>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-12-07 20:39:55 -05:00
J. Bruce Fields 9c335c0b8d svcrpc: fix wspace-checking race
We call svc_xprt_enqueue() after something happens which we think may
require handling from a server thread.  To avoid such events being lost,
svc_xprt_enqueue() must guarantee that there will be a svc_serv() call
from a server thread following any such event.  It does that by either
waking up a server thread itself, or checking that XPT_BUSY is set (in
which case somebody else is doing it).

But the check of XPT_BUSY could occur just as someone finishes
processing some other event, and just before they clear XPT_BUSY.

Therefore it's important not to clear XPT_BUSY without subsequently
doing another svc_export_enqueue() to check whether the xprt should be
requeued.

The xpo_wspace() check in svc_xprt_enqueue() breaks this rule, allowing
an event to be missed in situations like:

	data arrives
	call svc_tcp_data_ready():
	call svc_xprt_enqueue():
	set BUSY
	find no write space
				svc_reserve():
				free up write space
				call svc_enqueue():
				test BUSY
	clear BUSY

So, instead, check wspace in the same places that the state flags are
checked: before taking BUSY, and in svc_receive().

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-11-19 18:35:12 -05:00
J. Bruce Fields b176331627 svcrpc: svc_close_xprt comment
Neil Brown had to explain to me why we do this here; record the answer
for posterity.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-11-19 18:35:12 -05:00
J. Bruce Fields f8c0d226fe svcrpc: simplify svc_close_all
There's no need to be fooling with XPT_BUSY now that all the threads
are gone.

The list_del_init() here could execute at the same time as the
svc_xprt_enqueue()'s list_add_tail(), with undefined results.  We don't
really care at this point, but it might result in a spurious
list-corruption warning or something.

And svc_close() isn't adding any value; just call svc_delete_xprt()
directly.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-11-19 18:35:11 -05:00
J. Bruce Fields ca7896cd83 nfsd4: centralize more calls to svc_xprt_received
Follow up on b48fa6b991 by moving all the
svc_xprt_received() calls for the main xprt to one place.  The clearing
of XPT_BUSY here is critical to the correctness of the server, so I'd
prefer it to be obvious where we do it.

The only substantive result is moving svc_xprt_received() after
svc_receive_deferred().  Other than a (likely insignificant) delay
waking up the next thread, that should be harmless.

Also reshuffle the exit code a little to skip a few other steps that we
don't care about the in the svc_delete_xprt() case.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-11-19 18:35:11 -05:00
J. Bruce Fields 62bac4af3d svcrpc: don't set then immediately clear XPT_DEFERRED
There's no harm to doing this, since the only caller will immediately
call svc_enqueue() afterwards, ensuring we don't miss the remaining
deferred requests just because XPT_DEFERRED was briefly cleared.

But why not just do this the simple way?

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-11-19 18:35:11 -05:00
Arnd Bergmann 451a3c24b0 BKL: remove extraneous #include <smp_lock.h>
The big kernel lock has been removed from all these files at some point,
leaving only the #include.

Remove this too as a cleanup.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-17 08:59:32 -08:00
J. Bruce Fields 01dba075d5 svcrpc: no need for XPT_DEAD check in svc_xprt_enqueue
If any xprt marked DEAD is also left BUSY for the rest of its life, then
the XPT_DEAD check here is superfluous--we'll get the same result from
the XPT_BUSY check just after.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-10-25 17:59:33 -04:00