Log ack.rwind in the rxrpc_tx_ack tracepoint. This value is useful to see
as it represents flow-control information to the peer.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Move the connection setup of client calls to the I/O thread so that a whole
load of locking and barrierage can be eliminated. This necessitates the
app thread waiting for connection to complete before it can begin
encrypting data.
This also completes the fix for a race that exists between call connection
and call disconnection whereby the data transmission code adds the call to
the peer error distribution list after the call has been disconnected (say
by the rxrpc socket getting closed).
The fix is to complete the process of moving call connection, data
transmission and call disconnection into the I/O thread and thus forcibly
serialising them.
Note that the issue may predate the overhaul to an I/O thread model that
were included in the merge window for v6.2, but the timing is very much
changed by the change given below.
Fixes: cf37b59875 ("rxrpc: Move DATA transmission into call processor work item")
Reported-by: syzbot+c22650d2844392afdcfd@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
All the setters of call->state are now in the I/O thread and thus the state
lock is now unnecessary.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Offload the completion of the challenge/response cycle on a service
connection to the I/O thread. After the RESPONSE packet has been
successfully decrypted and verified by the work queue, offloading the
changing of the call states to the I/O thread makes iteration over the
conn's channel list simpler.
Do this by marking the RESPONSE skbuff and putting it onto the receive
queue for the I/O thread to collect. We put it on the front of the queue
as we've already received the packet for it.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Tidy up the abort generation infrastructure in the following ways:
(1) Create an enum and string mapping table to list the reasons an abort
might be generated in tracing.
(2) Replace the 3-char string with the values from (1) in the places that
use that to log the abort source. This gets rid of a memcpy() in the
tracepoint.
(3) Subsume the rxrpc_rx_eproto tracepoint with the rxrpc_abort tracepoint
and use values from (1) to indicate the trace reason.
(4) Always make a call to an abort function at the point of the abort
rather than stashing the values into variables and using goto to get
to a place where it reported. The C optimiser will collapse the calls
together as appropriate. The abort functions return a value that can
be returned directly if appropriate.
Note that this extends into afs also at the points where that generates an
abort. To aid with this, the afs sources need to #define
RXRPC_TRACE_ONLY_DEFINE_ENUMS before including the rxrpc tracing header
because they don't have access to the rxrpc internal structures that some
of the tracepoints make use of.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Clean up connection abort, using the connection state_lock to gate access
to change that state, and use an rxrpc_call_completion value to indicate
the difference between local and remote aborts as these can be pasted
directly into the call state.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Provide a means by which an event notification can be sent to a connection
through such that the I/O thread can pick it up and handle it rather than
doing it in a separate workqueue.
This is then used to move the deferred final ACK of a call into the I/O
thread rather than a separate work queue as part of the drive to do all
transmission from the I/O thread.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Call the rxrpc_conn_retransmit_call() directly from rxrpc_input_packet()
rather than calling it via connection event handling.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
None of the spinlocks in rxrpc need a _bh annotation now as the RCU
callback routines no longer take spinlocks and the bulk of the packet
wrangling code is now run in the I/O thread, not softirq context.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Move the functions from the call->processor and local->processor work items
into the domain of the I/O thread.
The call event processor, now called from the I/O thread, then takes over
the job of cranking the call state machine, processing incoming packets and
transmitting DATA, ACK and ABORT packets. In a future patch,
rxrpc_send_ACK() will transmit the ACK on the spot rather than queuing it
for later transmission.
The call event processor becomes purely received-skb driven. It only
transmits things in response to events. We use "pokes" to queue a dummy
skb to make it do things like start/resume transmitting data. Timer expiry
also results in pokes.
The connection event processor, becomes similar, though crypto events, such
as dealing with CHALLENGE and RESPONSE packets is offloaded to a work item
to avoid doing crypto in the I/O thread.
The local event processor is removed and VERSION response packets are
generated directly from the packet parser. Similarly, ABORTs generated in
response to protocol errors will be transmitted immediately rather than
being pushed onto a queue for later transmission.
Changes:
========
ver #2)
- Fix a couple of introduced lock context imbalances.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Currently, rxrpc gives the connection's work item a ref on the connection
when it queues it - and this is called from the timer expiration function.
The problem comes when queue_work() fails (ie. the work item is already
queued): the timer routine must put the ref - but this may cause the
cleanup code to run.
This has the unfortunate effect that the cleanup code may then be run in
softirq context - which means that any spinlocks it might need to touch
have to be guarded to disable softirqs (ie. they need a "_bh" suffix).
(1) Don't give a ref to the work item.
(2) Simplify handling of service connections by adding a separate active
count so that the refcount isn't also used for this.
(3) Connection destruction for both client and service connections can
then be cleaned up by putting rxrpc_put_connection() out of line and
making a tidy progression through the destruction code (offloaded to a
workqueue if put from softirq or processor function context). The RCU
part of the cleanup then only deals with the freeing at the end.
(4) Make rxrpc_queue_conn() return immediately if it sees the active count
is -1 rather then queuing the connection.
(5) Make sure that the cleanup routine waits for the work item to
complete.
(6) Stash the rxrpc_net pointer in the conn struct so that the rcu free
routine can use it, even if the local endpoint has been freed.
Unfortunately, neither the timer nor the work item can simply get around
the problem by just using refcount_inc_not_zero() as the waits would still
have to be done, and there would still be the possibility of having to put
the ref in the expiration function.
Note the connection work item is mostly going to go away with the main
event work being transferred to the I/O thread, so the wait in (6) will
become obsolete.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
In rxrpc tracing, use enums to generate lists of points of interest rather
than __builtin_return_address() for the sk_buff tracepoint.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
In rxrpc tracing, use enums to generate lists of points of interest rather
than __builtin_return_address() for the rxrpc_conn tracepoint
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
In rxrpc tracing, use enums to generate lists of points of interest rather
than __builtin_return_address() for the rxrpc_local tracepoint
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Extract the code from a received rx ABORT packet much earlier and in a
single place and harmonise the responses to malformed ABORT packets.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Remove the rxrpc_conn_parameters struct from the rxrpc_connection and
rxrpc_bundle structs and emplace the members directly. These are going to
get filled in from the rxrpc_call struct in future.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Remove the kproto() and _proto() debugging macros in preference to using
tracepoints for this.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Merge the ->prime_packet_security() into the ->init_connection_security()
hook as they're always called together.
Signed-off-by: David Howells <dhowells@redhat.com>
Don't retain a pointer to the server key in the connection, but rather get
it on demand when the server has to deal with a response packet.
This is necessary to implement RxGK (GSSAPI-mediated transport class),
where we can't know which key we'll need until we've challenged the client
and got back the response.
This also means that we don't need to do a key search in the accept path in
softirq mode.
Also, whilst we're at it, allow the security class to ask for a kvno and
encoding-type variant of a server key as RxGK needs different keys for
different encoding types. Keys of this type have an extra bit in the
description:
"<service-id>:<security-index>:<kvno>:<enctype>"
Signed-off-by: David Howells <dhowells@redhat.com>
rxrpc-type keys can have multiple tokens attached for different security
classes. Currently, rxrpc always picks the first one, whether or not the
security class it indicates is supported.
Add preliminary support for choosing which security class will be used
(this will need to be directed from a higher layer) and go through the
tokens to find one that's supported.
Signed-off-by: David Howells <dhowells@redhat.com>
Fix the loss of transmission of a call's final ack when a socket gets shut
down. This means that the server will retransmit the last data packet or
send a ping ack and then get an ICMP indicating the port got closed. The
server will then view this as a failure.
Fixes: 3136ef49a1 ("rxrpc: Delay terminal ACK transmission on a client call")
Signed-off-by: David Howells <dhowells@redhat.com>
Small conflict around locking in rxrpc_process_event() -
channel_lock moved to bundle in next, while state lock
needs _bh() from net.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When a new incoming call arrives at an userspace rxrpc socket on a new
connection that has a security class set, the code currently pushes it onto
the accept queue to hold a ref on it for the socket. This doesn't work,
however, as recvmsg() pops it off, notices that it's in the SERVER_SECURING
state and discards the ref. This means that the call runs out of refs too
early and the kernel oopses.
By contrast, a kernel rxrpc socket manually pre-charges the incoming call
pool with calls that already have user call IDs assigned, so they are ref'd
by the call tree on the socket.
Change the mode of operation for userspace rxrpc server sockets to work
like this too. Although this is a UAPI change, server sockets aren't
currently functional.
Fixes: 248f219cb8 ("rxrpc: Rewrite the data and ack handling code")
Signed-off-by: David Howells <dhowells@redhat.com>
conn->state_lock may be taken in softirq mode, but a previous patch
replaced an outer lock in the response-packet event handling code, and lost
the _bh from that when doing so.
Fix this by applying the _bh annotation to the state_lock locking.
Fixes: a1399f8bb0 ("rxrpc: Call channels should have separate call number spaces")
Signed-off-by: David Howells <dhowells@redhat.com>
Rewrite the rxrpc client connection manager so that it can support multiple
connections for a given security key to a peer. The following changes are
made:
(1) For each open socket, the code currently maintains an rbtree with the
connections placed into it, keyed by communications parameters. This
is tricky to maintain as connections can be culled from the tree or
replaced within it. Connections can require replacement for a number
of reasons, e.g. their IDs span too great a range for the IDR data
type to represent efficiently, the call ID numbers on that conn would
overflow or the conn got aborted.
This is changed so that there's now a connection bundle object placed
in the tree, keyed on the same parameters. The bundle, however, does
not need to be replaced.
(2) An rxrpc_bundle object can now manage the available channels for a set
of parallel connections. The lock that manages this is moved there
from the rxrpc_connection struct (channel_lock).
(3) There'a a dummy bundle for all incoming connections to share so that
they have a channel_lock too. It might be better to give each
incoming connection its own bundle. This bundle is not needed to
manage which channels incoming calls are made on because that's the
solely at whim of the client.
(4) The restrictions on how many client connections are around are
removed. Instead, a previous patch limits the number of client calls
that can be allocated. Ordinarily, client connections are reaped
after 2 minutes on the idle queue, but when more than a certain number
of connections are in existence, the reaper starts reaping them after
2s of idleness instead to get the numbers back down.
It could also be made such that new call allocations are forced to
wait until the number of outstanding connections subsides.
Signed-off-by: David Howells <dhowells@redhat.com>
Under some circumstances, rxrpc will fail a transmit a packet through the
underlying UDP socket (ie. UDP sendmsg returns an error). This may result
in a call getting stuck.
In the instance being seen, where AFS tries to send a probe to the Volume
Location server, tracepoints show the UDP Tx failure (in this case returing
error 99 EADDRNOTAVAIL) and then nothing more:
afs_make_vl_call: c=0000015d VL.GetCapabilities
rxrpc_call: c=0000015d NWc u=1 sp=rxrpc_kernel_begin_call+0x106/0x170 [rxrpc] a=00000000dd89ee8a
rxrpc_call: c=0000015d Gus u=2 sp=rxrpc_new_client_call+0x14f/0x580 [rxrpc] a=00000000e20e4b08
rxrpc_call: c=0000015d SEE u=2 sp=rxrpc_activate_one_channel+0x7b/0x1c0 [rxrpc] a=00000000e20e4b08
rxrpc_call: c=0000015d CON u=2 sp=rxrpc_kernel_begin_call+0x106/0x170 [rxrpc] a=00000000e20e4b08
rxrpc_tx_fail: c=0000015d r=1 ret=-99 CallDataNofrag
The problem is that if the initial packet fails and the retransmission
timer hasn't been started, the call is set to completed and an error is
returned from rxrpc_send_data_packet() to rxrpc_queue_packet(). Though
rxrpc_instant_resend() is called, this does nothing because the call is
marked completed.
So rxrpc_notify_socket() isn't called and the error is passed back up to
rxrpc_send_data(), rxrpc_kernel_send_data() and thence to afs_make_call()
and afs_vl_get_capabilities() where it is simply ignored because it is
assumed that the result of a probe will be collected asynchronously.
Fileserver probing is similarly affected via afs_fs_get_capabilities().
Fix this by always issuing a notification in __rxrpc_set_call_completion()
if it shifts a call to the completed state, even if an error is also
returned to the caller through the function return value.
Also put in a little bit of optimisation to avoid taking the call
state_lock and disabling softirqs if the call is already in the completed
state and remove some now redundant rxrpc_notify_socket() calls.
Fixes: f5c17aaeb2 ("rxrpc: Calls should only have one terminal state")
Reported-by: Gerry Seidman <gerry@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
The introduction of a split between the reference count on rxrpc_local
objects and the usage count didn't quite go far enough. A number of kernel
work items need to make use of the socket to perform transmission. These
also need to get an active count on the local object to prevent the socket
from being closed.
Fix this by getting the active count in those places.
Also split out the raw active count get/put functions as these places tend
to hold refs on the rxrpc_local object already, so getting and putting an
extra object ref is just a waste of time.
The problem can lead to symptoms like:
BUG: kernel NULL pointer dereference, address: 0000000000000018
..
CPU: 2 PID: 818 Comm: kworker/u9:0 Not tainted 5.5.0-fscache+ #51
...
RIP: 0010:selinux_socket_sendmsg+0x5/0x13
...
Call Trace:
security_socket_sendmsg+0x2c/0x3e
sock_sendmsg+0x1a/0x46
rxrpc_send_keepalive+0x131/0x1ae
rxrpc_peer_keepalive_worker+0x219/0x34b
process_one_work+0x18e/0x271
worker_thread+0x1a3/0x247
kthread+0xe6/0xeb
ret_from_fork+0x1f/0x30
Fixes: 730c5fd42c ("rxrpc: Fix local endpoint refcounting")
Signed-off-by: David Howells <dhowells@redhat.com>
Fix rxrpc_new_incoming_call() to check that we have a suitable service key
available for the combination of service ID and security class of a new
incoming call - and to reject calls for which we don't.
This causes an assertion like the following to appear:
rxrpc: Assertion failed - 6(0x6) == 12(0xc) is false
kernel BUG at net/rxrpc/call_object.c:456!
Where call->state is RXRPC_CALL_SERVER_SECURING (6) rather than
RXRPC_CALL_COMPLETE (12).
Fixes: 248f219cb8 ("rxrpc: Rewrite the data and ack handling code")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Use the previously-added transmit-phase skbuff private flag to simplify the
socket buffer tracing a bit. Which phase the skbuff comes from can now be
divined from the skb rather than having to be guessed from the call state.
We can also reduce the number of rxrpc_skb_trace values by eliminating the
difference between Tx and Rx in the symbols.
Signed-off-by: David Howells <dhowells@redhat.com>
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-or-later
has been chosen to replace the boilerplate/reference in 3029 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Trace received calls that are aborted due to a connection abort, typically
because of authentication failure. Without this, connection aborts don't
show up in the trace log.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix connection-level abort handling to cache the abort and error codes
properly so that a new incoming call can be properly aborted if it races
with the parent connection being aborted by another CPU.
The abort_code and error parameters can then be dropped from
rxrpc_abort_calls().
Fixes: f5c17aaeb2 ("rxrpc: Calls should only have one terminal state")
Signed-off-by: David Howells <dhowells@redhat.com>
AF_RXRPC has a keepalive message generator that generates a message for a
peer ~20s after the last transmission to that peer to keep firewall ports
open. The implementation is incorrect in the following ways:
(1) It mixes up ktime_t and time64_t types.
(2) It uses ktime_get_real(), the output of which may jump forward or
backward due to adjustments to the time of day.
(3) If the current time jumps forward too much or jumps backwards, the
generator function will crank the base of the time ring round one slot
at a time (ie. a 1s period) until it catches up, spewing out VERSION
packets as it goes.
Fix the problem by:
(1) Only using time64_t. There's no need for sub-second resolution.
(2) Use ktime_get_seconds() rather than ktime_get_real() so that time
isn't perceived to go backwards.
(3) Simplifying rxrpc_peer_keepalive_worker() by splitting it into two
parts:
(a) The "worker" function that manages the buckets and the timer.
(b) The "dispatch" function that takes the pending peers and
potentially transmits a keepalive packet before putting them back
in the ring into the slot appropriate to the revised last-Tx time.
(4) Taking everything that's pending out of the ring and splicing it into
a temporary collector list for processing.
In the case that there's been a significant jump forward, the ring
gets entirely emptied and then the time base can be warped forward
before the peers are processed.
The warping can't happen if the ring isn't empty because the slot a
peer is in is keepalive-time dependent, relative to the base time.
(5) Limit the number of iterations of the bucket array when scanning it.
(6) Set the timer to skip any empty slots as there's no point waking up if
there's nothing to do yet.
This can be triggered by an incoming call from a server after a reboot with
AF_RXRPC and AFS built into the kernel causing a peer record to be set up
before userspace is started. The system clock is then adjusted by
userspace, thereby potentially causing the keepalive generator to have a
meltdown - which leads to a message like:
watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/0:1:23]
...
Workqueue: krxrpcd rxrpc_peer_keepalive_worker
EIP: lock_acquire+0x69/0x80
...
Call Trace:
? rxrpc_peer_keepalive_worker+0x5e/0x350
? _raw_spin_lock_bh+0x29/0x60
? rxrpc_peer_keepalive_worker+0x5e/0x350
? rxrpc_peer_keepalive_worker+0x5e/0x350
? __lock_acquire+0x3d3/0x870
? process_one_work+0x110/0x340
? process_one_work+0x166/0x340
? process_one_work+0x110/0x340
? worker_thread+0x39/0x3c0
? kthread+0xdb/0x110
? cancel_delayed_work+0x90/0x90
? kthread_stop+0x70/0x70
? ret_from_fork+0x19/0x24
Fixes: ace45bec6d ("rxrpc: Fix firewall route keepalive")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trace successful packet transmission (kernel_sendmsg() succeeded, that is)
in AF_RXRPC. We can share the enum that defines the transmission points
with the trace_rxrpc_tx_fail() tracepoint, so rename its constants to be
applicable to both.
Also, save the internal call->debug_id in the rxrpc_channel struct so that
it can be used in retransmission trace lines.
Signed-off-by: David Howells <dhowells@redhat.com>
When retransmitting the final ACK or ABORT packet for a call, the cid field
in the packet header is set to the connection's cid, but this is incorrect
as it also needs to include the channel number on that connection that the
call was made on.
Fix this by OR'ing in the channel number.
Note that this fixes the bug that:
commit 1a025028d4
rxrpc: Fix handling of call quietly cancelled out on server
works around. I'm not intending to revert that as it will help protect
against problems that might occur on the server.
Fixes: 3136ef49a1 ("rxrpc: Delay terminal ACK transmission on a client call")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the firewall route keepalive part of AF_RXRPC which is currently
function incorrectly by replying to VERSION REPLY packets from the server
with VERSION REQUEST packets.
Instead, send VERSION REPLY packets to the peers of service connections to
act as keep-alives 20s after the latest packet was transmitted to that
peer.
Also, just discard VERSION REPLY packets rather than replying to them.
Signed-off-by: David Howells <dhowells@redhat.com>
In rxrpc and afs, use the debug_ids that are monotonically allocated to
various objects as they're allocated rather than pointers as kernel
pointers are now hashed making them less useful. Further, the debug ids
aren't reused anywhere nearly as quickly.
In addition, allow kernel services that use rxrpc, such as afs, to take
numbers from the rxrpc counter, assign them to their own call struct and
pass them in to rxrpc for both client and service calls so that the trace
lines for each will have the same ID tag.
Signed-off-by: David Howells <dhowells@redhat.com>
Repeat terminal ACKs and now terminal ACKs are now generated from the
connection event processor rather from call handling as this allows us to
discard client call structures as soon as possible and free up the channel
for a follow on call.
However, in ACKs so generated, the additional information trailer is
malformed because the padding that's meant to be in the middle isn't
included in what's transmitted.
Fix it so that the 3 bytes of padding are included in the transmission.
Further, the trailer is misaligned because of the padding, so assigment to
the u16 and u32 fields inside it might cause problems on some arches, so
fix this by breaking the padding and the trailer out of the packed struct.
(This also deals with potential compiler weirdies where some of the nested
structs are packed and some aren't).
The symptoms can be seen in wireshark as terminal DUPLICATE or IDLE ACK
packets in which the Max MTU, Interface MTU and rwind fields have weird
values and the Max Packets field is apparently missing.
Reported-by: Jeffrey Altman <jaltman@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Delay terminal ACK transmission on a client call by deferring it to the
connection processor. This allows it to be skipped if we can send the next
call instead, the first DATA packet of which will implicitly ack this call.
Signed-off-by: David Howells <dhowells@redhat.com>
Keep the rxrpc_connection struct's idea of the service ID that is exposed
in the protocol separate from the service ID that's used as a lookup key.
This allows the protocol service ID on a client connection to get upgraded
without making the connection unfindable for other client calls that also
would like to use the upgraded connection.
The connection's actual service ID is then returned through recvmsg() by
way of msg_name.
Whilst we're at it, we get rid of the last_service_id field from each
channel. The service ID is per-connection, not per-call and an entire
connection is upgraded in one go.
Signed-off-by: David Howells <dhowells@redhat.com>
Add a tracepoint (rxrpc_rx_proto) to record protocol errors in received
packets. The following changes are made:
(1) Add a function, __rxrpc_abort_eproto(), to note a protocol error on a
call and mark the call aborted. This is wrapped by
rxrpc_abort_eproto() that makes the why string usable in trace.
(2) Add trace_rxrpc_rx_proto() or rxrpc_abort_eproto() to protocol error
generation points, replacing rxrpc_abort_call() with the latter.
(3) Only send an abort packet in rxkad_verify_packet*() if we actually
managed to abort the call.
Note that a trace event is also emitted if a kernel user (e.g. afs) tries
to send data through a call when it's not in the transmission phase, though
it's not technically a receive event.
Signed-off-by: David Howells <dhowells@redhat.com>
Use negative error codes in struct rxrpc_call::error because that's what
the kernel normally deals with and to make the code consistent. We only
turn them positive when transcribing into a cmsg for userspace recvmsg.
Signed-off-by: David Howells <dhowells@redhat.com>
If we receive a BUSY packet for a call we think we've just completed, the
packet is handed off to the connection processor to deal with - but the
connection processor doesn't expect a BUSY packet and so flags a protocol
error.
Fix this by simply ignoring the BUSY packet for the moment.
The symptom of this may appear as a system call failing with EPROTO. This
may be triggered by pressing ctrl-C under some circumstances.
This comes about we abort calls due to interruption by a signal (which we
shouldn't do, but that's going to be a large fix and mostly in fs/afs/).
What happens is that we abort the call and may also abort follow up calls
too (this needs offloading somehoe). So we see a transmission of something
like the following sequence of packets:
DATA for call N
ABORT call N
DATA for call N+1
ABORT call N+1
in very quick succession on the same channel. However, the peer may have
deferred the processing of the ABORT from the call N to a background thread
and thus sees the DATA message from the call N+1 coming in before it has
cleared the channel. Thus it sends a BUSY packet[*].
[*] Note that some implementations (OpenAFS, for example) mark the BUSY
packet with one plus the callNumber of the call prior to call N.
Ordinarily, this would be call N, but there's no requirement for the
calls on a channel to be numbered strictly sequentially (the number is
required to increase).
This is wrong and means that the callNumber in the BUSY packet should
be ignored (it really ought to be N+1 since that's what it's in
response to).
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement RxRPC slow-start, which is similar to RFC 5681 for TCP. A
tracepoint is added to log the state of the congestion management algorithm
and the decisions it makes.
Notes:
(1) Since we send fixed-size DATA packets (apart from the final packet in
each phase), counters and calculations are in terms of packets rather
than bytes.
(2) The ACK packet carries the equivalent of TCP SACK.
(3) The FLIGHT_SIZE calculation in RFC 5681 doesn't seem particularly
suited to SACK of a small number of packets. It seems that, almost
inevitably, by the time three 'duplicate' ACKs have been seen, we have
narrowed the loss down to one or two missing packets, and the
FLIGHT_SIZE calculation ends up as 2.
(4) In rxrpc_resend(), if there was no data that apparently needed
retransmission, we transmit a PING ACK to ask the peer to tell us what
its Rx window state is.
Signed-off-by: David Howells <dhowells@redhat.com>
Add a tracepoint to log transmission of DATA packets (including loss
injection).
Adjust the ACK transmission tracepoint to include the packet serial number
and to line this up with the DATA transmission display.
Signed-off-by: David Howells <dhowells@redhat.com>