Merge branch 'net-ReST-part-two'

Mauro Carvalho Chehab says:

====================
net: manually convert files to ReST format - part 2

That's the second part of my work to convert the networking
text files into ReST. it is based on today's linux-next (next-20200430).

The full series (including those ones) are at:

	https://git.linuxtv.org/mchehab/experimental.git/log/?h=net-docs

I should be sending the remaining patches (another /38 series)
after getting those merged at -next.

The documents, converted to HTML via the building system are at:

	https://www.infradead.org/~mchehab/kernel_docs/networking/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
commit 07f81727c1
Documentation
admin-guide
filesystems
networking
bonding.rst
can.rst
checksum-offloads.rst
index.rst
ip-sysctl.rst
l2tp.rst
lapb-module.rst
ltpc.rst
mac80211-injection.rst
mpls-sysctl.rst
multiqueue.rst
netconsole.rst
netdev-features.rst
netdevices.rst
netfilter-sysctl.rst
netif-msg.rst
netif-msg.txt
nf_conntrack-sysctl.rst
nf_flowtable.rst
openvswitch.rst
operstates.rst
packet_mmap.rst
packet_mmap.txt
phonet.rst
pktgen.rst
plip.rst
ppp_generic.rst
proc_net_tcp.rst
radiotap-headers.rst
ray_cs.rst
rds.rst
regulatory.rst
rxrpc.rst
sctp.rst
secid.rst
seg6-sysctl.rst
seg6-sysctl.txt
skfp.rst
strparser.rst
switchdev.rst
tc-actions-env-rules.rst
tc-actions-env-rules.txt
tcp-thin.rst
team.rst
timestamping.rst
tproxy.rst
drivers
include
net
samples/pktgen
@@ -638,7 +638,7 @@
 See Documentation/admin-guide/serial-console.rst for more
 information. See
-Documentation/networking/netconsole.txt for an
+Documentation/networking/netconsole.rst for an
 alternative.

 uart[8250],io,<addr>[,options]

@@ -54,7 +54,7 @@ You will need to create a new device to use ``/dev/console``. The official
 ``/dev/console`` is now character device 5,1.

 (You can also use a network device as a console. See
-``Documentation/networking/netconsole.txt`` for information on that.)
+``Documentation/networking/netconsole.rst`` for information on that.)

 Here's an example that will use ``/dev/ttyS1`` (COM2) as the console.
 Replace the sample values as needed.

@@ -70,7 +70,7 @@ list of volume location server IP addresses::
 The first module is the AF_RXRPC network protocol driver. This provides the
 RxRPC remote operation protocol and may also be accessed from userspace. See:

-Documentation/networking/rxrpc.txt
+Documentation/networking/rxrpc.rst

 The second module is the kerberos RxRPC security driver, and the third module
 is the actual filesystem driver for the AFS filesystem.

@@ -1639,7 +1639,7 @@ can safely be sent over either interface. Such configurations may be achieved
 using the traffic control utilities inherent in linux.

 By default the bonding driver is multiqueue aware and 16 queues are created
-when the driver initializes (see Documentation/networking/multiqueue.txt
+when the driver initializes (see Documentation/networking/multiqueue.rst
 for details). If more or less queues are desired the module parameter
 tx_queues can be used to change this value. There is no sysfs parameter
 available as the allocation is done at module init time.

@@ -1058,7 +1058,7 @@ drivers you mainly have to deal with:
 - TX: Put the CAN frame from the socket buffer to the CAN controller.
 - RX: Put the CAN frame from the CAN controller to the socket buffer.

-See e.g. at Documentation/networking/netdevices.txt . The differences
+See e.g. at Documentation/networking/netdevices.rst . The differences
 for writing CAN network device driver are described below:



@@ -59,7 +59,7 @@ recomputed for each resulting segment. See the skbuff.h comment (section 'E')
 for more details.

 A driver declares its offload capabilities in netdev->hw_features; see
-Documentation/networking/netdev-features.txt for more. Note that a device
+Documentation/networking/netdev-features.rst for more. Note that a device
 which only advertises NETIF_F_IP[V6]_CSUM must still obey the csum_start and
 csum_offset given in the SKB; if it tries to deduce these itself in hardware
 (as some NICs do) the driver should check that the values in the SKB match

@@ -74,6 +74,43 @@ Contents:
    ipvlan
    ipvs-sysctl
    kcm
+   l2tp
+   lapb-module
+   ltpc
+   mac80211-injection
+   mpls-sysctl
+   multiqueue
+   netconsole
+   netdev-features
+   netdevices
+   netfilter-sysctl
+   netif-msg
+   nf_conntrack-sysctl
+   nf_flowtable
+   openvswitch
+   operstates
+   packet_mmap
+   phonet
+   pktgen
+   plip
+   ppp_generic
+   proc_net_tcp
+   radiotap-headers
+   ray_cs
+   rds
+   regulatory
+   rxrpc
+   sctp
+   secid
+   seg6-sysctl
+   skfp
+   strparser
+   switchdev
+   tc-actions-env-rules
+   tcp-thin
+   team
+   timestamping
+   tproxy

 .. only:: subproject and html


@@ -886,7 +886,7 @@ tcp_thin_linear_timeouts - BOOLEAN
 initiated. This improves retransmission latency for
 non-aggressive thin streams, often found to be time-dependent.
 For more information on thin streams, see
-Documentation/networking/tcp-thin.txt
+Documentation/networking/tcp-thin.rst

 Default: 0


|
@ -1,3 +1,9 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
====
|
||||
L2TP
|
||||
====
|
||||
|
||||
This document describes how to use the kernel's L2TP drivers to
|
||||
provide L2TP functionality. L2TP is a protocol that tunnels one or
|
||||
more sessions over an IP tunnel. It is commonly used for VPNs
|
||||
@@ -121,14 +127,16 @@ Userspace may control behavior of the tunnel or session using
 setsockopt and ioctl on the PPPoX socket. The following socket
 options are supported:-

-DEBUG     - bitmask of debug message categories. See below.
-SENDSEQ   - 0 => don't send packets with sequence numbers
-            1 => send packets with sequence numbers
-RECVSEQ   - 0 => receive packet sequence numbers are optional
-            1 => drop receive packets without sequence numbers
-LNSMODE   - 0 => act as LAC.
-            1 => act as LNS.
-REORDERTO - reorder timeout (in millisecs). If 0, don't try to reorder.
+=========   ===========================================================
+DEBUG       bitmask of debug message categories. See below.
+SENDSEQ     - 0 => don't send packets with sequence numbers
+            - 1 => send packets with sequence numbers
+RECVSEQ     - 0 => receive packet sequence numbers are optional
+            - 1 => drop receive packets without sequence numbers
+LNSMODE     - 0 => act as LAC.
+            - 1 => act as LNS.
+REORDERTO   reorder timeout (in millisecs). If 0, don't try to reorder.
+=========   ===========================================================

 Only the DEBUG option is supported by the special tunnel management
 PPPoX socket.

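A minimal sketch of setting one of the socket options listed above from userspace. It assumes the option names map to the ``PPPOL2TP_SO_*`` constants in ``<linux/if_pppol2tp.h>`` and that the option level is ``SOL_PPPOL2TP``; neither is stated in this hunk. ``l2tp_set_option()`` is a hypothetical helper name used only for illustration.

```c
#include <errno.h>
#include <sys/socket.h>
#include <linux/if_pppol2tp.h>	/* PPPOL2TP_SO_* option names (assumed mapping) */

#ifndef SOL_PPPOL2TP
#define SOL_PPPOL2TP 273	/* option level; fallback if libc does not define it */
#endif

/*
 * Hypothetical helper: set one integer-valued L2TP option on a PPPoX
 * socket. Returns 0 on success or a negative errno value on failure.
 */
static int l2tp_set_option(int fd, int optname, int value)
{
	if (setsockopt(fd, SOL_PPPOL2TP, optname, &value, sizeof(value)) < 0)
		return -errno;
	return 0;
}
```

On a connected session socket, ``l2tp_set_option(session_fd, PPPOL2TP_SO_SENDSEQ, 1)`` would request sequence numbers on transmitted packets. Per the text above, only the DEBUG option is expected to work on the tunnel management socket.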
|
@ -177,20 +185,22 @@ setsockopt on the PPPoX socket to set a debug mask.
|
|||
|
||||
The following debug mask bits are available:
|
||||
|
||||
================ ==============================
|
||||
L2TP_MSG_DEBUG verbose debug (if compiled in)
|
||||
L2TP_MSG_CONTROL userspace - kernel interface
|
||||
L2TP_MSG_SEQ sequence numbers handling
|
||||
L2TP_MSG_DATA data packets
|
||||
================ ==============================
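The mask bits in the table above combine with a bitwise OR. The sketch below assumes the ``L2TP_MSG_*`` names are available from ``<linux/l2tp.h>``; the two helper functions are hypothetical and exist only to show how a mask might be built and tested.

```c
#include <sys/socket.h>
#include <linux/l2tp.h>	/* L2TP_MSG_* debug categories (assumed location) */

/* Hypothetical helper: log control-interface and sequence-number events only. */
static unsigned int l2tp_debug_mask_control_seq(void)
{
	return L2TP_MSG_CONTROL | L2TP_MSG_SEQ;
}

/* Hypothetical helper: nonzero if the given category bit is set in mask. */
static int l2tp_debug_enabled(unsigned int mask, unsigned int category)
{
	return (mask & category) != 0;
}
```

The resulting value is what would be passed as the DEBUG socket option on the PPPoX socket.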
|
||||
|
||||
If enabled, files under a l2tp debugfs directory can be used to dump
|
||||
kernel state about L2TP tunnels and sessions. To access it, the
|
||||
debugfs filesystem must first be mounted.
|
||||
debugfs filesystem must first be mounted::
|
||||
|
||||
# mount -t debugfs debugfs /debug
|
||||
# mount -t debugfs debugfs /debug
|
||||
|
||||
Files under the l2tp directory can then be accessed.
|
||||
Files under the l2tp directory can then be accessed::
|
||||
|
||||
# cat /debug/l2tp/tunnels
|
||||
# cat /debug/l2tp/tunnels
|
||||
|
||||
The debugfs files should not be used by applications to obtain L2TP
|
||||
state information because the file format is subject to change. It is
|
||||
|
@ -211,14 +221,14 @@ iproute2's ip utility to support this.
|
|||
|
||||
To create an L2TPv3 ethernet pseudowire between local host 192.168.1.1
|
||||
and peer 192.168.1.2, using IP addresses 10.5.1.1 and 10.5.1.2 for the
|
||||
tunnel endpoints:-
|
||||
tunnel endpoints::
|
||||
|
||||
# ip l2tp add tunnel tunnel_id 1 peer_tunnel_id 1 udp_sport 5000 \
|
||||
udp_dport 5000 encap udp local 192.168.1.1 remote 192.168.1.2
|
||||
# ip l2tp add session tunnel_id 1 session_id 1 peer_session_id 1
|
||||
# ip -s -d show dev l2tpeth0
|
||||
# ip addr add 10.5.1.2/32 peer 10.5.1.1/32 dev l2tpeth0
|
||||
# ip li set dev l2tpeth0 up
|
||||
# ip l2tp add tunnel tunnel_id 1 peer_tunnel_id 1 udp_sport 5000 \
|
||||
udp_dport 5000 encap udp local 192.168.1.1 remote 192.168.1.2
|
||||
# ip l2tp add session tunnel_id 1 session_id 1 peer_session_id 1
|
||||
# ip -s -d show dev l2tpeth0
|
||||
# ip addr add 10.5.1.2/32 peer 10.5.1.1/32 dev l2tpeth0
|
||||
# ip li set dev l2tpeth0 up
|
||||
|
||||
Choose IP addresses to be the address of a local IP interface and that
|
||||
of the remote system. The IP addresses of the l2tpeth0 interface can be
|
||||
|
@ -228,75 +238,78 @@ Repeat the above at the peer, with ports, tunnel/session ids and IP
|
|||
addresses reversed. The tunnel and session IDs can be any non-zero
|
||||
32-bit number, but the values must be reversed at the peer.
|
||||
|
||||
======================== ===================
|
||||
Host 1 Host2
|
||||
======================== ===================
|
||||
udp_sport=5000 udp_sport=5001
|
||||
udp_dport=5001 udp_dport=5000
|
||||
tunnel_id=42 tunnel_id=45
|
||||
peer_tunnel_id=45 peer_tunnel_id=42
|
||||
session_id=128 session_id=5196755
|
||||
peer_session_id=5196755 peer_session_id=128
|
||||
======================== ===================
|
||||
|
||||
When done at both ends of the tunnel, it should be possible to send
|
||||
data over the network. e.g.
|
||||
data over the network. e.g.::
|
||||
|
||||
# ping 10.5.1.1
|
||||
# ping 10.5.1.1
|
||||
|
||||
|
||||
Sample Userspace Code
|
||||
=====================
|
||||
|
||||
1. Create tunnel management PPPoX socket
|
||||
1. Create tunnel management PPPoX socket::
|
||||
|
||||
kernel_fd = socket(AF_PPPOX, SOCK_DGRAM, PX_PROTO_OL2TP);
|
||||
if (kernel_fd >= 0) {
|
||||
struct sockaddr_pppol2tp sax;
|
||||
struct sockaddr_in const *peer_addr;
|
||||
kernel_fd = socket(AF_PPPOX, SOCK_DGRAM, PX_PROTO_OL2TP);
|
||||
if (kernel_fd >= 0) {
|
||||
struct sockaddr_pppol2tp sax;
|
||||
struct sockaddr_in const *peer_addr;
|
||||
|
||||
peer_addr = l2tp_tunnel_get_peer_addr(tunnel);
|
||||
memset(&sax, 0, sizeof(sax));
|
||||
sax.sa_family = AF_PPPOX;
|
||||
sax.sa_protocol = PX_PROTO_OL2TP;
|
||||
sax.pppol2tp.fd = udp_fd; /* fd of tunnel UDP socket */
|
||||
sax.pppol2tp.addr.sin_addr.s_addr = peer_addr->sin_addr.s_addr;
|
||||
sax.pppol2tp.addr.sin_port = peer_addr->sin_port;
|
||||
sax.pppol2tp.addr.sin_family = AF_INET;
|
||||
sax.pppol2tp.s_tunnel = tunnel_id;
|
||||
sax.pppol2tp.s_session = 0; /* special case: mgmt socket */
|
||||
sax.pppol2tp.d_tunnel = 0;
|
||||
sax.pppol2tp.d_session = 0; /* special case: mgmt socket */
|
||||
peer_addr = l2tp_tunnel_get_peer_addr(tunnel);
|
||||
memset(&sax, 0, sizeof(sax));
|
||||
sax.sa_family = AF_PPPOX;
|
||||
sax.sa_protocol = PX_PROTO_OL2TP;
|
||||
sax.pppol2tp.fd = udp_fd; /* fd of tunnel UDP socket */
|
||||
sax.pppol2tp.addr.sin_addr.s_addr = peer_addr->sin_addr.s_addr;
|
||||
sax.pppol2tp.addr.sin_port = peer_addr->sin_port;
|
||||
sax.pppol2tp.addr.sin_family = AF_INET;
|
||||
sax.pppol2tp.s_tunnel = tunnel_id;
|
||||
sax.pppol2tp.s_session = 0; /* special case: mgmt socket */
|
||||
sax.pppol2tp.d_tunnel = 0;
|
||||
sax.pppol2tp.d_session = 0; /* special case: mgmt socket */
|
||||
|
||||
if(connect(kernel_fd, (struct sockaddr *)&sax, sizeof(sax) ) < 0 ) {
|
||||
perror("connect failed");
|
||||
result = -errno;
|
||||
goto err;
|
||||
}
|
||||
}
|
||||
if(connect(kernel_fd, (struct sockaddr *)&sax, sizeof(sax) ) < 0 ) {
|
||||
perror("connect failed");
|
||||
result = -errno;
|
||||
goto err;
|
||||
}
|
||||
}
|
||||
|
||||
2. Create session PPPoX data socket
|
||||
2. Create session PPPoX data socket::
|
||||
|
||||
struct sockaddr_pppol2tp sax;
|
||||
int fd;
|
||||
struct sockaddr_pppol2tp sax;
|
||||
int fd;
|
||||
|
||||
/* Note, the target socket must be bound already, else it will not be ready */
|
||||
sax.sa_family = AF_PPPOX;
|
||||
sax.sa_protocol = PX_PROTO_OL2TP;
|
||||
sax.pppol2tp.fd = tunnel_fd;
|
||||
sax.pppol2tp.addr.sin_addr.s_addr = addr->sin_addr.s_addr;
|
||||
sax.pppol2tp.addr.sin_port = addr->sin_port;
|
||||
sax.pppol2tp.addr.sin_family = AF_INET;
|
||||
sax.pppol2tp.s_tunnel = tunnel_id;
|
||||
sax.pppol2tp.s_session = session_id;
|
||||
sax.pppol2tp.d_tunnel = peer_tunnel_id;
|
||||
sax.pppol2tp.d_session = peer_session_id;
|
||||
/* Note, the target socket must be bound already, else it will not be ready */
|
||||
sax.sa_family = AF_PPPOX;
|
||||
sax.sa_protocol = PX_PROTO_OL2TP;
|
||||
sax.pppol2tp.fd = tunnel_fd;
|
||||
sax.pppol2tp.addr.sin_addr.s_addr = addr->sin_addr.s_addr;
|
||||
sax.pppol2tp.addr.sin_port = addr->sin_port;
|
||||
sax.pppol2tp.addr.sin_family = AF_INET;
|
||||
sax.pppol2tp.s_tunnel = tunnel_id;
|
||||
sax.pppol2tp.s_session = session_id;
|
||||
sax.pppol2tp.d_tunnel = peer_tunnel_id;
|
||||
sax.pppol2tp.d_session = peer_session_id;
|
||||
|
||||
/* session_fd is the fd of the session's PPPoL2TP socket.
|
||||
* tunnel_fd is the fd of the tunnel UDP socket.
|
||||
*/
|
||||
fd = connect(session_fd, (struct sockaddr *)&sax, sizeof(sax));
|
||||
if (fd < 0 ) {
|
||||
return -errno;
|
||||
}
|
||||
return 0;
|
||||
/* session_fd is the fd of the session's PPPoL2TP socket.
|
||||
* tunnel_fd is the fd of the tunnel UDP socket.
|
||||
*/
|
||||
fd = connect(session_fd, (struct sockaddr *)&sax, sizeof(sax));
|
||||
if (fd < 0 ) {
|
||||
return -errno;
|
||||
}
|
||||
return 0;
|
||||
|
||||
Internal Implementation
|
||||
=======================
|
|
@ -1,8 +1,14 @@
|
|||
The Linux LAPB Module Interface 1.3
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
Jonathan Naylor 29.12.96
|
||||
===============================
|
||||
The Linux LAPB Module Interface
|
||||
===============================
|
||||
|
||||
Changed (Henner Eisen, 2000-10-29): int return value for data_indication()
|
||||
Version 1.3
|
||||
|
||||
Jonathan Naylor 29.12.96
|
||||
|
||||
Changed (Henner Eisen, 2000-10-29): int return value for data_indication()
|
||||
|
||||
The LAPB module will be a separately compiled module for use by any parts of
|
||||
the Linux operating system that require a LAPB service. This document
|
||||
|
@ -32,16 +38,16 @@ LAPB Initialisation Structure
|
|||
|
||||
This structure is used only once, in the call to lapb_register (see below).
|
||||
It contains information about the device driver that requires the services
|
||||
of the LAPB module.
|
||||
of the LAPB module::
|
||||
|
||||
struct lapb_register_struct {
|
||||
void (*connect_confirmation)(int token, int reason);
|
||||
void (*connect_indication)(int token, int reason);
|
||||
void (*disconnect_confirmation)(int token, int reason);
|
||||
void (*disconnect_indication)(int token, int reason);
|
||||
int (*data_indication)(int token, struct sk_buff *skb);
|
||||
void (*data_transmit)(int token, struct sk_buff *skb);
|
||||
};
|
||||
struct lapb_register_struct {
|
||||
void (*connect_confirmation)(int token, int reason);
|
||||
void (*connect_indication)(int token, int reason);
|
||||
void (*disconnect_confirmation)(int token, int reason);
|
||||
void (*disconnect_indication)(int token, int reason);
|
||||
int (*data_indication)(int token, struct sk_buff *skb);
|
||||
void (*data_transmit)(int token, struct sk_buff *skb);
|
||||
};
|
||||
|
||||
Each member of this structure corresponds to a function in the device driver
|
||||
that is called when a particular event in the LAPB module occurs. These will
|
||||
|
@ -54,19 +60,19 @@ LAPB Parameter Structure
|
|||
|
||||
This structure is used with the lapb_getparms and lapb_setparms functions
|
||||
(see below). They are used to allow the device driver to get and set the
|
||||
operational parameters of the LAPB implementation for a given connection.
|
||||
operational parameters of the LAPB implementation for a given connection::
|
||||
|
||||
struct lapb_parms_struct {
|
||||
unsigned int t1;
|
||||
unsigned int t1timer;
|
||||
unsigned int t2;
|
||||
unsigned int t2timer;
|
||||
unsigned int n2;
|
||||
unsigned int n2count;
|
||||
unsigned int window;
|
||||
unsigned int state;
|
||||
unsigned int mode;
|
||||
};
|
||||
struct lapb_parms_struct {
|
||||
unsigned int t1;
|
||||
unsigned int t1timer;
|
||||
unsigned int t2;
|
||||
unsigned int t2timer;
|
||||
unsigned int n2;
|
||||
unsigned int n2count;
|
||||
unsigned int window;
|
||||
unsigned int state;
|
||||
unsigned int mode;
|
||||
};
|
||||
|
||||
T1 and T2 are protocol timing parameters and are given in units of 100ms. N2
|
||||
is the maximum number of tries on the link before it is declared a failure.
|
||||
@@ -78,11 +84,14 @@ link.
 The mode variable is a bit field used for setting (at present) three values.
 The bit fields have the following meanings:

+======  =================================================
 Bit     Meaning
+======  =================================================
 0       LAPB operation (0=LAPB_STANDARD 1=LAPB_EXTENDED).
 1       [SM]LP operation (0=LAPB_SLP 1=LAPB=MLP).
 2       DTE/DCE operation (0=LAPB_DTE 1=LAPB_DCE)
 3-31    Reserved, must be 0.
+======  =================================================

 Extended LAPB operation indicates the use of extended sequence numbers and
 consequently larger window sizes, the default is standard LAPB operation.

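A minimal sketch of composing the ``mode`` field from the bit table above. The numeric values are derived from the bit positions in that table (bit 0, bit 1, bit 2) and are defined locally rather than taken from a kernel header; the MLP flag is spelled ``LAPB_MLP`` here (the table writes it as ``LAPB=MLP``). ``lapb_make_mode()`` and ``lapb_mode_valid()`` are hypothetical helpers.

```c
/* Values assumed from the bit positions in the table above. */
#define LAPB_STANDARD	0x00
#define LAPB_EXTENDED	0x01	/* bit 0 */
#define LAPB_SLP	0x00
#define LAPB_MLP	0x02	/* bit 1 */
#define LAPB_DTE	0x00
#define LAPB_DCE	0x04	/* bit 2 */
#define LAPB_MODE_RESERVED	(~0x07u)	/* bits 3-31, must be 0 */

/* Hypothetical helper: build a mode value from three yes/no choices. */
static unsigned int lapb_make_mode(int extended, int mlp, int dce)
{
	return (extended ? LAPB_EXTENDED : LAPB_STANDARD) |
	       (mlp ? LAPB_MLP : LAPB_SLP) |
	       (dce ? LAPB_DCE : LAPB_DTE);
}

/* Hypothetical helper: nonzero if no reserved bit is set. */
static int lapb_mode_valid(unsigned int mode)
{
	return (mode & LAPB_MODE_RESERVED) == 0;
}
```

For example, ``lapb_make_mode(1, 0, 1)`` selects extended operation, single link procedure and DCE, and yields ``0x05``.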
|
@ -99,8 +108,9 @@ Functions
|
|||
|
||||
The LAPB module provides a number of function entry points.
|
||||
|
||||
::
|
||||
|
||||
int lapb_register(void *token, struct lapb_register_struct);
|
||||
int lapb_register(void *token, struct lapb_register_struct);
|
||||
|
||||
This must be called before the LAPB module may be used. If the call is
|
||||
successful then LAPB_OK is returned. The token must be a unique identifier
|
||||
|
@ -111,33 +121,42 @@ For multiple LAPB links in a single device driver, multiple calls to
|
|||
lapb_register must be made. The format of the lapb_register_struct is given
|
||||
above. The return values are:
|
||||
|
||||
============= =============================
|
||||
LAPB_OK LAPB registered successfully.
|
||||
LAPB_BADTOKEN Token is already registered.
|
||||
LAPB_NOMEM Out of memory
|
||||
============= =============================
|
||||
|
||||
::
|
||||
|
||||
int lapb_unregister(void *token);
|
||||
int lapb_unregister(void *token);
|
||||
|
||||
This releases all the resources associated with a LAPB link. Any current
|
||||
LAPB link will be abandoned without further messages being passed. After
|
||||
this call, the value of token is no longer valid for any calls to the LAPB
|
||||
function. The valid return values are:
|
||||
|
||||
============= ===============================
|
||||
LAPB_OK LAPB unregistered successfully.
|
||||
LAPB_BADTOKEN Invalid/unknown LAPB token.
|
||||
============= ===============================
|
||||
|
||||
::
|
||||
|
||||
int lapb_getparms(void *token, struct lapb_parms_struct *parms);
|
||||
int lapb_getparms(void *token, struct lapb_parms_struct *parms);
|
||||
|
||||
This allows the device driver to get the values of the current LAPB
|
||||
variables, the lapb_parms_struct is described above. The valid return values
|
||||
are:
|
||||
|
||||
============= =============================
|
||||
LAPB_OK LAPB getparms was successful.
|
||||
LAPB_BADTOKEN Invalid/unknown LAPB token.
|
||||
============= =============================
|
||||
|
||||
::
|
||||
|
||||
int lapb_setparms(void *token, struct lapb_parms_struct *parms);
|
||||
int lapb_setparms(void *token, struct lapb_parms_struct *parms);
|
||||
|
||||
This allows the device driver to set the values of the current LAPB
|
||||
variables, the lapb_parms_struct is described above. The values of t1timer,
|
||||
|
@ -145,42 +164,54 @@ t2timer and n2count are ignored, likewise changing the mode bits when
|
|||
connected will be ignored. An error implies that none of the values have
|
||||
been changed. The valid return values are:
|
||||
|
||||
============= =================================================
|
||||
LAPB_OK LAPB getparms was successful.
|
||||
LAPB_BADTOKEN Invalid/unknown LAPB token.
|
||||
LAPB_INVALUE One of the values was out of its allowable range.
|
||||
============= =================================================
|
||||
|
||||
::
|
||||
|
||||
int lapb_connect_request(void *token);
|
||||
int lapb_connect_request(void *token);
|
||||
|
||||
Initiate a connect using the current parameter settings. The valid return
|
||||
values are:
|
||||
|
||||
============== =================================
|
||||
LAPB_OK LAPB is starting to connect.
|
||||
LAPB_BADTOKEN Invalid/unknown LAPB token.
|
||||
LAPB_CONNECTED LAPB module is already connected.
|
||||
============== =================================
|
||||
|
||||
::
|
||||
|
||||
int lapb_disconnect_request(void *token);
|
||||
int lapb_disconnect_request(void *token);
|
||||
|
||||
Initiate a disconnect. The valid return values are:
|
||||
|
||||
================= ===============================
|
||||
LAPB_OK LAPB is starting to disconnect.
|
||||
LAPB_BADTOKEN Invalid/unknown LAPB token.
|
||||
LAPB_NOTCONNECTED LAPB module is not connected.
|
||||
================= ===============================
|
||||
|
||||
::
|
||||
|
||||
int lapb_data_request(void *token, struct sk_buff *skb);
|
||||
int lapb_data_request(void *token, struct sk_buff *skb);
|
||||
|
||||
Queue data with the LAPB module for transmitting over the link. If the call
|
||||
is successful then the skbuff is owned by the LAPB module and may not be
|
||||
used by the device driver again. The valid return values are:
|
||||
|
||||
================= =============================
|
||||
LAPB_OK LAPB has accepted the data.
|
||||
LAPB_BADTOKEN Invalid/unknown LAPB token.
|
||||
LAPB_NOTCONNECTED LAPB module is not connected.
|
||||
================= =============================
|
||||
|
||||
::
|
||||
|
||||
int lapb_data_received(void *token, struct sk_buff *skb);
|
||||
int lapb_data_received(void *token, struct sk_buff *skb);
|
||||
|
||||
Queue data with the LAPB module which has been received from the device. It
|
||||
is expected that the data passed to the LAPB module has skb->data pointing
|
||||
|
@ -188,9 +219,10 @@ to the beginning of the LAPB data. If the call is successful then the skbuff
|
|||
is owned by the LAPB module and may not be used by the device driver again.
|
||||
The valid return values are:
|
||||
|
||||
============= ===========================
|
||||
LAPB_OK LAPB has accepted the data.
|
||||
LAPB_BADTOKEN Invalid/unknown LAPB token.
|
||||
|
||||
============= ===========================
|
||||
|
||||
Callbacks
|
||||
---------
|
||||
|
@ -200,49 +232,58 @@ module to call when an event occurs. They are registered with the LAPB
|
|||
module with lapb_register (see above) in the structure lapb_register_struct
|
||||
(see above).
|
||||
|
||||
::
|
||||
|
||||
void (*connect_confirmation)(void *token, int reason);
|
||||
void (*connect_confirmation)(void *token, int reason);
|
||||
|
||||
This is called by the LAPB module when a connection is established after
|
||||
being requested by a call to lapb_connect_request (see above). The reason is
|
||||
always LAPB_OK.
|
||||
|
||||
::
|
||||
|
||||
void (*connect_indication)(void *token, int reason);
|
||||
void (*connect_indication)(void *token, int reason);
|
||||
|
||||
This is called by the LAPB module when the link is established by the remote
|
||||
system. The value of reason is always LAPB_OK.
|
||||
|
||||
::
|
||||
|
||||
void (*disconnect_confirmation)(void *token, int reason);
|
||||
void (*disconnect_confirmation)(void *token, int reason);
|
||||
|
||||
This is called by the LAPB module when an event occurs after the device
|
||||
driver has called lapb_disconnect_request (see above). The reason indicates
|
||||
what has happened. In all cases the LAPB link can be regarded as being
|
||||
terminated. The values for reason are:
|
||||
|
||||
================= ====================================================
|
||||
LAPB_OK The LAPB link was terminated normally.
|
||||
LAPB_NOTCONNECTED The remote system was not connected.
|
||||
LAPB_TIMEDOUT No response was received in N2 tries from the remote
|
||||
system.
|
||||
================= ====================================================
|
||||
|
||||
::
|
||||
|
||||
void (*disconnect_indication)(void *token, int reason);
|
||||
void (*disconnect_indication)(void *token, int reason);
|
||||
|
||||
This is called by the LAPB module when the link is terminated by the remote
|
||||
system or another event has occurred to terminate the link. This may be
|
||||
returned in response to a lapb_connect_request (see above) if the remote
|
||||
system refused the request. The values for reason are:
|
||||
|
||||
================= ====================================================
|
||||
LAPB_OK The LAPB link was terminated normally by the remote
|
||||
system.
|
||||
LAPB_REFUSED The remote system refused the connect request.
|
||||
LAPB_NOTCONNECTED The remote system was not connected.
|
||||
LAPB_TIMEDOUT No response was received in N2 tries from the remote
|
||||
system.
|
||||
================= ====================================================
|
||||
|
||||
::
|
||||
|
||||
int (*data_indication)(void *token, struct sk_buff *skb);
|
||||
int (*data_indication)(void *token, struct sk_buff *skb);
|
||||
|
||||
This is called by the LAPB module when data has been received from the
|
||||
remote system that should be passed onto the next layer in the protocol
|
||||
|
@ -254,8 +295,9 @@ This method should return NET_RX_DROP (as defined in the header
|
|||
file include/linux/netdevice.h) if and only if the frame was dropped
|
||||
before it could be delivered to the upper layer.
|
||||
|
||||
::
|
||||
|
||||
void (*data_transmit)(void *token, struct sk_buff *skb);
|
||||
void (*data_transmit)(void *token, struct sk_buff *skb);
|
||||
|
||||
This is called by the LAPB module when data is to be transmitted to the
|
||||
remote system by the device driver. The skbuff becomes the property of the
|
|
@ -1,3 +1,9 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
===========
|
||||
LTPC Driver
|
||||
===========
|
||||
|
||||
This is the ALPHA version of the ltpc driver.
|
||||
|
||||
In order to use it, you will need at least version 1.3.3 of the
|
||||
|
@ -15,7 +21,7 @@ yourself. (see "Card Configuration" below for how to determine or
|
|||
change the settings on your card)
|
||||
|
||||
When the driver is compiled into the kernel, you can add a line such
|
||||
as the following to your /etc/lilo.conf:
|
||||
as the following to your /etc/lilo.conf::
|
||||
|
||||
append="ltpc=0x240,9,1"
|
||||
|
||||
|
@ -25,13 +31,13 @@ the driver will try to determine them itself.
|
|||
|
||||
If you load the driver as a module, you can pass the parameters "io=",
|
||||
"irq=", and "dma=" on the command line with insmod or modprobe, or add
|
||||
them as options in a configuration file in /etc/modprobe.d/ directory:
|
||||
them as options in a configuration file in /etc/modprobe.d/ directory::
|
||||
|
||||
alias lt0 ltpc # autoload the module when the interface is configured
|
||||
options ltpc io=0x240 irq=9 dma=1
|
||||
|
||||
Before starting up the netatalk demons (perhaps in rc.local), you
|
||||
need to add a line such as:
|
||||
need to add a line such as::
|
||||
|
||||
/sbin/ifconfig lt0 127.0.0.42
|
||||
|
||||
|
@ -42,7 +48,7 @@ The appropriate netatalk configuration depends on whether you are
|
|||
attached to a network that includes AppleTalk routers or not. If,
|
||||
like me, you are simply connecting to your home Macintoshes and
|
||||
printers, you need to set up netatalk to "seed". The way I do this
|
||||
is to have the lines
|
||||
is to have the lines::
|
||||
|
||||
dummy -seed -phase 2 -net 2000 -addr 2000.26 -zone "1033"
|
||||
lt0 -seed -phase 1 -net 1033 -addr 1033.27 -zone "1033"
|
||||
|
@ -57,13 +63,13 @@ such.
|
|||
|
||||
If you are attached to an extended AppleTalk network, with routers on
|
||||
it, then you don't need to fool around with this -- the appropriate
|
||||
line in atalkd.conf is
|
||||
line in atalkd.conf is::
|
||||
|
||||
lt0 -phase 1
|
||||
|
||||
--------------------------------------
|
||||
|
||||
Card Configuration:
|
||||
Card Configuration
|
||||
==================
|
||||
|
||||
The interrupts and so forth are configured via the dipswitch on the
|
||||
board. Set the switches so as not to conflict with other hardware.
|
||||
|
@ -73,26 +79,32 @@ board. Set the switches so as not to conflict with other hardware.
|
|||
original documentation refers to IRQ2. Since you'll be running
|
||||
this on an AT (or later) class machine, that really means IRQ9.
|
||||
|
||||
=== ===========================================================
|
||||
SW1 IRQ 4
|
||||
SW2 IRQ 3
|
||||
SW3 IRQ 9 (2 in original card documentation only applies to XT)
|
||||
=== ===========================================================
|
||||
|
||||
|
||||
DMA -- choose DMA 1 or 3, and set both corresponding switches.
|
||||
|
||||
=== =====
|
||||
SW4 DMA 3
|
||||
SW5 DMA 1
|
||||
SW6 DMA 3
|
||||
SW7 DMA 1
|
||||
=== =====
|
||||
|
||||
|
||||
I/O address -- choose one.
|
||||
|
||||
=== =========
|
||||
SW8 220 / 240
|
||||
=== =========
|
||||
|
||||
--------------------------------------
|
||||
|
||||
IP:
|
||||
IP
|
||||
==
|
||||
|
||||
Yes, it is possible to do IP over LocalTalk. However, you can't just
|
||||
treat the LocalTalk device like an ordinary Ethernet device, even if
|
||||
|
@ -102,9 +114,9 @@ Instead, you follow the same procedure as for doing IP in EtherTalk.
|
|||
See Documentation/networking/ipddp.rst for more information about the
|
||||
kernel driver and userspace tools needed.
|
||||
|
||||
--------------------------------------
|
||||
|
||||
BUGS:
|
||||
Bugs
|
||||
====
|
||||
|
||||
IRQ autoprobing often doesn't work on a cold boot. To get around
|
||||
this, either compile the driver as a module, or pass the parameters
|
||||
|
@ -120,12 +132,13 @@ It may theoretically be possible to use two LTPC cards in the same
|
|||
machine, but this is unsupported, so if you really want to do this,
|
||||
you'll probably have to hack the initialization code a bit.
|
||||
|
||||
______________________________________
|
||||
|
||||
THANKS:
|
||||
Thanks to Alan Cox for helpful discussions early on in this
|
||||
Thanks
|
||||
======
|
||||
|
||||
Thanks to Alan Cox for helpful discussions early on in this
|
||||
work, and to Denis Hainsworth for doing the bleeding-edge testing.
|
||||
|
||||
-- Bradford Johnson <bradford@math.umn.edu>
|
||||
Bradford Johnson <bradford@math.umn.edu>
|
||||
|
||||
-- Updated 11/09/1998 by David Huggins-Daines <dhd@debian.org>
|
||||
Updated 11/09/1998 by David Huggins-Daines <dhd@debian.org>
|
|
@ -1,16 +1,19 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
=========================================
|
||||
How to use packet injection with mac80211
|
||||
=========================================
|
||||
|
||||
mac80211 now allows arbitrary packets to be injected down any Monitor Mode
|
||||
interface from userland. The packet you inject needs to be composed in the
|
||||
following format:
|
||||
following format::
|
||||
|
||||
[ radiotap header ]
|
||||
[ ieee80211 header ]
|
||||
[ payload ]
|
||||
|
||||
The radiotap format is discussed in
|
||||
./Documentation/networking/radiotap-headers.txt.
|
||||
./Documentation/networking/radiotap-headers.rst.
|
||||
|
||||
Despite many radiotap parameters being currently defined, most only make sense
|
||||
to appear on received packets. The following information is parsed from the
|
||||
|
@ -18,15 +21,19 @@ radiotap headers and used to control injection:
|
|||
|
||||
* IEEE80211_RADIOTAP_FLAGS
|
||||
|
||||
IEEE80211_RADIOTAP_F_FCS: FCS will be removed and recalculated
|
||||
IEEE80211_RADIOTAP_F_WEP: frame will be encrypted if key available
|
||||
IEEE80211_RADIOTAP_F_FRAG: frame will be fragmented if longer than the
|
||||
========================= ===========================================
|
||||
IEEE80211_RADIOTAP_F_FCS FCS will be removed and recalculated
|
||||
IEEE80211_RADIOTAP_F_WEP frame will be encrypted if key available
|
||||
IEEE80211_RADIOTAP_F_FRAG frame will be fragmented if longer than the
|
||||
current fragmentation threshold.
|
||||
========================= ===========================================
|
||||
|
||||
* IEEE80211_RADIOTAP_TX_FLAGS
|
||||
|
||||
IEEE80211_RADIOTAP_F_TX_NOACK: frame should be sent without waiting for
|
||||
============================= ========================================
|
||||
IEEE80211_RADIOTAP_F_TX_NOACK frame should be sent without waiting for
|
||||
an ACK even if it is a unicast frame
|
||||
============================= ========================================
|
||||
|
||||
* IEEE80211_RADIOTAP_RATE
|
||||
|
||||
|
@ -37,8 +44,10 @@ radiotap headers and used to control injection:
|
|||
HT rate for the transmission (only for devices without own rate control).
|
||||
Also some flags are parsed
|
||||
|
||||
IEEE80211_RADIOTAP_MCS_SGI: use short guard interval
|
||||
IEEE80211_RADIOTAP_MCS_BW_40: send in HT40 mode
|
||||
============================ ========================
|
||||
IEEE80211_RADIOTAP_MCS_SGI use short guard interval
|
||||
IEEE80211_RADIOTAP_MCS_BW_40 send in HT40 mode
|
||||
============================ ========================
|
||||
|
||||
* IEEE80211_RADIOTAP_DATA_RETRIES
|
||||
|
||||
|
@ -51,17 +60,17 @@ radiotap headers and used to control injection:
|
|||
without own rate control). Also other fields are parsed
|
||||
|
||||
flags field
|
||||
IEEE80211_RADIOTAP_VHT_FLAG_SGI: use short guard interval
|
||||
IEEE80211_RADIOTAP_VHT_FLAG_SGI: use short guard interval
|
||||
|
||||
bandwidth field
|
||||
1: send using 40MHz channel width
|
||||
4: send using 80MHz channel width
|
||||
11: send using 160MHz channel width
|
||||
* 1: send using 40MHz channel width
|
||||
* 4: send using 80MHz channel width
|
||||
* 11: send using 160MHz channel width
|
||||
|
||||
The injection code can also skip all other currently defined radiotap fields
|
||||
facilitating replay of captured radiotap headers directly.
|
||||
|
||||
Here is an example valid radiotap header defining some parameters
|
||||
Here is an example valid radiotap header defining some parameters::
|
||||
|
||||
0x00, 0x00, // <-- radiotap version
|
||||
0x0b, 0x00, // <- radiotap header length
|
||||
|
@ -71,7 +80,7 @@ Here is an example valid radiotap header defining some parameters
|
|||
0x01 //<-- antenna
|
||||
|
||||
The ieee80211 header follows immediately afterwards, looking for example like
|
||||
this:
|
||||
this::
|
||||
|
||||
0x08, 0x01, 0x00, 0x00,
|
||||
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
|
||||
|
@ -84,10 +93,10 @@ Then lastly there is the payload.
|
|||
After composing the packet contents, it is sent by send()-ing it to a logical
|
||||
mac80211 interface that is in Monitor mode. Libpcap can also be used,
|
||||
(which is easier than doing the work to bind the socket to the right
|
||||
interface), along the following lines:
|
||||
interface), along the following lines::
|
||||
|
||||
ppcap = pcap_open_live(szInterfaceName, 800, 1, 20, szErrbuf);
|
||||
...
|
||||
...
|
||||
r = pcap_inject(ppcap, u8aSendBuffer, nLength);
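The three-part layout described above (radiotap header, ieee80211 header, payload) can be sketched as a byte buffer. This is a minimal illustration, not the example from the document: it assumes a radiotap header with an empty "present" bitmap (so only the fixed 8-byte preamble), a 24-byte ieee80211 data header starting with the same ``0x08, 0x01`` frame control bytes shown above, and hypothetical MAC addresses and payload. The helper names are made up for this sketch.

```python
import struct


def radiotap_header() -> bytes:
    """Fixed radiotap preamble with no optional fields (assumption)."""
    # version (u8), pad (u8), total header length (le16), present bitmap (le32)
    return struct.pack("<BBHI", 0, 0, 8, 0)


def ieee80211_header(addr1: bytes, addr2: bytes, addr3: bytes) -> bytes:
    """24-byte data frame header; all addresses are supplied by the caller."""
    frame_control = bytes([0x08, 0x01])  # same first two bytes as the example
    duration = bytes(2)                  # left at zero in this sketch
    sequence_control = bytes(2)          # left at zero in this sketch
    return frame_control + duration + addr1 + addr2 + addr3 + sequence_control


def build_packet(payload: bytes) -> bytes:
    """Concatenate radiotap header, ieee80211 header and payload."""
    broadcast = bytes([0xFF] * 6)
    # Hypothetical locally administered address, for illustration only.
    local = bytes([0x02, 0x00, 0x00, 0x00, 0x00, 0x01])
    return radiotap_header() + ieee80211_header(broadcast, local, local) + payload


packet = build_packet(b"hello")
print(len(packet), packet.hex())
```

The resulting buffer is what would be handed to send() or pcap_inject(); whether a given driver accepts it depends on the hardware and is not something this sketch checks.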
|
||||
|
||||
You can also find a link to a complete inject application here:
|
|
@ -1,4 +1,11 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
====================
|
||||
MPLS Sysfs variables
|
||||
====================
|
||||
|
||||
/proc/sys/net/mpls/* Variables:
|
||||
===============================
|
||||
|
||||
platform_labels - INTEGER
|
||||
Number of entries in the platform label table. It is not
|
||||
|
@ -17,6 +24,7 @@ platform_labels - INTEGER
|
|||
no longer fit in the table.
|
||||
|
||||
Possible values: 0 - 1048575
|
||||
|
||||
Default: 0
|
||||
|
||||
ip_ttl_propagate - BOOL
|
||||
|
@ -27,8 +35,8 @@ ip_ttl_propagate - BOOL
|
|||
If disabled, the MPLS transport network will appear as a
|
||||
single hop to transit traffic.
|
||||
|
||||
0 - disabled / RFC 3443 [Short] Pipe Model
|
||||
1 - enabled / RFC 3443 Uniform Model (default)
|
||||
* 0 - disabled / RFC 3443 [Short] Pipe Model
|
||||
* 1 - enabled / RFC 3443 Uniform Model (default)
|
||||
|
||||
default_ttl - INTEGER
|
||||
Default TTL value to use for MPLS packets where it cannot be
|
||||
|
@ -36,6 +44,7 @@ default_ttl - INTEGER
|
|||
or ip_ttl_propagate has been disabled.
|
||||
|
||||
Possible values: 1 - 255
|
||||
|
||||
Default: 255
|
||||
|
||||
conf/<interface>/input - BOOL
|
||||
|
@ -44,5 +53,5 @@ conf/<interface>/input - BOOL
|
|||
If disabled, packets will be discarded without further
|
||||
processing.
|
||||
|
||||
0 - disabled (default)
|
||||
not 0 - enabled
|
||||
* 0 - disabled (default)
|
||||
* not 0 - enabled
|
|
@ -1,17 +1,17 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
HOWTO for multiqueue network device support
|
||||
===========================================
|
||||
===========================================
|
||||
HOWTO for multiqueue network device support
|
||||
===========================================
|
||||
|
||||
Section 1: Base driver requirements for implementing multiqueue support
|
||||
=======================================================================
|
||||
|
||||
Intro: Kernel support for multiqueue devices
|
||||
---------------------------------------------------------
|
||||
|
||||
Kernel support for multiqueue devices is always present.
|
||||
|
||||
Section 1: Base driver requirements for implementing multiqueue support
|
||||
-----------------------------------------------------------------------
|
||||
|
||||
Base drivers are required to use the new alloc_etherdev_mq() or
|
||||
alloc_netdev_mq() functions to allocate the subqueues for the device. The
|
||||
underlying kernel API will take care of the allocation and deallocation of
|
||||
|
@ -26,8 +26,7 @@ comes online or when it's completely shut down (unregister_netdev(), etc.).
|
|||
|
||||
|
||||
Section 2: Qdisc support for multiqueue devices
|
||||
|
||||
-----------------------------------------------
|
||||
===============================================
|
||||
|
||||
Currently two qdiscs are optimized for multiqueue devices. The first is the
|
||||
default pfifo_fast qdisc. This qdisc supports one qdisc per hardware queue.
|
||||
|
@ -46,22 +45,22 @@ will be queued to the band associated with the hardware queue.
|
|||
|
||||
|
||||
Section 3: Brief howto using MULTIQ for multiqueue devices
|
||||
---------------------------------------------------------------
|
||||
==========================================================
|
||||
|
||||
The userspace command 'tc,' part of the iproute2 package, is used to configure
|
||||
qdiscs. To add the MULTIQ qdisc to your network device, assuming the device
|
||||
is called eth0, run the following command:
|
||||
is called eth0, run the following command::
|
||||
|
||||
# tc qdisc add dev eth0 root handle 1: multiq
|
||||
# tc qdisc add dev eth0 root handle 1: multiq
|
||||
|
||||
The qdisc will allocate the number of bands to equal the number of queues that
|
||||
the device reports, and bring the qdisc online. Assuming eth0 has 4 Tx
|
||||
queues, the band mapping would look like:
|
||||
queues, the band mapping would look like::
|
||||
|
||||
band 0 => queue 0
|
||||
band 1 => queue 1
|
||||
band 2 => queue 2
|
||||
band 3 => queue 3
|
||||
band 0 => queue 0
|
||||
band 1 => queue 1
|
||||
band 2 => queue 2
|
||||
band 3 => queue 3
|
||||
|
||||
Traffic will begin flowing through each queue based on either the simple_tx_hash
|
||||
function or based on netdev->select_queue() if you have it defined.
|
||||
|
@ -69,11 +68,11 @@ function or based on netdev->select_queue() if you have it defined.
|
|||
The behavior of tc filters remains the same. However a new tc action,
|
||||
skbedit, has been added. Assuming you wanted to route all traffic to a
|
||||
specific host, for example 192.168.0.3, through a specific queue you could use
|
||||
this action and establish a filter such as:
|
||||
this action and establish a filter such as::
|
||||
|
||||
tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
|
||||
match ip dst 192.168.0.3 \
|
||||
action skbedit queue_mapping 3
|
||||
tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
|
||||
match ip dst 192.168.0.3 \
|
||||
action skbedit queue_mapping 3
|
||||
|
||||
Author: Alexander Duyck <alexander.h.duyck@intel.com>
|
||||
Original Author: Peter P. Waskiewicz Jr. <peter.p.waskiewicz.jr@intel.com>
|
||||
:Author: Alexander Duyck <alexander.h.duyck@intel.com>
|
||||
:Original Author: Peter P. Waskiewicz Jr. <peter.p.waskiewicz.jr@intel.com>
|
|
@ -1,7 +1,16 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
==========
|
||||
Netconsole
|
||||
==========
|
||||
|
||||
|
||||
started by Ingo Molnar <mingo@redhat.com>, 2001.09.17
|
||||
|
||||
2.6 port and netpoll api by Matt Mackall <mpm@selenic.com>, Sep 9 2003
|
||||
|
||||
IPv6 support by Cong Wang <xiyou.wangcong@gmail.com>, Jan 1 2013
|
||||
|
||||
Extended console support by Tejun Heo <tj@kernel.org>, May 1 2015
|
||||
|
||||
Please send bug reports to Matt Mackall <mpm@selenic.com>
|
||||
|
@ -23,34 +32,34 @@ Sender and receiver configuration:
|
|||
==================================
|
||||
|
||||
It takes a string configuration parameter "netconsole" in the
|
||||
following format:
|
||||
following format::
|
||||
|
||||
netconsole=[+][src-port]@[src-ip]/[<dev>],[tgt-port]@<tgt-ip>/[tgt-macaddr]
|
||||
|
||||
where
|
||||
+ if present, enable extended console support
|
||||
src-port source for UDP packets (defaults to 6665)
|
||||
src-ip source IP to use (interface address)
|
||||
dev network interface (eth0)
|
||||
tgt-port port for logging agent (6666)
|
||||
tgt-ip IP address for logging agent
|
||||
tgt-macaddr ethernet MAC address for logging agent (broadcast)
|
||||
+ if present, enable extended console support
|
||||
src-port source for UDP packets (defaults to 6665)
|
||||
src-ip source IP to use (interface address)
|
||||
dev network interface (eth0)
|
||||
tgt-port port for logging agent (6666)
|
||||
tgt-ip IP address for logging agent
|
||||
tgt-macaddr ethernet MAC address for logging agent (broadcast)
|
||||
|
||||
Examples:
|
||||
Examples::
|
||||
|
||||
linux netconsole=4444@10.0.0.1/eth1,9353@10.0.0.2/12:34:56:78:9a:bc
|
||||
|
||||
or
|
||||
or::
|
||||
|
||||
insmod netconsole netconsole=@/,@10.0.0.2/
|
||||
|
||||
or using IPv6
|
||||
or using IPv6::
|
||||
|
||||
insmod netconsole netconsole=@/,@fd00:1:2:3::1/
|
||||
|
||||
It also supports logging to multiple remote agents by specifying
|
||||
parameters for the multiple agents separated by semicolons and the
|
||||
complete string enclosed in "quotes", thusly:
|
||||
complete string enclosed in "quotes", thusly::
|
||||
|
||||
modprobe netconsole netconsole="@/,@10.0.0.2/;@/eth1,6892@10.0.0.3/"
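To make the parameter layout concrete, here is a rough sketch of how the ``netconsole=`` string could be split into its fields. It is an illustration of the format described above, not the kernel's parser: the function names are hypothetical, and the defaults (6665, 6666, eth0, broadcast MAC) are taken from the field list above.

```python
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


def parse_target(spec: str) -> dict:
    """Parse one [+][src-port]@[src-ip]/[<dev>],[tgt-port]@<tgt-ip>/[tgt-macaddr]."""
    extended = spec.startswith("+")
    if extended:
        spec = spec[1:]
    # The first comma separates the local half from the remote half.
    local, remote = spec.split(",", 1)
    src_port, _, local_rest = local.partition("@")
    src_ip, _, dev = local_rest.partition("/")
    tgt_port, _, remote_rest = remote.partition("@")
    tgt_ip, _, tgt_mac = remote_rest.partition("/")
    if not tgt_ip:
        raise ValueError("tgt-ip is the only mandatory field")
    return {
        "extended": extended,
        "src_port": int(src_port) if src_port else 6665,
        "src_ip": src_ip or None,  # None: use the interface address
        "dev": dev or "eth0",
        "tgt_port": int(tgt_port) if tgt_port else 6666,
        "tgt_ip": tgt_ip,
        "tgt_mac": tgt_mac or BROADCAST_MAC,
    }


def parse_netconsole(value: str) -> list:
    """Split a multi-target value on semicolons and parse each target."""
    return [parse_target(part) for part in value.split(";") if part]


for target in parse_netconsole("@/,@10.0.0.2/;@/eth1,6892@10.0.0.3/"):
    print(target)
```

Splitting on ``@``, ``/`` and ``,`` rather than on ``:`` keeps IPv6 addresses and MAC addresses intact, since both contain colons.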
|
||||
|
||||
|
@ -67,14 +76,19 @@ for example:
|
|||
|
||||
On distributions using a BSD-based netcat version (e.g. Fedora,
|
||||
openSUSE and Ubuntu) the listening port must be specified without
|
||||
the -p switch:
|
||||
the -p switch::
|
||||
|
||||
'nc -u -l -p <port>' / 'nc -u -l <port>' or
|
||||
'netcat -u -l -p <port>' / 'netcat -u -l <port>'
|
||||
'nc -u -l -p <port>' / 'nc -u -l <port>'
|
||||
|
||||
or::
|
||||
|
||||
'netcat -u -l -p <port>' / 'netcat -u -l <port>'
|
||||
|
||||
3) socat
|
||||
|
||||
'socat udp-recv:<port> -'
|
||||
::
|
||||
|
||||
socat udp-recv:<port> -
|
||||
|
||||
Dynamic reconfiguration:
|
||||
========================
|
||||
|
@ -92,7 +106,7 @@ netconsole module (or kernel, if netconsole is built-in).
|
|||
Some examples follow (where configfs is mounted at the /sys/kernel/config
|
||||
mountpoint).
|
||||
|
||||
To add a remote logging target (target names can be arbitrary):
|
||||
To add a remote logging target (target names can be arbitrary)::
|
||||
|
||||
cd /sys/kernel/config/netconsole/
|
||||
mkdir target1
|
||||
|
@ -102,12 +116,13 @@ above) and are disabled by default -- they must first be enabled by writing
|
|||
"1" to the "enabled" attribute (usually after setting parameters accordingly)
|
||||
as described below.
|
||||
|
||||
To remove a target:
|
||||
To remove a target::
|
||||
|
||||
rmdir /sys/kernel/config/netconsole/othertarget/
|
||||
|
||||
The interface exposes these parameters of a netconsole target to userspace:
|
||||
|
||||
============== ================================= ============
|
||||
enabled Is this target currently enabled? (read-write)
|
||||
extended Extended mode enabled (read-write)
|
||||
dev_name Local network interface name (read-write)
|
||||
|
@ -117,12 +132,13 @@ The interface exposes these parameters of a netconsole target to userspace:
|
|||
remote_ip Remote agent's IP address (read-write)
|
||||
local_mac Local interface's MAC address (read-only)
|
||||
remote_mac Remote agent's MAC address (read-write)
|
||||
============== ================================= ============
|
||||
|
||||
The "enabled" attribute is also used to control whether the parameters of
|
||||
a target can be updated or not -- you can modify the parameters of only
|
||||
disabled targets (i.e. if "enabled" is 0).
|
||||
|
||||
To update a target's parameters:
|
||||
To update a target's parameters::
|
||||
|
||||
cat enabled # check if enabled is 1
|
||||
echo 0 > enabled # disable the target (if required)
|
||||
|
@ -140,12 +156,12 @@ Extended console:
|
|||
|
||||
If '+' is prefixed to the configuration line or "extended" config file
|
||||
is set to 1, extended console support is enabled. An example boot
|
||||
param follows.
|
||||
param follows::
|
||||
|
||||
linux netconsole=+4444@10.0.0.1/eth1,9353@10.0.0.2/12:34:56:78:9a:bc
|
||||
|
||||
Log messages are transmitted with extended metadata header in the
|
||||
following format which is the same as /dev/kmsg.
|
||||
following format which is the same as /dev/kmsg::
|
||||
|
||||
<level>,<sequnum>,<timestamp>,<contflag>;<message text>
|
||||
|
||||
|
@ -155,12 +171,12 @@ newline is used as the delimeter.
|
|||
|
||||
If a message doesn't fit in a certain number of bytes (currently 1000),
|
||||
the message is split into multiple fragments by netconsole. These
|
||||
fragments are transmitted with "ncfrag" header field added.
|
||||
fragments are transmitted with "ncfrag" header field added::
|
||||
|
||||
ncfrag=<byte-offset>/<total-bytes>
|
||||
|
||||
For example, assuming a lot smaller chunk size, a message "the first
|
||||
chunk, the 2nd chunk." may be split as follows.
|
||||
chunk, the 2nd chunk." may be split as follows::
|
||||
|
||||
6,416,1758426,-,ncfrag=0/31;the first chunk,
|
||||
6,416,1758426,-,ncfrag=16/31; the 2nd chunk.
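On the receiving side, the fragments can be put back together from the ``ncfrag=<byte-offset>/<total-bytes>`` field. The following is a small sketch under simplifying assumptions: all lines passed in belong to the same message, the text is UTF-8, and the function names are hypothetical rather than part of any existing tool.

```python
def parse_fragment(line: str):
    """Return (byte offset, total bytes, text) for one extended-format line."""
    header, _, text = line.partition(";")
    offset = total = None
    # The first four fields are level, sequnum, timestamp and contflag;
    # any optional fields such as ncfrag come after them.
    for field in header.split(",")[4:]:
        if field.startswith("ncfrag="):
            off, _, tot = field[len("ncfrag="):].partition("/")
            offset, total = int(off), int(tot)
    return offset, total, text


def reassemble(lines) -> str:
    """Rebuild one message from its fragments, in any arrival order."""
    buffer = None
    received = 0
    for line in lines:
        offset, total, text = parse_fragment(line)
        if offset is None:
            return text  # not fragmented at all
        data = text.encode()
        if buffer is None:
            buffer = bytearray(total)
        buffer[offset:offset + len(data)] = data
        received += len(data)
    if buffer is None or received != len(buffer):
        raise ValueError("message is incomplete")
    return buffer.decode()


fragments = [
    "6,416,1758426,-,ncfrag=16/31; the 2nd chunk.",
    "6,416,1758426,-,ncfrag=0/31;the first chunk,",
]
print(reassemble(fragments))
```

A real receiver would also group fragments by sequence number and cope with duplicates and lost packets; those cases are left out here to keep the sketch short.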
|
||||
|
@ -168,39 +184,52 @@ chunk, the 2nd chunk." may be split as follows.
|
|||
Miscellaneous notes:
|
||||
====================
|
||||
|
||||
WARNING: the default target ethernet setting uses the broadcast
|
||||
ethernet address to send packets, which can cause increased load on
|
||||
other systems on the same ethernet segment.
|
||||
.. Warning::
|
||||
|
||||
TIP: some LAN switches may be configured to suppress ethernet broadcasts
|
||||
so it is advised to explicitly specify the remote agents' MAC addresses
|
||||
from the config parameters passed to netconsole.
|
||||
the default target ethernet setting uses the broadcast
|
||||
ethernet address to send packets, which can cause increased load on
|
||||
other systems on the same ethernet segment.
|
||||
|
||||
TIP: to find out the MAC address of, say, 10.0.0.2, you may try using:
|
||||
.. Tip::
|
||||
|
||||
ping -c 1 10.0.0.2 ; /sbin/arp -n | grep 10.0.0.2
|
||||
some LAN switches may be configured to suppress ethernet broadcasts
|
||||
so it is advised to explicitly specify the remote agents' MAC addresses
|
||||
from the config parameters passed to netconsole.
|
||||
|
||||
TIP: in case the remote logging agent is on a separate LAN subnet than
|
||||
the sender, it is suggested to try specifying the MAC address of the
|
||||
default gateway (you may use /sbin/route -n to find it out) as the
|
||||
remote MAC address instead.
|
||||
.. Tip::
|
||||
|
||||
NOTE: the network device (eth1 in the above case) can run any kind
|
||||
of other network traffic, netconsole is not intrusive. Netconsole
|
||||
might cause slight delays in other traffic if the volume of kernel
|
||||
messages is high, but should have no other impact.
|
||||
to find out the MAC address of, say, 10.0.0.2, you may try using::
|
||||
|
||||
NOTE: if you find that the remote logging agent is not receiving or
|
||||
printing all messages from the sender, it is likely that you have set
|
||||
the "console_loglevel" parameter (on the sender) to only send high
|
||||
priority messages to the console. You can change this at runtime using:
|
||||
ping -c 1 10.0.0.2 ; /sbin/arp -n | grep 10.0.0.2
|
||||
|
||||
dmesg -n 8
|
||||
.. Tip::
|
||||
|
||||
or by specifying "debug" on the kernel command line at boot, to send
|
||||
all kernel messages to the console. A specific value for this parameter
|
||||
can also be set using the "loglevel" kernel boot option. See the
|
||||
dmesg(8) man page and Documentation/admin-guide/kernel-parameters.rst for details.
|
||||
in case the remote logging agent is on a separate LAN subnet than
|
||||
the sender, it is suggested to try specifying the MAC address of the
|
||||
default gateway (you may use /sbin/route -n to find it out) as the
|
||||
remote MAC address instead.
|
||||
|
||||
.. note::
|
||||
|
||||
the network device (eth1 in the above case) can run any kind
|
||||
of other network traffic, netconsole is not intrusive. Netconsole
|
||||
might cause slight delays in other traffic if the volume of kernel
|
||||
messages is high, but should have no other impact.
|
||||
|
||||
.. note::
|
||||
|
||||
if you find that the remote logging agent is not receiving or
|
||||
printing all messages from the sender, it is likely that you have set
|
||||
the "console_loglevel" parameter (on the sender) to only send high
|
||||
priority messages to the console. You can change this at runtime using::
|
||||
|
||||
dmesg -n 8
|
||||
|
||||
or by specifying "debug" on the kernel command line at boot, to send
|
||||
all kernel messages to the console. A specific value for this parameter
|
||||
can also be set using the "loglevel" kernel boot option. See the
|
||||
dmesg(8) man page and Documentation/admin-guide/kernel-parameters.rst
|
||||
for details.
|
||||
|
||||
Netconsole was designed to be as instantaneous as possible, to
|
||||
enable the logging of even the most critical kernel bugs. It works
|
|
@ -1,3 +1,6 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
=====================================================
|
||||
Netdev features mess and how to get out from it alive
|
||||
=====================================================
|
||||
|
||||
|
@ -6,8 +9,8 @@ Author:
|
|||
|
||||
|
||||
|
||||
Part I: Feature sets
|
||||
======================
|
||||
Part I: Feature sets
|
||||
====================
|
||||
|
||||
Long gone are the days when a network card would just take and give packets
|
||||
verbatim. Today's devices add multiple features and bugs (read: offloads)
|
||||
|
@ -39,8 +42,8 @@ one used internally by network core:
|
|||
|
||||
|
||||
|
||||
Part II: Controlling enabled features
|
||||
=======================================
|
||||
Part II: Controlling enabled features
|
||||
=====================================
|
||||
|
||||
When current feature set (netdev->features) is to be changed, new set
|
||||
is calculated and filtered by calling ndo_fix_features callback
|
||||
|
@ -65,8 +68,8 @@ driver except by means of ndo_fix_features callback.
|
|||
|
||||
|
||||
|
||||
Part III: Implementation hints
|
||||
================================
|
||||
Part III: Implementation hints
|
||||
==============================
|
||||
|
||||
* ndo_fix_features:
|
||||
|
||||
|
@ -94,8 +97,8 @@ Errors returned are not (and cannot be) propagated anywhere except dmesg.
|
|||
|
||||
|
||||
|
||||
Part IV: Features
|
||||
===================
|
||||
Part IV: Features
|
||||
=================
|
||||
|
||||
For current list of features, see include/linux/netdev_features.h.
|
||||
This section describes semantics of some of them.
|
|
@ -1,5 +1,8 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
=====================================
|
||||
Network Devices, the Kernel, and You!
|
||||
=====================================
|
||||
|
||||
|
||||
Introduction
|
||||
|
@ -75,11 +78,12 @@ ndo_start_xmit:
|
|||
Don't use it for new drivers.
|
||||
|
||||
Context: Process with BHs disabled or BH (timer),
|
||||
will be called with interrupts disabled by netconsole.
|
||||
will be called with interrupts disabled by netconsole.
|
||||
|
||||
Return codes:
|
||||
o NETDEV_TX_OK everything ok.
|
||||
o NETDEV_TX_BUSY Cannot transmit packet, try later
|
||||
Return codes:
|
||||
|
||||
* NETDEV_TX_OK everything ok.
|
||||
* NETDEV_TX_BUSY Cannot transmit packet, try later
|
||||
Usually a bug, means queue start/stop flow control is broken in
|
||||
the driver. Note: the driver must NOT put the skb in its DMA ring.
|
||||
|
||||
|
@ -95,10 +99,13 @@ ndo_set_rx_mode:
|
|||
struct napi_struct synchronization rules
|
||||
========================================
|
||||
napi->poll:
|
||||
Synchronization: NAPI_STATE_SCHED bit in napi->state. Device
|
||||
Synchronization:
|
||||
NAPI_STATE_SCHED bit in napi->state. Device
|
||||
driver's ndo_stop method will invoke napi_disable() on
|
||||
all NAPI instances which will do a sleeping poll on the
|
||||
NAPI_STATE_SCHED napi->state bit, waiting for all pending
|
||||
NAPI activity to cease.
|
||||
Context: softirq
|
||||
will be called with interrupts disabled by netconsole.
|
||||
|
||||
Context:
|
||||
softirq
|
||||
will be called with interrupts disabled by netconsole.
|
|
@ -1,8 +1,15 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
=========================
|
||||
Netfilter Sysfs variables
|
||||
=========================
|
||||
|
||||
/proc/sys/net/netfilter/* Variables:
|
||||
====================================
|
||||
|
||||
nf_log_all_netns - BOOLEAN
|
||||
0 - disabled (default)
|
||||
not 0 - enabled
|
||||
- 0 - disabled (default)
|
||||
- not 0 - enabled
|
||||
|
||||
By default, only init_net namespace can log packets into kernel log
|
||||
with LOG target; this aims to prevent containers from flooding host
|
|
@ -0,0 +1,95 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
===============
|
||||
NETIF Msg Level
|
||||
===============
|
||||
|
||||
The design of the network interface message level setting.
|
||||
|
||||
History
|
||||
-------
|
||||
|
||||
The design of the debugging message interface was guided and
|
||||
constrained by backwards compatibility with previous practice. It is useful
|
||||
to understand the history and evolution in order to understand current
|
||||
practice and relate it to older driver source code.
|
||||
|
||||
From the beginning of Linux, each network device driver has had a local
|
||||
integer variable that controls the debug message level. The message
|
||||
level ranged from 0 to 7, and monotonically increased in verbosity.
|
||||
|
||||
The message levels were not precisely defined past level 3, but were
|
||||
always implemented within +-1 of the specified level. Drivers tended
|
||||
to shed the more verbose level messages as they matured.
|
||||
|
||||
- 0 Minimal messages, only essential information on fatal errors.
|
||||
- 1 Standard messages, initialization status. No run-time messages
|
||||
- 2 Special media selection messages, generally timer-driven.
|
||||
- 3 Interface starts and stops, including normal status messages
|
||||
- 4 Tx and Rx frame error messages, and abnormal driver operation
|
||||
- 5 Tx packet queue information, interrupt events.
|
||||
- 6 Status on each completed Tx packet and received Rx packets
|
||||
- 7 Initial contents of Tx and Rx packets
|
||||
|
||||
Initially this message level variable was uniquely named in each driver
|
||||
e.g. "lance_debug", so that a kernel symbolic debugger could locate and
|
||||
modify the setting. When kernel modules became common, the variables
|
||||
were consistently renamed to "debug" and allowed to be set as a module
|
||||
parameter.
|
||||
|
||||
This approach worked well. However there is always a demand for
|
||||
additional features. Over the years the following emerged as
|
||||
reasonable and easily implemented enhancements
|
||||
|
||||
- Using an ioctl() call to modify the level.
|
||||
- Per-interface rather than per-driver message level setting.
|
||||
- More selective control over the type of messages emitted.
|
||||
|
||||
The netif_msg recommendation adds these features with only a minor
|
||||
complexity and code size increase.
|
||||
|
||||
The recommendation is the following points
|
||||
|
||||
- Retaining the per-driver integer variable "debug" as a module
|
||||
parameter with a default level of '1'.
|
||||
|
||||
- Adding a per-interface private variable named "msg_enable". The
|
||||
variable is a bit map rather than a level, and is initialized as::
|
||||
|
||||
1 << debug
|
||||
|
||||
Or more precisely::
|
||||
|
||||
debug < 0 ? 0 : 1 << min(sizeof(int)-1, debug)
|
||||
|
||||
Messages should change from::
|
||||
|
||||
if (debug > 1)
|
||||
printk(MSG_DEBUG "%s: ...
|
||||
|
||||
to::
|
||||
|
||||
if (np->msg_enable & NETIF_MSG_LINK)
|
||||
printk(MSG_DEBUG "%s: ...
|
||||
|
||||
|
||||
The set of message levels is named
|
||||
|
||||
|
||||
========= =================== ============
|
||||
Old level Name Bit position
|
||||
========= =================== ============
|
||||
0 NETIF_MSG_DRV 0x0001
|
||||
1 NETIF_MSG_PROBE 0x0002
|
||||
2 NETIF_MSG_LINK 0x0004
|
||||
2 NETIF_MSG_TIMER 0x0004
|
||||
3 NETIF_MSG_IFDOWN 0x0008
|
||||
3 NETIF_MSG_IFUP 0x0008
|
||||
4 NETIF_MSG_RX_ERR 0x0010
|
||||
4 NETIF_MSG_TX_ERR 0x0010
|
||||
5 NETIF_MSG_TX_QUEUED 0x0020
|
||||
5 NETIF_MSG_INTR 0x0020
|
||||
6 NETIF_MSG_TX_DONE 0x0040
|
||||
6 NETIF_MSG_RX_STATUS 0x0040
|
||||
7 NETIF_MSG_PKTDATA 0x0080
|
||||
========= =================== ============
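The relationship between the old integer level and the msg_enable bit map can be modelled in a few lines. This is only a sketch of the rule as written in this document: the bit values are copied from the table above, the function names are hypothetical, and the clamp assumes the intent is to keep the shift within a 32-bit int (the text writes ``sizeof(int)-1``). Current kernel headers may define the helpers differently.

```python
# Bit values as listed in the table above.
NETIF_MSG_DRV = 0x0001
NETIF_MSG_PROBE = 0x0002
NETIF_MSG_LINK = 0x0004
NETIF_MSG_IFUP = 0x0008
NETIF_MSG_TX_ERR = 0x0010
NETIF_MSG_INTR = 0x0020
NETIF_MSG_TX_DONE = 0x0040
NETIF_MSG_PKTDATA = 0x0080


def msg_enable_from_debug(debug: int, int_bits: int = 32) -> int:
    """Model of 'debug < 0 ? 0 : 1 << min(..., debug)' from the text.

    int_bits is an assumption of this sketch: the shift is clamped to
    the highest bit of a 32-bit int.
    """
    if debug < 0:
        return 0
    return 1 << min(int_bits - 1, debug)


def message_enabled(msg_enable: int, flag: int) -> bool:
    """Equivalent of the 'np->msg_enable & NETIF_MSG_LINK' test."""
    return bool(msg_enable & flag)


msg_enable = msg_enable_from_debug(2)
print(hex(msg_enable), message_enabled(msg_enable, NETIF_MSG_LINK))
```

Because msg_enable is a bit map, a driver or an ioctl() caller can also OR several flags together to select message classes independently, which a single integer level cannot express.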
|
|
@ -1,79 +0,0 @@
|
|||
|
||||
________________
|
||||
NETIF Msg Level
|
||||
|
||||
The design of the network interface message level setting.
|
||||
|
||||
History
|
||||
|
||||
The design of the debugging message interface was guided and
|
||||
constrained by backwards compatibility previous practice. It is useful
|
||||
to understand the history and evolution in order to understand current
|
||||
practice and relate it to older driver source code.
|
||||
|
||||
From the beginning of Linux, each network device driver has had a local
|
||||
integer variable that controls the debug message level. The message
|
||||
level ranged from 0 to 7, and monotonically increased in verbosity.
|
||||
|
||||
The message level was not precisely defined past level 3, but were
|
||||
always implemented within +-1 of the specified level. Drivers tended
|
||||
to shed the more verbose level messages as they matured.
|
||||
0 Minimal messages, only essential information on fatal errors.
|
||||
1 Standard messages, initialization status. No run-time messages
|
||||
2 Special media selection messages, generally timer-driver.
|
||||
3 Interface starts and stops, including normal status messages
|
||||
4 Tx and Rx frame error messages, and abnormal driver operation
|
||||
5 Tx packet queue information, interrupt events.
|
||||
6 Status on each completed Tx packet and received Rx packets
|
||||
7 Initial contents of Tx and Rx packets
|
||||
|
||||
Initially this message level variable was uniquely named in each driver
|
||||
e.g. "lance_debug", so that a kernel symbolic debugger could locate and
|
||||
modify the setting. When kernel modules became common, the variables
|
||||
were consistently renamed to "debug" and allowed to be set as a module
|
||||
parameter.
|
||||
|
||||
This approach worked well. However there is always a demand for
|
||||
additional features. Over the years the following emerged as
|
||||
reasonable and easily implemented enhancements
|
||||
Using an ioctl() call to modify the level.
|
||||
Per-interface rather than per-driver message level setting.
|
||||
More selective control over the type of messages emitted.
|
||||
|
||||
The netif_msg recommendation adds these features with only a minor
|
||||
complexity and code size increase.
|
||||
|
||||
The recommendation is the following points
|
||||
Retaining the per-driver integer variable "debug" as a module
|
||||
parameter with a default level of '1'.
|
||||
|
||||
Adding a per-interface private variable named "msg_enable". The
|
||||
variable is a bit map rather than a level, and is initialized as
|
||||
1 << debug
|
||||
Or more precisely
|
||||
debug < 0 ? 0 : 1 << min(sizeof(int)-1, debug)
|
||||
|
||||
Messages should changes from
|
||||
if (debug > 1)
|
||||
printk(MSG_DEBUG "%s: ...
|
||||
to
|
||||
if (np->msg_enable & NETIF_MSG_LINK)
|
||||
printk(MSG_DEBUG "%s: ...
|
||||
|
||||
|
||||
The set of message levels is named
|
||||
Old level Name Bit position
|
||||
0 NETIF_MSG_DRV 0x0001
|
||||
1 NETIF_MSG_PROBE 0x0002
|
||||
2 NETIF_MSG_LINK 0x0004
|
||||
2 NETIF_MSG_TIMER 0x0004
|
||||
3 NETIF_MSG_IFDOWN 0x0008
|
||||
3 NETIF_MSG_IFUP 0x0008
|
||||
4 NETIF_MSG_RX_ERR 0x0010
|
||||
4 NETIF_MSG_TX_ERR 0x0010
|
||||
5 NETIF_MSG_TX_QUEUED 0x0020
|
||||
5 NETIF_MSG_INTR 0x0020
|
||||
6 NETIF_MSG_TX_DONE 0x0040
|
||||
6 NETIF_MSG_RX_STATUS 0x0040
|
||||
7 NETIF_MSG_PKTDATA 0x0080
|
||||
|
|
@ -1,8 +1,15 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
===================================
|
||||
Netfilter Conntrack Sysfs variables
|
||||
===================================
|
||||
|
||||
/proc/sys/net/netfilter/nf_conntrack_* Variables:
|
||||
=================================================
|
||||
|
||||
nf_conntrack_acct - BOOLEAN
|
||||
0 - disabled (default)
|
||||
not 0 - enabled
|
||||
- 0 - disabled (default)
|
||||
- not 0 - enabled
|
||||
|
||||
Enable connection tracking flow accounting. 64-bit byte and packet
|
||||
counters per flow are added.
|
||||
|
@ -16,8 +23,8 @@ nf_conntrack_buckets - INTEGER
|
|||
This sysctl is only writeable in the initial net namespace.
|
||||
|
||||
nf_conntrack_checksum - BOOLEAN
|
||||
0 - disabled
|
||||
not 0 - enabled (default)
|
||||
- 0 - disabled
|
||||
- not 0 - enabled (default)
|
||||
|
||||
Verify checksum of incoming packets. Packets with bad checksums are
|
||||
in INVALID state. If this is enabled, such packets will not be
|
||||
|
@ -27,8 +34,8 @@ nf_conntrack_count - INTEGER (read-only)
|
|||
Number of currently allocated flow entries.
|
||||
|
||||
nf_conntrack_events - BOOLEAN
|
||||
0 - disabled
|
||||
not 0 - enabled (default)
|
||||
- 0 - disabled
|
||||
- not 0 - enabled (default)
|
||||
|
||||
If this option is enabled, the connection tracking code will
|
||||
provide userspace with connection tracking events via ctnetlink.
|
||||
|
@ -62,8 +69,8 @@ nf_conntrack_generic_timeout - INTEGER (seconds)
|
|||
protocols.
|
||||
|
||||
nf_conntrack_helper - BOOLEAN
|
||||
0 - disabled (default)
|
||||
not 0 - enabled
|
||||
- 0 - disabled (default)
|
||||
- not 0 - enabled
|
||||
|
||||
Enable automatic conntrack helper assignment.
|
||||
If disabled it is required to set up iptables rules to assign
|
||||
|
@ -81,14 +88,14 @@ nf_conntrack_icmpv6_timeout - INTEGER (seconds)
|
|||
Default for ICMP6 timeout.
|
||||
|
||||
nf_conntrack_log_invalid - INTEGER
|
||||
0 - disable (default)
|
||||
1 - log ICMP packets
|
||||
6 - log TCP packets
|
||||
17 - log UDP packets
|
||||
33 - log DCCP packets
|
||||
41 - log ICMPv6 packets
|
||||
136 - log UDPLITE packets
|
||||
255 - log packets of any protocol
|
||||
- 0 - disable (default)
|
||||
- 1 - log ICMP packets
|
||||
- 6 - log TCP packets
|
||||
- 17 - log UDP packets
|
||||
- 33 - log DCCP packets
|
||||
- 41 - log ICMPv6 packets
|
||||
- 136 - log UDPLITE packets
|
||||
- 255 - log packets of any protocol
|
||||
|
||||
Log invalid packets of a type specified by value.
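A minimal sketch of how a script might interpret this setting. The value table is copied from the list above. The sysctl path is assumed from the ``/proc/sys/net/netfilter/nf_conntrack_*`` prefix in the heading, and the helper names are hypothetical.

```python
# Values and meanings copied from the nf_conntrack_log_invalid list above.
LOG_INVALID_MEANINGS = {
    0: "disable (default)",
    1: "log ICMP packets",
    6: "log TCP packets",
    17: "log UDP packets",
    33: "log DCCP packets",
    41: "log ICMPv6 packets",
    136: "log UDPLITE packets",
    255: "log packets of any protocol",
}


def describe_log_invalid(value: int) -> str:
    """Return the documented meaning of a setting (hypothetical helper)."""
    return LOG_INVALID_MEANINGS.get(value, "value not listed in the documentation")


def read_log_invalid(
    path: str = "/proc/sys/net/netfilter/nf_conntrack_log_invalid",
) -> int:
    """Read the current setting; the path is an assumption, see the lead-in."""
    with open(path) as f:
        return int(f.read().strip())


print(describe_log_invalid(6))  # log TCP packets
```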
|
||||
|
||||
|
@ -97,15 +104,15 @@ nf_conntrack_max - INTEGER
|
|||
nf_conntrack_buckets value * 4.
|
||||
|
||||
nf_conntrack_tcp_be_liberal - BOOLEAN
|
||||
0 - disabled (default)
|
||||
not 0 - enabled
|
||||
- 0 - disabled (default)
|
||||
- not 0 - enabled
|
||||
|
||||
Be conservative in what you do, be liberal in what you accept from others.
|
||||
If it's non-zero, we mark only out of window RST segments as INVALID.
|
||||
|
||||
nf_conntrack_tcp_loose - BOOLEAN
|
||||
0 - disabled
|
||||
not 0 - enabled (default)
|
||||
- 0 - disabled
|
||||
- not 0 - enabled (default)
|
||||
|
||||
If it is set to zero, we disable picking up already established
|
||||
connections.
|
||||
|
@ -148,8 +155,8 @@ nf_conntrack_tcp_timeout_unacknowledged - INTEGER (seconds)
|
|||
default 300
|
||||
|
||||
nf_conntrack_timestamp - BOOLEAN
|
||||
0 - disabled (default)
|
||||
not 0 - enabled
|
||||
- 0 - disabled (default)
|
||||
- not 0 - enabled
|
||||
|
||||
Enable connection tracking flow timestamping.
|
||||
|
|
@ -1,3 +1,6 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
====================================
|
||||
Netfilter's flowtable infrastructure
|
||||
====================================
|
||||
|
||||
|
@ -31,15 +34,17 @@ to use this new alternative forwarding path via nftables policy.
|
|||
This is represented in Fig.1, which describes the classic forwarding path
|
||||
including the Netfilter hooks and the flowtable fastpath bypass.
|
||||
|
||||
userspace process
|
||||
^ |
|
||||
| |
|
||||
_____|____ ____\/___
|
||||
/ \ / \
|
||||
| input | | output |
|
||||
\__________/ \_________/
|
||||
^ |
|
||||
| |
|
||||
::
|
||||
|
||||
userspace process
|
||||
^ |
|
||||
| |
|
||||
_____|____ ____\/___
|
||||
/ \ / \
|
||||
| input | | output |
|
||||
\__________/ \_________/
|
||||
^ |
|
||||
| |
|
||||
_________ __________ --------- _____\/_____
|
||||
/ \ / \ |Routing | / \
|
||||
--> ingress ---> prerouting ---> |decision| | postrouting |--> neigh_xmit
|
||||
|
@ -59,7 +64,7 @@ including the Netfilter hooks and the flowtable fastpath bypass.
|
|||
\ / |
|
||||
|__yes_________________fastpath bypass ____________________________|
|
||||
|
||||
Fig.1 Netfilter hooks and flowtable interactions
|
||||
Fig.1 Netfilter hooks and flowtable interactions
|
||||
|
||||
The flowtable entry also stores the NAT configuration, so all packets are
|
||||
mangled according to the NAT policy that matches the initial packets that went
|
||||
|
@ -72,18 +77,18 @@ Example configuration
|
|||
---------------------
|
||||
|
||||
Enabling the flowtable bypass is relatively easy, you only need to create a
|
||||
flowtable and add one rule to your forward chain.
|
||||
flowtable and add one rule to your forward chain::
|
||||
|
||||
table inet x {
|
||||
table inet x {
|
||||
flowtable f {
|
||||
hook ingress priority 0; devices = { eth0, eth1 };
|
||||
}
|
||||
chain y {
|
||||
type filter hook forward priority 0; policy accept;
|
||||
ip protocol tcp flow offload @f
|
||||
counter packets 0 bytes 0
|
||||
}
|
||||
}
|
||||
chain y {
|
||||
type filter hook forward priority 0; policy accept;
|
||||
ip protocol tcp flow offload @f
|
||||
counter packets 0 bytes 0
|
||||
}
|
||||
}
|
||||
|
||||
This example adds the flowtable 'f' to the ingress hook of the eth0 and eth1
|
||||
netdevices. You can create as many flowtables as you want in case you need to
|
||||
|
@ -101,12 +106,12 @@ forwarding bypass.
|
|||
More reading
|
||||
------------
|
||||
|
||||
This documentation is based on the LWN.net articles [1][2]. Rafal Milecki also
|
||||
made a very complete and comprehensive summary called "A state of network
|
||||
This documentation is based on the LWN.net articles [1]_\ [2]_. Rafal Milecki
|
||||
also made a very complete and comprehensive summary called "A state of network
|
||||
acceleration" that describes how things were before this infrastructure was
|
||||
mainlined [3] and it also makes a rough summary of this work [4].
|
||||
mainlined [3]_ and it also makes a rough summary of this work [4]_.
|
||||
|
||||
[1] https://lwn.net/Articles/738214/
|
||||
[2] https://lwn.net/Articles/742164/
|
||||
[3] http://lists.infradead.org/pipermail/lede-dev/2018-January/010830.html
|
||||
[4] http://lists.infradead.org/pipermail/lede-dev/2018-January/010829.html
|
||||
.. [1] https://lwn.net/Articles/738214/
|
||||
.. [2] https://lwn.net/Articles/742164/
|
||||
.. [3] http://lists.infradead.org/pipermail/lede-dev/2018-January/010830.html
|
||||
.. [4] http://lists.infradead.org/pipermail/lede-dev/2018-January/010829.html
|
|
@ -1,3 +1,6 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
=============================================
|
||||
Open vSwitch datapath developer documentation
|
||||
=============================================
|
||||
|
||||
|
@ -80,13 +83,13 @@ The <linux/openvswitch.h> header file defines the exact format of the
|
|||
flow key attributes. For informal explanatory purposes here, we write
|
||||
them as comma-separated strings, with parentheses indicating arguments
|
||||
and nesting. For example, the following could represent a flow key
|
||||
corresponding to a TCP packet that arrived on vport 1:
|
||||
corresponding to a TCP packet that arrived on vport 1::
|
||||
|
||||
in_port(1), eth(src=e0:91:f5:21:d0:b2, dst=00:02:e3:0f:80:a4),
|
||||
eth_type(0x0800), ipv4(src=172.16.0.20, dst=172.18.0.52, proto=17, tos=0,
|
||||
frag=no), tcp(src=49163, dst=80)
|
||||
|
||||
Often we ellipsize arguments not important to the discussion, e.g.:
|
||||
Often we ellipsize arguments not important to the discussion, e.g.::
|
||||
|
||||
in_port(1), eth(...), eth_type(0x0800), ipv4(...), tcp(...)
|
||||
|
||||
|
@ -151,20 +154,20 @@ Some care is needed to really maintain forward and backward
|
|||
compatibility for applications that follow the rules listed under
|
||||
"Flow key compatibility" above.
|
||||
|
||||
The basic rule is obvious:
|
||||
The basic rule is obvious::
|
||||
|
||||
------------------------------------------------------------------
|
||||
==================================================================
|
||||
New network protocol support must only supplement existing flow
|
||||
key attributes. It must not change the meaning of already defined
|
||||
flow key attributes.
|
||||
------------------------------------------------------------------
|
||||
==================================================================
|
||||
|
||||
This rule does have less-obvious consequences so it is worth working
|
||||
through a few examples. Suppose, for example, that the kernel module
|
||||
did not already implement VLAN parsing. Instead, it just interpreted
|
||||
the 802.1Q TPID (0x8100) as the Ethertype then stopped parsing the
|
||||
packet. The flow key for any packet with an 802.1Q header would look
|
||||
essentially like this, ignoring metadata:
|
||||
essentially like this, ignoring metadata::
|
||||
|
||||
eth(...), eth_type(0x8100)
|
||||
|
||||
|
@ -172,7 +175,7 @@ Naively, to add VLAN support, it makes sense to add a new "vlan" flow
|
|||
key attribute to contain the VLAN tag, then continue to decode the
|
||||
encapsulated headers beyond the VLAN tag using the existing field
|
||||
definitions. With this change, a TCP packet in VLAN 10 would have a
|
||||
flow key much like this:
|
||||
flow key much like this::
|
||||
|
||||
eth(...), vlan(vid=10, pcp=0), eth_type(0x0800), ip(proto=6, ...), tcp(...)
|
||||
|
||||
|
@ -187,7 +190,7 @@ across kernel versions even though it follows the compatibility rules.
|
|||
|
||||
The solution is to use a set of nested attributes. This is, for
|
||||
example, why 802.1Q support uses nested attributes. A TCP packet in
|
||||
VLAN 10 is actually expressed as:
|
||||
VLAN 10 is actually expressed as::
|
||||
|
||||
eth(...), eth_type(0x8100), vlan(vid=10, pcp=0), encap(eth_type(0x0800),
|
||||
ip(proto=6, ...), tcp(...))
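To make the nesting concrete, here is a small Python sketch that builds the informal string notation used in this document. It models only the textual form, not the netlink attribute encoding, and the helpers ``attr`` and ``flow_key`` are hypothetical names.

```python
def attr(name: str, *args: str) -> str:
    """Render one attribute as name(arg, arg, ...)."""
    return f"{name}({', '.join(args)})"


def flow_key(*attrs: str) -> str:
    """Join attributes into a comma-separated flow key string."""
    return ", ".join(attrs)


# Everything after the VLAN tag is wrapped in a single nested attribute.
inner = flow_key(
    attr("eth_type", "0x0800"),
    attr("ip", "proto=6", "..."),
    attr("tcp", "..."),
)
key = flow_key(
    attr("eth", "..."),
    attr("eth_type", "0x8100"),
    attr("vlan", "vid=10", "pcp=0"),
    attr("encap", inner),
)
print(key)
```

An application that does not understand ``encap`` sees one unknown attribute and can skip it as a whole, rather than misreading the inner ``eth_type``, ``ip`` and ``tcp`` fields as top-level ones.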
|
||||
|
@ -215,14 +218,14 @@ For example, consider a packet that contains an IP header that
|
|||
indicates protocol 6 for TCP, but which is truncated just after the IP
|
||||
header, so that the TCP header is missing. The flow key for this
|
||||
packet would include a tcp attribute with all-zero src and dst, like
|
||||
this:
|
||||
this::
|
||||
|
||||
eth(...), eth_type(0x0800), ip(proto=6, ...), tcp(src=0, dst=0)
|
||||
|
||||
As another example, consider a packet with an Ethernet type of 0x8100,
|
||||
indicating that a VLAN TCI should follow, but which is truncated just
|
||||
after the Ethernet type. The flow key for this packet would include
|
||||
an all-zero-bits vlan and an empty encap attribute, like this:
|
||||
an all-zero-bits vlan and an empty encap attribute, like this::
|
||||
|
||||
eth(...), eth_type(0x8100), vlan(0), encap()
|
||||
|
|
@ -1,5 +1,12 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
==================
|
||||
Operational States
|
||||
==================
|
||||
|
||||
|
||||
1. Introduction
|
||||
===============
|
||||
|
||||
Linux distinguishes between administrative and operational state of an
|
||||
interface. Administrative state is the result of "ip link set dev
|
||||
|
@ -20,6 +27,7 @@ and changeable from userspace under certain rules.
|
|||
|
||||
|
||||
2. Querying from userspace
|
||||
==========================
|
||||
|
||||
Both admin and operational state can be queried via the netlink
|
||||
operation RTM_GETLINK. It is also possible to subscribe to RTNLGRP_LINK
|
||||
|
@ -30,16 +38,20 @@ These values contain interface state:
|
|||
|
||||
ifinfomsg::if_flags & IFF_UP:
|
||||
Interface is admin up
|
||||
|
||||
ifinfomsg::if_flags & IFF_RUNNING:
|
||||
Interface is in RFC2863 operational state UP or UNKNOWN. This is for
|
||||
backward compatibility; routing daemons and DHCP clients can use this
|
||||
flag to determine whether they should use the interface.
|
||||
|
||||
ifinfomsg::if_flags & IFF_LOWER_UP:
|
||||
Driver has signaled netif_carrier_on()
|
||||
|
||||
ifinfomsg::if_flags & IFF_DORMANT:
|
||||
Driver has signaled netif_dormant_on()
|
||||
|
||||
TLV IFLA_OPERSTATE
|
||||
------------------
|
||||
|
||||
contains RFC2863 state of the interface in numeric representation:
|
||||
|
||||
|
@ -47,26 +59,33 @@ IF_OPER_UNKNOWN (0):
|
|||
Interface is in unknown state, neither driver nor userspace has set
|
||||
operational state. Interface must be considered for user data as
|
||||
setting operational state has not been implemented in every driver.
|
||||
|
||||
IF_OPER_NOTPRESENT (1):
|
||||
Unused in current kernel (notpresent interfaces normally disappear),
|
||||
just a numerical placeholder.
|
||||
|
||||
IF_OPER_DOWN (2):
|
||||
Interface is unable to transfer data on L1, e.g. ethernet is not
|
||||
plugged or interface is ADMIN down.
|
||||
|
||||
IF_OPER_LOWERLAYERDOWN (3):
|
||||
Interfaces stacked on an interface that is IF_OPER_DOWN show this
|
||||
state (e.g. VLAN).
|
||||
|
||||
IF_OPER_TESTING (4):
|
||||
Unused in current kernel.
|
||||
|
||||
IF_OPER_DORMANT (5):
|
||||
Interface is L1 up, but waiting for an external event, e.g. for a
|
||||
protocol to establish. (802.1X)
|
||||
|
||||
IF_OPER_UP (6):
|
||||
Interface is operational up and can be used.
|
||||
|
||||
This TLV can also be queried via sysfs.
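As a reading aid, the numeric states listed above can be modelled as an enumeration. This is a sketch only: the numbers come from the list in this section, while the class name ``OperState`` and the helper ``usable_for_data`` are hypothetical.

```python
from enum import IntEnum


class OperState(IntEnum):
    """RFC2863 operational states as numbered in this document."""
    IF_OPER_UNKNOWN = 0
    IF_OPER_NOTPRESENT = 1
    IF_OPER_DOWN = 2
    IF_OPER_LOWERLAYERDOWN = 3
    IF_OPER_TESTING = 4
    IF_OPER_DORMANT = 5
    IF_OPER_UP = 6


def usable_for_data(state: OperState) -> bool:
    """UP is usable; UNKNOWN must also be considered for user data,
    because not every driver sets the operational state."""
    return state in (OperState.IF_OPER_UP, OperState.IF_OPER_UNKNOWN)


print(OperState(5).name)               # IF_OPER_DORMANT
print(usable_for_data(OperState(0)))   # True
```

The helper mirrors the meaning of IFF_RUNNING described earlier, which is set for operational state UP or UNKNOWN.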
|
||||
|
||||
TLV IFLA_LINKMODE
|
||||
-----------------
|
||||
|
||||
contains link policy. This is needed for userspace interaction
|
||||
described below.
|
||||
|
@ -75,6 +94,7 @@ This TLV can also be queried via sysfs.
|
|||
|
||||
|
||||
3. Kernel driver API
|
||||
====================
|
||||
|
||||
Kernel drivers have access to two flags that map to IFF_LOWER_UP and
|
||||
IFF_DORMANT. These flags can be set from everywhere, even from
|
||||
|
@ -126,6 +146,7 @@ netif_carrier_ok() && !netif_dormant():
|
|||
|
||||
|
||||
4. Setting from userspace
|
||||
=========================
|
||||
|
||||
Applications have to use the netlink interface to influence the
|
||||
RFC2863 operational state of an interface. Setting IFLA_LINKMODE to 1
|
||||
|
@ -139,18 +160,18 @@ are multicasted on the netlink group RTNLGRP_LINK.
|
|||
|
||||
So basically an 802.1X supplicant interacts with the kernel like this:
|
||||
|
||||
-subscribe to RTNLGRP_LINK
|
||||
-set IFLA_LINKMODE to 1 via RTM_SETLINK
|
||||
-query RTM_GETLINK once to get initial state
|
||||
-if initial flags are not (IFF_LOWER_UP && !IFF_DORMANT), wait until
|
||||
netlink multicast signals this state
|
||||
-do 802.1X, eventually abort if flags go down again
|
||||
-send RTM_SETLINK to set operstate to IF_OPER_UP if authentication
|
||||
succeeds, IF_OPER_DORMANT otherwise
|
||||
-see how operstate and IFF_RUNNING are echoed via netlink multicast
|
||||
-set interface back to IF_OPER_DORMANT if 802.1X reauthentication
|
||||
fails
|
||||
-restart if kernel changes IFF_LOWER_UP or IFF_DORMANT flag
|
||||
- subscribe to RTNLGRP_LINK
|
||||
- set IFLA_LINKMODE to 1 via RTM_SETLINK
|
||||
- query RTM_GETLINK once to get initial state
|
||||
- if initial flags are not (IFF_LOWER_UP && !IFF_DORMANT), wait until
|
||||
netlink multicast signals this state
|
||||
- do 802.1X, eventually abort if flags go down again
|
||||
- send RTM_SETLINK to set operstate to IF_OPER_UP if authentication
|
||||
succeeds, IF_OPER_DORMANT otherwise
|
||||
- see how operstate and IFF_RUNNING are echoed via netlink multicast
|
||||
- set interface back to IF_OPER_DORMANT if 802.1X reauthentication
|
||||
fails
|
||||
- restart if kernel changes IFF_LOWER_UP or IFF_DORMANT flag
|
||||
|
||||
If the supplicant goes down, bring IFLA_LINKMODE back to 0 and
|
||||
IFLA_OPERSTATE to a sane value.
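The decision logic in the steps above can be sketched without any netlink code. The flag values are the ones defined in the Linux UAPI header ``<linux/if.h>``, and the operstate numbers come from the IFLA_OPERSTATE list. The function names are hypothetical, and a real supplicant would obtain ``flags`` from RTM_GETLINK or RTNLGRP_LINK messages.

```python
# Flag values as defined in the Linux UAPI header <linux/if.h>.
IFF_LOWER_UP = 1 << 16
IFF_DORMANT = 1 << 17

# Operational states as numbered in the IFLA_OPERSTATE list.
IF_OPER_DORMANT = 5
IF_OPER_UP = 6


def link_ready(flags: int) -> bool:
    """True when (IFF_LOWER_UP && !IFF_DORMANT), i.e. 802.1X may start."""
    return bool(flags & IFF_LOWER_UP) and not (flags & IFF_DORMANT)


def operstate_after_auth(authenticated: bool) -> int:
    """Operstate to request via RTM_SETLINK once authentication finishes."""
    return IF_OPER_UP if authenticated else IF_OPER_DORMANT


print(link_ready(IFF_LOWER_UP))                # True
print(link_ready(IFF_LOWER_UP | IFF_DORMANT))  # False
print(operstate_after_auth(False))             # 5
```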
|
File diff suppressed because it is too large
File diff suppressed because it is too large
|
@ -1,3 +1,7 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
.. include:: <isonum.txt>
|
||||
|
||||
============================
|
||||
Linux Phonet protocol family
|
||||
============================
|
||||
|
||||
|
@ -11,6 +15,7 @@ device attached to the modem. The modem takes care of routing.
|
|||
|
||||
Phonet packets can be exchanged through various hardware connections
|
||||
depending on the device, such as:
|
||||
|
||||
- USB with the CDC Phonet interface,
|
||||
- infrared,
|
||||
- Bluetooth,
|
||||
|
@ -21,7 +26,7 @@ depending on the device, such as:
|
|||
Packets format
|
||||
--------------
|
||||
|
||||
Phonet packets have a common header as follows:
|
||||
Phonet packets have a common header as follows::
|
||||
|
||||
struct phonethdr {
|
||||
uint8_t pn_media; /* Media type (link-layer identifier) */
|
||||
|
@ -72,7 +77,7 @@ only the (default) Linux FIFO qdisc should be used with them.
|
|||
Network layer
|
||||
-------------
|
||||
|
||||
The Phonet socket address family maps the Phonet packet header:
|
||||
The Phonet socket address family maps the Phonet packet header::
|
||||
|
||||
struct sockaddr_pn {
|
||||
sa_family_t spn_family; /* AF_PHONET */
|
||||
|
@ -94,6 +99,8 @@ protocol from the PF_PHONET family. Each socket is bound to one of the
|
|||
2^10 object IDs available, and can send and receive packets with any
|
||||
other peer.
|
||||
|
||||
::
|
||||
|
||||
struct sockaddr_pn addr = { .spn_family = AF_PHONET, };
|
||||
ssize_t len;
|
||||
socklen_t addrlen = sizeof(addr);
|
||||
|
@ -105,7 +112,7 @@ other peer.
|
|||
|
||||
sendto(fd, msg, msglen, 0, (struct sockaddr *)&addr, sizeof(addr));
|
||||
len = recvfrom(fd, buf, sizeof(buf), 0,
|
||||
(struct sockaddr *)&addr, &addrlen);
|
||||
(struct sockaddr *)&addr, &addrlen);
|
||||
|
||||
This protocol follows the SOCK_DGRAM connection-less semantics.
|
||||
However, connect() and getpeername() are not supported, as they did
|
||||
|
@ -116,7 +123,7 @@ Resource subscription
|
|||
---------------------
|
||||
|
||||
A Phonet datagram socket can be subscribed to any number of 8-bit
|
||||
Phonet resources, as follows:
|
||||
Phonet resources, as follows::
|
||||
|
||||
uint32_t res = 0xXX;
|
||||
ioctl(fd, SIOCPNADDRESOURCE, &res);
|
||||
|
@ -137,6 +144,8 @@ socket paradigm. The listening socket is bound to an unique free object
|
|||
ID. Each listening socket can handle up to 255 simultaneous
|
||||
connections, one per accept()'d socket.
|
||||
|
||||
::
|
||||
|
||||
int lfd, cfd;
|
||||
|
||||
lfd = socket(PF_PHONET, SOCK_SEQPACKET, PN_PROTO_PIPE);
|
||||
|
@ -161,7 +170,7 @@ Connections are traditionally established between two endpoints by a
|
|||
As of Linux kernel version 2.6.39, it is also possible to connect
|
||||
two endpoints directly, using connect() on the active side. This is
|
||||
intended to support the newer Nokia Wireless Modem API, as found in
|
||||
e.g. the Nokia Slim Modem in the ST-Ericsson U8500 platform:
|
||||
e.g. the Nokia Slim Modem in the ST-Ericsson U8500 platform::
|
||||
|
||||
struct sockaddr_spn spn;
|
||||
int fd;
|
||||
|
@ -177,38 +186,45 @@ e.g. the Nokia Slim Modem in the ST-Ericsson U8500 platform:
|
|||
close(fd);
|
||||
|
||||
|
||||
WARNING:
|
||||
When polling a connected pipe socket for writability, there is an
|
||||
intrinsic race condition whereby writability might be lost between the
|
||||
polling and the writing system calls. In this case, the socket will
|
||||
block until write becomes possible again, unless non-blocking mode
|
||||
is enabled.
|
||||
.. Warning::
|
||||
|
||||
When polling a connected pipe socket for writability, there is an
|
||||
intrinsic race condition whereby writability might be lost between the
|
||||
polling and the writing system calls. In this case, the socket will
|
||||
block until write becomes possible again, unless non-blocking mode
|
||||
is enabled.
|
||||
|
||||
|
||||
The pipe protocol provides two socket options at the SOL_PNPIPE level:
|
||||
|
||||
PNPIPE_ENCAP accepts one integer value (int) of:
|
||||
|
||||
PNPIPE_ENCAP_NONE: The socket operates normally (default).
|
||||
PNPIPE_ENCAP_NONE:
|
||||
The socket operates normally (default).
|
||||
|
||||
PNPIPE_ENCAP_IP: The socket is used as a backend for a virtual IP
|
||||
PNPIPE_ENCAP_IP:
|
||||
The socket is used as a backend for a virtual IP
|
||||
interface. This requires CAP_NET_ADMIN capability. GPRS data
|
||||
support on Nokia modems can use this. Note that the socket cannot
|
||||
be reliably poll()'d or read() from while in this mode.
|
||||
|
||||
PNPIPE_IFINDEX is a read-only integer value. It contains the
|
||||
interface index of the network interface created by PNPIPE_ENCAP,
|
||||
or zero if encapsulation is off.
|
||||
PNPIPE_IFINDEX
|
||||
is a read-only integer value. It contains the
|
||||
interface index of the network interface created by PNPIPE_ENCAP,
|
||||
or zero if encapsulation is off.
|
||||
|
||||
PNPIPE_HANDLE is a read-only integer value. It contains the underlying
|
||||
identifier ("pipe handle") of the pipe. This is only defined for
|
||||
socket descriptors that are already connected or being connected.
|
||||
PNPIPE_HANDLE
|
||||
is a read-only integer value. It contains the underlying
|
||||
identifier ("pipe handle") of the pipe. This is only defined for
|
||||
socket descriptors that are already connected or being connected.
|
||||
|
||||
|
||||
Authors
|
||||
-------
|
||||
|
||||
Linux Phonet was initially written by Sakari Ailus.
|
||||
|
||||
Other contributors include Mikä Liljeberg, Andras Domokos,
|
||||
Carlos Chinea and Rémi Denis-Courmont.
|
||||
Copyright (C) 2008 Nokia Corporation.
|
||||
|
||||
Copyright |copy| 2008 Nokia Corporation.
|
|
@ -1,7 +1,8 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
|
||||
HOWTO for the linux packet generator
|
||||
------------------------------------
|
||||
====================================
|
||||
HOWTO for the linux packet generator
|
||||
====================================
|
||||
|
||||
Enable CONFIG_NET_PKTGEN to compile and build pktgen either in-kernel
|
||||
or as a module. A module is preferred; modprobe pktgen if needed. Once
|
||||
|
@ -9,17 +10,18 @@ running, pktgen creates a thread for each CPU with affinity to that CPU.
|
|||
Monitoring and controlling is done via /proc. It is easiest to select a
|
||||
suitable sample script and configure that.
|
||||
|
||||
On a dual CPU:
|
||||
On a dual CPU::
|
||||
|
||||
ps aux | grep pkt
|
||||
root 129 0.3 0.0 0 0 ? SW 2003 523:20 [kpktgend_0]
|
||||
root 130 0.3 0.0 0 0 ? SW 2003 509:50 [kpktgend_1]
|
||||
ps aux | grep pkt
|
||||
root 129 0.3 0.0 0 0 ? SW 2003 523:20 [kpktgend_0]
|
||||
root 130 0.3 0.0 0 0 ? SW 2003 509:50 [kpktgend_1]
|
||||
|
||||
|
||||
For monitoring and control pktgen creates:
|
||||
For monitoring and control pktgen creates::
|
||||
|
||||
/proc/net/pktgen/pgctrl
|
||||
/proc/net/pktgen/kpktgend_X
|
||||
/proc/net/pktgen/ethX
|
||||
/proc/net/pktgen/ethX
|
||||
|
||||
|
||||
Tuning NIC for max performance
|
||||
|
@ -28,7 +30,8 @@ Tuning NIC for max performance
|
|||
The default NIC settings are (likely) not tuned for pktgen's artificial
|
||||
overload type of benchmarking, as this could hurt the normal use-case.
|
||||
|
||||
Specifically increasing the TX ring buffer in the NIC:
|
||||
Specifically increasing the TX ring buffer in the NIC::
|
||||
|
||||
# ethtool -G ethX tx 1024
|
||||
|
||||
A larger TX ring can improve pktgen's performance, while it can hurt
|
||||
|
@ -46,7 +49,8 @@ This cleanup issue is specifically the case for the driver ixgbe
|
|||
and the cleanup interval is affected by the ethtool --coalesce setting
|
||||
of parameter "rx-usecs".
|
||||
|
||||
For ixgbe use e.g. "30" resulting in approx 33K interrupts/sec (1/30*10^6):
|
||||
For ixgbe use e.g. "30" resulting in approx 33K interrupts/sec (1/30*10^6)::
|
||||
|
||||
# ethtool -C ethX rx-usecs 30
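The figure of roughly 33K interrupts per second follows from the arithmetic in parentheses above. A short Python check, assuming at most one interrupt per coalescing interval (the function name is hypothetical):

```python
def interrupts_per_second(rx_usecs: int) -> float:
    """Upper bound on the interrupt rate for a given rx-usecs setting."""
    return 1_000_000 / rx_usecs


print(round(interrupts_per_second(30)))  # 33333
```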
|
||||
|
||||
|
||||
|
@ -55,7 +59,7 @@ Kernel threads
|
|||
Pktgen creates a thread for each CPU with affinity to that CPU.
|
||||
Each thread is controlled through the procfile /proc/net/pktgen/kpktgend_X.
|
||||
|
||||
Example: /proc/net/pktgen/kpktgend_0
|
||||
Example: /proc/net/pktgen/kpktgend_0::
|
||||
|
||||
Running:
|
||||
Stopped: eth4@0
|
||||
|
@ -64,6 +68,7 @@ Example: /proc/net/pktgen/kpktgend_0
|
|||
Most important are the devices assigned to the thread.
|
||||
|
||||
The two basic thread commands are:
|
||||
|
||||
* add_device DEVICE@NAME -- adds a single device
|
||||
* rem_device_all -- remove all associated devices
|
||||
|
||||
|
@ -73,7 +78,7 @@ be unique.
|
|||
|
||||
To support adding the same device to multiple threads, which is useful
|
||||
with multi queue NICs, the device naming scheme is extended with "@":
|
||||
device@something
|
||||
device@something
|
||||
|
||||
The part after "@" can be anything, but it is customary to use the thread
|
||||
number.
|
||||
|
@ -83,30 +88,30 @@ Viewing devices
|
|||
|
||||
The Params section holds configured information. The Current section
|
||||
holds running statistics. The Result is printed after a run or after
|
||||
interruption. Example:
|
||||
interruption. Example::
|
||||
|
||||
/proc/net/pktgen/eth4@0
|
||||
/proc/net/pktgen/eth4@0
|
||||
|
||||
Params: count 100000 min_pkt_size: 60 max_pkt_size: 60
|
||||
frags: 0 delay: 0 clone_skb: 64 ifname: eth4@0
|
||||
flows: 0 flowlen: 0
|
||||
queue_map_min: 0 queue_map_max: 0
|
||||
dst_min: 192.168.81.2 dst_max:
|
||||
src_min: src_max:
|
||||
src_mac: 90:e2:ba:0a:56:b4 dst_mac: 00:1b:21:3c:9d:f8
|
||||
udp_src_min: 9 udp_src_max: 109 udp_dst_min: 9 udp_dst_max: 9
|
||||
src_mac_count: 0 dst_mac_count: 0
|
||||
Flags: UDPSRC_RND NO_TIMESTAMP QUEUE_MAP_CPU
|
||||
Current:
|
||||
pkts-sofar: 100000 errors: 0
|
||||
started: 623913381008us stopped: 623913396439us idle: 25us
|
||||
seq_num: 100001 cur_dst_mac_offset: 0 cur_src_mac_offset: 0
|
||||
cur_saddr: 192.168.8.3 cur_daddr: 192.168.81.2
|
||||
cur_udp_dst: 9 cur_udp_src: 42
|
||||
cur_queue_map: 0
|
||||
flows: 0
|
||||
Result: OK: 15430(c15405+d25) usec, 100000 (60byte,0frags)
|
||||
6480562pps 3110Mb/sec (3110669760bps) errors: 0
|
||||
Params: count 100000 min_pkt_size: 60 max_pkt_size: 60
|
||||
frags: 0 delay: 0 clone_skb: 64 ifname: eth4@0
|
||||
flows: 0 flowlen: 0
|
||||
queue_map_min: 0 queue_map_max: 0
|
||||
dst_min: 192.168.81.2 dst_max:
|
||||
src_min: src_max:
|
||||
src_mac: 90:e2:ba:0a:56:b4 dst_mac: 00:1b:21:3c:9d:f8
|
||||
udp_src_min: 9 udp_src_max: 109 udp_dst_min: 9 udp_dst_max: 9
|
||||
src_mac_count: 0 dst_mac_count: 0
|
||||
Flags: UDPSRC_RND NO_TIMESTAMP QUEUE_MAP_CPU
|
||||
Current:
|
||||
pkts-sofar: 100000 errors: 0
|
||||
started: 623913381008us stopped: 623913396439us idle: 25us
|
||||
seq_num: 100001 cur_dst_mac_offset: 0 cur_src_mac_offset: 0
|
||||
cur_saddr: 192.168.8.3 cur_daddr: 192.168.81.2
|
||||
cur_udp_dst: 9 cur_udp_src: 42
|
||||
cur_queue_map: 0
|
||||
flows: 0
|
||||
Result: OK: 15430(c15405+d25) usec, 100000 (60byte,0frags)
|
||||
6480562pps 3110Mb/sec (3110669760bps) errors: 0
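The numbers on the Result line can be cross-checked with simple arithmetic. The sketch below uses only values from the sample output above, and the function name is hypothetical. The figures are consistent with the bit rate being the packet rate times the 60-byte packet size times 8.

```python
def throughput_bps(pps: int, pkt_size_bytes: int) -> int:
    """Bits per second for a given packet rate and packet size."""
    return pps * pkt_size_bytes * 8


bps = throughput_bps(6480562, 60)
print(bps)               # 3110669760
print(bps // 1_000_000)  # 3110

# The total time is the sum of the two parts shown in parentheses.
print(15405 + 25)        # 15430
```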
|
||||
|
||||
|
||||
Configuring devices
|
||||
|
@ -114,11 +119,12 @@ Configuring devices
|
|||
This is done via the /proc interface, and most easily done via pgset
|
||||
as defined in the sample scripts.
|
||||
You need to specify the PGDEV environment variable to use functions from sample
|
||||
scripts, i.e.:
|
||||
export PGDEV=/proc/net/pktgen/eth4@0
|
||||
source samples/pktgen/functions.sh
|
||||
scripts, i.e.::
|
||||
|
||||
Examples:
|
||||
export PGDEV=/proc/net/pktgen/eth4@0
|
||||
source samples/pktgen/functions.sh
|
||||
|
||||
Examples::
|
||||
|
||||
pg_ctrl start starts injection.
|
||||
pg_ctrl stop aborts injection. Also, ^C aborts generator.
|
||||
|
@ -126,17 +132,17 @@ Examples:
|
|||
pgset "clone_skb 1" sets the number of copies of the same packet
|
||||
pgset "clone_skb 0" use single SKB for all transmits
|
||||
pgset "burst 8" uses xmit_more API to queue 8 copies of the same
|
||||
packet and update HW tx queue tail pointer once.
|
||||
"burst 1" is the default
|
||||
packet and update HW tx queue tail pointer once.
|
||||
"burst 1" is the default
|
||||
pgset "pkt_size 9014" sets packet size to 9014
|
||||
pgset "frags 5" packet will consist of 5 fragments
|
||||
pgset "count 200000" sets number of packets to send, set to zero
|
||||
for continuous sends until explicitly stopped.
|
||||
for continuous sends until explicitly stopped.
|
||||
|
||||
pgset "delay 5000" adds delay to hard_start_xmit(). nanoseconds
|
||||
|
||||
pgset "dst 10.0.0.1" sets IP destination address
|
||||
(BEWARE! This generator is very aggressive!)
|
||||
(BEWARE! This generator is very aggressive!)
|
||||
|
||||
pgset "dst_min 10.0.0.1" Same as dst
|
||||
pgset "dst_max 10.0.0.254" Set the maximum destination IP.
|
||||
|
@ -149,46 +155,46 @@ Examples:
|
|||
|
||||
pgset "queue_map_min 0" Sets the min value of tx queue interval
|
||||
pgset "queue_map_max 7" Sets the max value of tx queue interval, for multiqueue devices
|
||||
To select queue 1 of a given device,
|
||||
use queue_map_min=1 and queue_map_max=1
|
||||
To select queue 1 of a given device,
|
||||
use queue_map_min=1 and queue_map_max=1
|
||||
|
||||
pgset "src_mac_count 1" Sets the number of MACs we'll range through.
|
||||
The 'minimum' MAC is what you set with srcmac.
|
||||
The 'minimum' MAC is what you set with srcmac.
|
||||
|
||||
pgset "dst_mac_count 1" Sets the number of MACs we'll range through.
|
||||
The 'minimum' MAC is what you set with dstmac.
|
||||
The 'minimum' MAC is what you set with dstmac.
|
||||
|
||||
pgset "flag [name]" Set a flag to determine behaviour. Current flags
|
||||
are: IPSRC_RND # IP source is random (between min/max)
|
||||
IPDST_RND # IP destination is random
|
||||
UDPSRC_RND, UDPDST_RND,
|
||||
MACSRC_RND, MACDST_RND
|
||||
TXSIZE_RND, IPV6,
|
||||
MPLS_RND, VID_RND, SVID_RND
|
||||
FLOW_SEQ,
|
||||
QUEUE_MAP_RND # queue map random
|
||||
QUEUE_MAP_CPU # queue map mirrors smp_processor_id()
|
||||
UDPCSUM,
|
||||
IPSEC # IPsec encapsulation (needs CONFIG_XFRM)
|
||||
NODE_ALLOC # node specific memory allocation
|
||||
NO_TIMESTAMP # disable timestamping
|
||||
are: IPSRC_RND # IP source is random (between min/max)
|
||||
IPDST_RND # IP destination is random
|
||||
UDPSRC_RND, UDPDST_RND,
|
||||
MACSRC_RND, MACDST_RND
|
||||
TXSIZE_RND, IPV6,
|
||||
MPLS_RND, VID_RND, SVID_RND
|
||||
FLOW_SEQ,
|
||||
QUEUE_MAP_RND # queue map random
|
||||
QUEUE_MAP_CPU # queue map mirrors smp_processor_id()
|
||||
UDPCSUM,
|
||||
IPSEC # IPsec encapsulation (needs CONFIG_XFRM)
|
||||
NODE_ALLOC # node specific memory allocation
|
||||
NO_TIMESTAMP # disable timestamping
|
||||
pgset 'flag ![name]' Clear a flag to determine behaviour.
|
||||
Note that you might need to use single quote in
|
||||
interactive mode, so that your shell wouldn't expand
|
||||
the specified flag as a history command.
|
||||
Note that you might need to use single quote in
|
||||
interactive mode, so that your shell wouldn't expand
|
||||
the specified flag as a history command.
|
||||
|
||||
pgset "spi [SPI_VALUE]" Set specific SA used to transform packet.
|
||||
|
||||
pgset "udp_src_min 9" set UDP source port min, If < udp_src_max, then
|
||||
cycle through the port range.
|
||||
cycle through the port range.
|
||||
|
||||
pgset "udp_src_max 9" set UDP source port max.
|
||||
pgset "udp_dst_min 9" set UDP destination port min, If < udp_dst_max, then
|
||||
cycle through the port range.
|
||||
cycle through the port range.
|
||||
pgset "udp_dst_max 9" set UDP destination port max.
|
||||
|
||||
pgset "mpls 0001000a,0002000a,0000000a" set MPLS labels (in this example
|
||||
outer label=16,middle label=32,
|
||||
outer label=16,middle label=32,
|
||||
inner label=0 (IPv4 NULL)) Note that
|
||||
there must be no spaces between the
|
||||
arguments. Leading zeros are required.
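The label values quoted in the example can be recovered from the hexadecimal words. This sketch assumes the standard MPLS label stack entry layout from RFC 3032 (20-bit label, 3-bit traffic class, 1-bit bottom-of-stack flag, 8-bit TTL). The function name is hypothetical.

```python
def decode_mpls_entry(hex_word: str) -> dict:
    """Split one 32-bit label stack entry into its fields."""
    value = int(hex_word, 16)
    return {
        "label": value >> 12,      # top 20 bits
        "tc": (value >> 9) & 0x7,  # traffic class, 3 bits
        "s": (value >> 8) & 0x1,   # bottom-of-stack flag
        "ttl": value & 0xFF,       # low 8 bits
    }


words = "0001000a,0002000a,0000000a".split(",")
labels = [decode_mpls_entry(w)["label"] for w in words]
print(labels)  # [16, 32, 0]
```

Under this layout each word in the example also carries a TTL of 10 (``0x0a``).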
|
||||
|
@ -232,10 +238,14 @@ A collection of tutorial scripts and helpers for pktgen is in the
|
|||
samples/pktgen directory. The helper parameters.sh file supports easy
|
||||
and consistent parameter parsing across the sample scripts.
|
||||
|
||||
Usage example and help:
|
||||
Usage example and help::
|
||||
|
||||
./pktgen_sample01_simple.sh -i eth4 -m 00:1B:21:3C:9D:F8 -d 192.168.8.2
|
||||
|
||||
Usage: ./pktgen_sample01_simple.sh [-vx] -i ethX
|
||||
Usage:::
|
||||
|
||||
./pktgen_sample01_simple.sh [-vx] -i ethX
|
||||
|
||||
-i : ($DEV) output interface/device (required)
|
||||
-s : ($PKT_SIZE) packet size
|
||||
-d : ($DEST_IP) destination IP
|
||||
|
@ -250,13 +260,13 @@ The global variables being set are also listed. E.g. the required
|
|||
interface/device parameter "-i" sets variable $DEV. Copy the
|
||||
pktgen_sampleXX scripts and modify them to fit your own needs.
|
||||
|
||||
The old scripts:
|
||||
The old scripts::
|
||||
|
||||
pktgen.conf-1-2 # 1 CPU 2 dev
|
||||
pktgen.conf-1-1-rdos # 1 CPU 1 dev w. route DoS
|
||||
pktgen.conf-1-1-ip6 # 1 CPU 1 dev ipv6
|
||||
pktgen.conf-1-1-ip6-rdos # 1 CPU 1 dev ipv6 w. route DoS
|
||||
pktgen.conf-1-1-flows # 1 CPU 1 dev multiple flows.
|
||||
pktgen.conf-1-2 # 1 CPU 2 dev
|
||||
pktgen.conf-1-1-rdos # 1 CPU 1 dev w. route DoS
|
||||
pktgen.conf-1-1-ip6 # 1 CPU 1 dev ipv6
|
||||
pktgen.conf-1-1-ip6-rdos # 1 CPU 1 dev ipv6 w. route DoS
|
||||
pktgen.conf-1-1-flows # 1 CPU 1 dev multiple flows.
|
||||
|
||||
|
||||
Interrupt affinity
|
||||
|
@ -271,10 +281,10 @@ to the running threads CPU (directly from smp_processor_id()).
|
|||
Enable IPsec
|
||||
============
|
||||
Default IPsec transformation with ESP encapsulation plus transport mode
|
||||
can be enabled by simply setting:
|
||||
can be enabled by simply setting::
|
||||
|
||||
pgset "flag IPSEC"
|
||||
pgset "flows 1"
|
||||
pgset "flag IPSEC"
|
||||
pgset "flows 1"
|
||||
|
||||
To avoid breaking existing testbed scripts for using AH type and tunnel mode,
|
||||
you can use "pgset spi SPI_VALUE" to specify which transformation mode
|
||||
|
@ -284,115 +294,117 @@ to employ.
|
|||
Current commands and configuration options
|
||||
==========================================
|
||||
|
||||
** Pgcontrol commands:
|
||||
**Pgcontrol commands**::
|
||||
|
||||
start
|
||||
stop
|
||||
reset
|
||||
start
|
||||
stop
|
||||
reset
|
||||
|
||||
** Thread commands:
|
||||
**Thread commands**::
|
||||
|
||||
add_device
|
||||
rem_device_all
|
||||
add_device
|
||||
rem_device_all
|
||||
|
||||
|
||||
** Device commands:
|
||||
**Device commands**::
|
||||
|
||||
count
|
||||
clone_skb
|
||||
burst
|
||||
debug
|
||||
count
|
||||
clone_skb
|
||||
burst
|
||||
debug
|
||||
|
||||
frags
|
||||
delay
|
||||
frags
|
||||
delay
|
||||
|
||||
src_mac_count
|
||||
dst_mac_count
|
||||
src_mac_count
|
||||
dst_mac_count
|
||||
|
||||
pkt_size
|
||||
min_pkt_size
|
||||
max_pkt_size
|
||||
pkt_size
|
||||
min_pkt_size
|
||||
max_pkt_size
|
||||
|
||||
queue_map_min
|
||||
queue_map_max
|
||||
skb_priority
|
||||
queue_map_min
|
||||
queue_map_max
|
||||
skb_priority
|
||||
|
||||
tos (ipv4)
|
||||
traffic_class (ipv6)
|
||||
tos (ipv4)
|
||||
traffic_class (ipv6)
|
||||
|
||||
mpls
|
||||
mpls
|
||||
|
||||
udp_src_min
|
||||
udp_src_max
|
||||
udp_src_min
|
||||
udp_src_max
|
||||
|
||||
udp_dst_min
|
||||
udp_dst_max
|
||||
udp_dst_min
|
||||
udp_dst_max
|
||||
|
||||
node
|
||||
node
|
||||
|
||||
flag
|
||||
IPSRC_RND
|
||||
IPDST_RND
|
||||
UDPSRC_RND
|
||||
UDPDST_RND
|
||||
MACSRC_RND
|
||||
MACDST_RND
|
||||
TXSIZE_RND
|
||||
IPV6
|
||||
MPLS_RND
|
||||
VID_RND
|
||||
SVID_RND
|
||||
FLOW_SEQ
|
||||
QUEUE_MAP_RND
|
||||
QUEUE_MAP_CPU
|
||||
UDPCSUM
|
||||
IPSEC
|
||||
NODE_ALLOC
|
||||
NO_TIMESTAMP
|
||||
flag
|
||||
IPSRC_RND
|
||||
IPDST_RND
|
||||
UDPSRC_RND
|
||||
UDPDST_RND
|
||||
MACSRC_RND
|
||||
MACDST_RND
|
||||
TXSIZE_RND
|
||||
IPV6
|
||||
MPLS_RND
|
||||
VID_RND
|
||||
SVID_RND
|
||||
FLOW_SEQ
|
||||
QUEUE_MAP_RND
|
||||
QUEUE_MAP_CPU
|
||||
UDPCSUM
|
||||
IPSEC
|
||||
NODE_ALLOC
|
||||
NO_TIMESTAMP
|
||||
|
||||
spi (ipsec)
|
||||
spi (ipsec)
|
||||
|
||||
dst_min
|
||||
dst_max
|
||||
dst_min
|
||||
dst_max
|
||||
|
||||
src_min
|
||||
src_max
|
||||
src_min
|
||||
src_max
|
||||
|
||||
dst_mac
|
||||
src_mac
|
||||
dst_mac
|
||||
src_mac
|
||||
|
||||
clear_counters
|
||||
clear_counters
|
||||
|
||||
src6
|
||||
dst6
|
||||
dst6_max
|
||||
dst6_min
|
||||
src6
|
||||
dst6
|
||||
dst6_max
|
||||
dst6_min
|
||||
|
||||
flows
|
||||
flowlen
|
||||
flows
|
||||
flowlen
|
||||
|
||||
rate
|
||||
ratep
|
||||
rate
|
||||
ratep
|
||||
|
||||
xmit_mode <start_xmit|netif_receive>
|
||||
xmit_mode <start_xmit|netif_receive>
|
||||
|
||||
vlan_cfi
|
||||
vlan_id
|
||||
vlan_p
|
||||
vlan_cfi
|
||||
vlan_id
|
||||
vlan_p
|
||||
|
||||
svlan_cfi
|
||||
svlan_id
|
||||
svlan_p
|
||||
svlan_cfi
|
||||
svlan_id
|
||||
svlan_p
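The ``mpls`` device command in the list above takes comma-separated 32-bit hex words, such as the ``0001000a,0002000a,0000000a`` example given earlier for ``pgset "mpls ..."``. The following is a minimal sketch of how such words map to labels. It assumes the standard MPLS label stack entry layout (20-bit label, 3 traffic-class bits, 1 bottom-of-stack bit, 8-bit TTL), which this document does not spell out; ``decode_mpls_entries`` is a hypothetical helper and not part of pktgen.

```python
def decode_mpls_entries(arg):
    """Decode a comma-separated list of 8-hex-digit MPLS label stack entries.

    Assumed layout per 32-bit word: label (20 bits), traffic class (3 bits),
    bottom-of-stack flag (1 bit), TTL (8 bits).
    """
    entries = []
    for word in arg.split(","):
        # The documentation requires leading zeros and no spaces.
        if len(word) != 8:
            raise ValueError("each entry needs 8 hex digits")
        value = int(word, 16)
        entries.append({
            "label": value >> 12,          # top 20 bits
            "tc": (value >> 9) & 0x7,      # 3 traffic-class bits
            "bottom": (value >> 8) & 0x1,  # bottom-of-stack bit
            "ttl": value & 0xFF,           # low 8 bits
        })
    return entries
```

Applied to the documented example, the decoded labels are 16, 32 and 0, which agrees with the outer, middle and inner labels quoted in the text.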
|
||||
|
||||
|
||||
References:
|
||||
ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/
|
||||
ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/examples/
|
||||
|
||||
- ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/
|
||||
- ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/examples/
|
||||
|
||||
Paper from Linux-Kongress in Erlangen 2004.
|
||||
ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/pktgen_paper.pdf
|
||||
- ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/pktgen_paper.pdf
|
||||
|
||||
Thanks to:
|
||||
|
||||
Grant Grundler for testing on IA-64 and parisc, Harald Welte, Lennert Buytenhek,
|
||||
Stephen Hemminger, Andi Kleen, Dave Miller and many others.
|
||||
|
|
@ -1,4 +1,8 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
================================================
|
||||
PLIP: The Parallel Line Internet Protocol Device
|
||||
================================================
|
||||
|
||||
Donald Becker (becker@super.org)
|
||||
I.D.A. Supercomputing Research Center, Bowie MD 20715
|
||||
|
@ -83,7 +87,7 @@ When the PLIP driver is used in IRQ mode, the timeout used for triggering a
|
|||
data transfer (the maximal time the PLIP driver would allow the other side
|
||||
before announcing a timeout, when trying to handshake a transfer of some
|
||||
data) is, by default, 500usec. As IRQ delivery is more or less immediate,
|
||||
this timeout is quite sufficient.
|
||||
this timeout is quite sufficient.
|
||||
|
||||
When in IRQ-less mode, the PLIP driver polls the parallel port HZ times
|
||||
per second (where HZ is typically 100 on most platforms, and 1024 on an
|
||||
|
@ -115,7 +119,7 @@ printer "null" cable to transfer data four bits at a time using
|
|||
data bit outputs connected to status bit inputs.
|
||||
|
||||
The second data transfer method relies on both machines having
|
||||
bi-directional parallel ports, rather than output-only ``printer''
|
||||
bi-directional parallel ports, rather than output-only ``printer``
|
||||
ports. This allows byte-wide transfers and avoids reconstructing
|
||||
nibbles into bytes, leading to much faster transfers.
|
||||
|
||||
|
@ -132,7 +136,7 @@ bits with standard status register implementation.
|
|||
|
||||
A cable that implements this protocol is available commercially as a
|
||||
"Null Printer" or "Turbo Laplink" cable. It can be constructed with
|
||||
two DB-25 male connectors symmetrically connected as follows:
|
||||
two DB-25 male connectors symmetrically connected as follows::
|
||||
|
||||
STROBE output 1*
|
||||
D0->ERROR 2 - 15 15 - 2
|
||||
|
@ -146,7 +150,8 @@ two DB-25 male connectors symmetrically connected as follows:
|
|||
SLCTIN 17 - 17
|
||||
extra grounds are 18*,19*,20*,21*,22*,23*,24*
|
||||
GROUND 25 - 25
|
||||
* Do not connect these pins on either end
|
||||
|
||||
* Do not connect these pins on either end
|
||||
|
||||
If the cable you are using has a metallic shield it should be
|
||||
connected to the metallic DB-25 shell at one end only.
|
||||
|
@ -155,14 +160,14 @@ Parallel Transfer Mode 1
|
|||
========================
|
||||
|
||||
The second data transfer method relies on both machines having
|
||||
bi-directional parallel ports, rather than output-only ``printer''
|
||||
bi-directional parallel ports, rather than output-only ``printer``
|
||||
ports. This allows byte-wide transfers, and avoids reconstructing
|
||||
nibbles into bytes. This cable should not be used on unidirectional
|
||||
``printer'' (as opposed to ``parallel'') ports or when the machine
|
||||
``printer`` (as opposed to ``parallel``) ports or when the machine
|
||||
isn't configured for PLIP, as it will result in output driver
|
||||
conflicts and the (unlikely) possibility of damage.
|
||||
|
||||
The cable for this transfer mode should be constructed as follows:
|
||||
The cable for this transfer mode should be constructed as follows::
|
||||
|
||||
STROBE->BUSY 1 - 11
|
||||
D0->D0 2 - 2
|
||||
|
@ -179,7 +184,8 @@ The cable for this transfer mode should be constructed as follows:
|
|||
GND->ERROR 18 - 15
|
||||
extra grounds are 19*,20*,21*,22*,23*,24*
|
||||
GROUND 25 - 25
|
||||
* Do not connect these pins on either end
|
||||
|
||||
* Do not connect these pins on either end
|
||||
|
||||
Once again, if the cable you are using has a metallic shield it should
|
||||
be connected to the metallic DB-25 shell at one end only.
|
||||
|
@ -188,7 +194,7 @@ PLIP Mode 0 transfer protocol
|
|||
=============================
|
||||
|
||||
The PLIP driver is compatible with the "Crynwr" parallel port transfer
|
||||
standard in Mode 0. That standard specifies the following protocol:
|
||||
standard in Mode 0. That standard specifies the following protocol::
|
||||
|
||||
send header nibble '0x8'
|
||||
count-low octet
|
||||
|
@ -196,20 +202,21 @@ standard in Mode 0. That standard specifies the following protocol:
|
|||
... data octets
|
||||
checksum octet
|
||||
|
||||
Each octet is sent as
|
||||
Each octet is sent as::
|
||||
|
||||
<wait for rx. '0x1?'> <send 0x10+(octet&0x0F)>
|
||||
<wait for rx. '0x0?'> <send 0x00+((octet>>4)&0x0F)>
|
||||
|
||||
To start a transfer the transmitting machine outputs a nibble 0x08.
|
||||
That raises the ACK line, triggering an interrupt in the receiving
|
||||
machine. The receiving machine disables interrupts and raises its own ACK
|
||||
line.
|
||||
line.
|
||||
|
||||
Restated:
|
||||
Restated::
|
||||
|
||||
(OUT is bit 0-4, OUT.j is bit j from OUT. IN likewise)
|
||||
Send_Byte:
|
||||
OUT := low nibble, OUT.4 := 1
|
||||
WAIT FOR IN.4 = 1
|
||||
OUT := high nibble, OUT.4 := 0
|
||||
WAIT FOR IN.4 = 0
|
||||
(OUT is bit 0-4, OUT.j is bit j from OUT. IN likewise)
|
||||
Send_Byte:
|
||||
OUT := low nibble, OUT.4 := 1
|
||||
WAIT FOR IN.4 = 1
|
||||
OUT := high nibble, OUT.4 := 0
|
||||
WAIT FOR IN.4 = 0
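The nibble split used by Send_Byte can be illustrated with a small sketch. It covers only the value arithmetic from the "Each octet is sent as" description (low nibble with bit 4 set, then high nibble with bit 4 clear) and leaves out the handshake waits and any port I/O; ``encode_octet`` and ``decode_octet`` are hypothetical names, not driver functions.

```python
def encode_octet(octet):
    """Return the two 5-bit values sent for one octet in PLIP Mode 0."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet must fit in 8 bits")
    low = 0x10 + (octet & 0x0F)          # low nibble first, bit 4 set
    high = 0x00 + ((octet >> 4) & 0x0F)  # high nibble second, bit 4 clear
    return low, high


def decode_octet(low, high):
    """Rebuild the octet from the two values produced by encode_octet()."""
    return ((high & 0x0F) << 4) | (low & 0x0F)
```

For example, ``encode_octet(0xA7)`` gives ``(0x17, 0x0A)``, and passing those two values to ``decode_octet`` returns ``0xA7`` again.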
|
|
@ -1,8 +1,12 @@
|
|||
PPP Generic Driver and Channel Interface
|
||||
----------------------------------------
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
Paul Mackerras
|
||||
========================================
|
||||
PPP Generic Driver and Channel Interface
|
||||
========================================
|
||||
|
||||
Paul Mackerras
|
||||
paulus@samba.org
|
||||
|
||||
7 Feb 2002
|
||||
|
||||
The generic PPP driver in linux-2.4 provides an implementation of the
|
||||
|
@ -19,7 +23,7 @@ functionality which is of use in any PPP implementation, including:
|
|||
* simple packet filtering
|
||||
|
||||
For sending and receiving PPP frames, the generic PPP driver calls on
|
||||
the services of PPP `channels'. A PPP channel encapsulates a
|
||||
the services of PPP ``channels``. A PPP channel encapsulates a
|
||||
mechanism for transporting PPP frames from one machine to another. A
|
||||
PPP channel implementation can be arbitrarily complex internally but
|
||||
has a very simple interface with the generic PPP code: it merely has
|
||||
|
@ -102,7 +106,7 @@ communications medium and prepare it to do PPP. For example, with an
|
|||
async tty, this can involve setting the tty speed and modes, issuing
|
||||
modem commands, and then going through some sort of dialog with the
|
||||
remote system to invoke PPP service there. We refer to this process
|
||||
as `discovery'. Then the user-level process tells the medium to
|
||||
as ``discovery``. Then the user-level process tells the medium to
|
||||
become a PPP channel and register itself with the generic PPP layer.
|
||||
The channel then has to report the channel number assigned to it back
|
||||
to the user-level process. From that point, the PPP negotiation code
|
||||
|
@ -111,8 +115,8 @@ negotiation, accessing the channel through the /dev/ppp interface.
|
|||
|
||||
At the interface to the PPP generic layer, PPP frames are stored in
|
||||
skbuff structures and start with the two-byte PPP protocol number.
|
||||
The frame does *not* include the 0xff `address' byte or the 0x03
|
||||
`control' byte that are optionally used in async PPP. Nor is there
|
||||
The frame does *not* include the 0xff ``address`` byte or the 0x03
|
||||
``control`` byte that are optionally used in async PPP. Nor is there
|
||||
any escaping of control characters, nor are there any FCS or framing
|
||||
characters included. That is all the responsibility of the channel
|
||||
code, if it is needed for the particular medium. That is, the skbuffs
|
||||
|
@ -121,16 +125,16 @@ protocol number and the data, and the skbuffs presented to ppp_input()
|
|||
must be in the same format.
|
||||
|
||||
The channel must provide an instance of a ppp_channel struct to
|
||||
represent the channel. The channel is free to use the `private' field
|
||||
however it wishes. The channel should initialize the `mtu' and
|
||||
`hdrlen' fields before calling ppp_register_channel() and not change
|
||||
them until after ppp_unregister_channel() returns. The `mtu' field
|
||||
represent the channel. The channel is free to use the ``private`` field
|
||||
however it wishes. The channel should initialize the ``mtu`` and
|
||||
``hdrlen`` fields before calling ppp_register_channel() and not change
|
||||
them until after ppp_unregister_channel() returns. The ``mtu`` field
|
||||
represents the maximum size of the data part of the PPP frames, that
|
||||
is, it does not include the 2-byte protocol number.
|
||||
|
||||
If the channel needs some headroom in the skbuffs presented to it for
|
||||
transmission (i.e., some space free in the skbuff data area before the
|
||||
start of the PPP frame), it should set the `hdrlen' field of the
|
||||
start of the PPP frame), it should set the ``hdrlen`` field of the
|
||||
ppp_channel struct to the amount of headroom required. The generic
|
||||
PPP layer will attempt to provide that much headroom but the channel
|
||||
should still check if there is sufficient headroom and copy the skbuff
|
||||
|
@ -322,6 +326,8 @@ an interface unit are:
|
|||
interface. The argument should be a pointer to an int containing
|
||||
the new flags value. The bits in the flags value that can be set
|
||||
are:
|
||||
|
||||
================ ========================================
|
||||
SC_COMP_TCP enable transmit TCP header compression
|
||||
SC_NO_TCP_CCID disable connection-id compression for
|
||||
TCP header compression
|
||||
|
@ -335,6 +341,7 @@ an interface unit are:
|
|||
SC_MP_SHORTSEQ expect short multilink sequence
|
||||
numbers on received multilink fragments
|
||||
SC_MP_XSHORTSEQ transmit short multilink sequence nos.
|
||||
================ ========================================
|
||||
|
||||
The values of these flags are defined in <linux/ppp-ioctl.h>. Note
|
||||
that the values of the SC_MULTILINK, SC_MP_SHORTSEQ and
|
||||
|
@ -345,17 +352,20 @@ an interface unit are:
|
|||
interface unit. The argument should point to an int where the ioctl
|
||||
will store the flags value. As well as the values listed above for
|
||||
PPPIOCSFLAGS, the following bits may be set in the returned value:
|
||||
|
||||
================ =========================================
|
||||
SC_COMP_RUN CCP compressor is running
|
||||
SC_DECOMP_RUN CCP decompressor is running
|
||||
SC_DC_ERROR CCP decompressor detected non-fatal error
|
||||
SC_DC_FERROR CCP decompressor detected fatal error
|
||||
================ =========================================
|
||||
|
||||
* PPPIOCSCOMPRESS sets the parameters for packet compression or
|
||||
decompression. The argument should point to a ppp_option_data
|
||||
structure (defined in <linux/ppp-ioctl.h>), which contains a
|
||||
pointer/length pair which should describe a block of memory
|
||||
containing a CCP option specifying a compression method and its
|
||||
parameters. The ppp_option_data struct also contains a `transmit'
|
||||
parameters. The ppp_option_data struct also contains a ``transmit``
|
||||
field. If this is 0, the ioctl will affect the receive path,
|
||||
otherwise the transmit path.
|
||||
|
||||
|
@ -377,7 +387,7 @@ an interface unit are:
|
|||
ppp_idle structure (defined in <linux/ppp_defs.h>). If the
|
||||
CONFIG_PPP_FILTER option is enabled, the set of packets which reset
|
||||
the transmit and receive idle timers is restricted to those which
|
||||
pass the `active' packet filter.
|
||||
pass the ``active`` packet filter.
|
||||
Two versions of this command exist, to deal with user space
|
||||
expecting times as either 32-bit or 64-bit time_t seconds.
|
||||
|
||||
|
@ -391,31 +401,33 @@ an interface unit are:
|
|||
|
||||
* PPPIOCSNPMODE sets the network-protocol mode for a given network
|
||||
protocol. The argument should point to an npioctl struct (defined
|
||||
in <linux/ppp-ioctl.h>). The `protocol' field gives the PPP protocol
|
||||
number for the protocol to be affected, and the `mode' field
|
||||
in <linux/ppp-ioctl.h>). The ``protocol`` field gives the PPP protocol
|
||||
number for the protocol to be affected, and the ``mode`` field
|
||||
specifies what to do with packets for that protocol:
|
||||
|
||||
============= ==============================================
|
||||
NPMODE_PASS normal operation, transmit and receive packets
|
||||
NPMODE_DROP silently drop packets for this protocol
|
||||
NPMODE_ERROR drop packets and return an error on transmit
|
||||
NPMODE_QUEUE queue up packets for transmit, drop received
|
||||
packets
|
||||
============= ==============================================
|
||||
|
||||
At present NPMODE_ERROR and NPMODE_QUEUE have the same effect as
|
||||
NPMODE_DROP.
|
||||
|
||||
* PPPIOCGNPMODE returns the network-protocol mode for a given
|
||||
protocol. The argument should point to an npioctl struct with the
|
||||
`protocol' field set to the PPP protocol number for the protocol of
|
||||
interest. On return the `mode' field will be set to the network-
|
||||
``protocol`` field set to the PPP protocol number for the protocol of
|
||||
interest. On return the ``mode`` field will be set to the network-
|
||||
protocol mode for that protocol.
|
||||
|
||||
* PPPIOCSPASS and PPPIOCSACTIVE set the `pass' and `active' packet
|
||||
* PPPIOCSPASS and PPPIOCSACTIVE set the ``pass`` and ``active`` packet
|
||||
filters. These ioctls are only available if the CONFIG_PPP_FILTER
|
||||
option is selected. The argument should point to a sock_fprog
|
||||
structure (defined in <linux/filter.h>) containing the compiled BPF
|
||||
instructions for the filter. Packets are dropped if they fail the
|
||||
`pass' filter; otherwise, if they fail the `active' filter they are
|
||||
``pass`` filter; otherwise, if they fail the ``active`` filter they are
|
||||
passed but they do not reset the transmit or receive idle timer.
|
||||
|
||||
* PPPIOCSMRRU enables or disables multilink processing for received
|
|
@ -1,15 +1,21 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
============================================
|
||||
The proc/net/tcp and proc/net/tcp6 variables
|
||||
============================================
|
||||
|
||||
This document describes the interfaces /proc/net/tcp and /proc/net/tcp6.
|
||||
Note that these interfaces are deprecated in favor of tcp_diag.
|
||||
|
||||
These /proc interfaces provide information about currently active TCP
|
||||
These /proc interfaces provide information about currently active TCP
|
||||
connections, and are implemented by tcp4_seq_show() in net/ipv4/tcp_ipv4.c
|
||||
and tcp6_seq_show() in net/ipv6/tcp_ipv6.c, respectively.
|
||||
|
||||
It will first list all listening TCP sockets, and next list all established
|
||||
TCP connections. A typical entry of /proc/net/tcp would look like this (split
|
||||
up into 3 parts because of the length of the line):
|
||||
TCP connections. A typical entry of /proc/net/tcp would look like this (split
|
||||
up into 3 parts because of the length of the line)::
|
||||
|
||||
46: 010310AC:9C4C 030310AC:1770 01
|
||||
46: 010310AC:9C4C 030310AC:1770 01
|
||||
| | | | | |--> connection state
|
||||
| | | | |------> remote TCP port number
|
||||
| | | |-------------> remote IPv4 address
|
||||
|
@ -17,7 +23,7 @@ up into 3 parts because of the length of the line):
|
|||
| |---------------------------> local IPv4 address
|
||||
|----------------------------------> number of entry
|
||||
|
||||
00000150:00000000 01:00000019 00000000
|
||||
00000150:00000000 01:00000019 00000000
|
||||
| | | | |--> number of unrecovered RTO timeouts
|
||||
| | | |----------> number of jiffies until timer expires
|
||||
| | |----------------> timer_active (see below)
|
||||
|
@ -25,7 +31,7 @@ up into 3 parts because of the length of the line):
|
|||
|-------------------------------> transmit-queue
|
||||
|
||||
1000 0 54165785 4 cd1e6040 25 4 27 3 -1
|
||||
| | | | | | | | | |--> slow start size threshold,
|
||||
| | | | | | | | | |--> slow start size threshold,
|
||||
| | | | | | | | | or -1 if the threshold
|
||||
| | | | | | | | | is >= 0xFFFF
|
||||
| | | | | | | | |----> sending congestion window
|
||||
|
@ -40,9 +46,12 @@ up into 3 parts because of the length of the line):
|
|||
|---------------------------------------------> uid
|
||||
|
||||
timer_active:
|
||||
|
||||
== ================================================================
|
||||
0 no timer is pending
|
||||
1 retransmit-timer is pending
|
||||
2 another timer (e.g. delayed ack or keepalive) is pending
|
||||
3 this is a socket in TIME_WAIT state. Not all fields will contain
|
||||
3 this is a socket in TIME_WAIT state. Not all fields will contain
|
||||
data (or even exist)
|
||||
4 zero window probe timer is pending
|
||||
== ================================================================
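As an illustration of the first part of the entry shown above, the sketch below decodes the hexadecimal address, port and state fields. It assumes a little-endian host, where the 32-bit address is printed byte-swapped relative to dotted-quad order; that detail is an assumption and is not stated in this document. ``parse_endpoint`` and ``parse_first_part`` are hypothetical helpers.

```python
import socket
import struct


def parse_endpoint(field):
    """Split an 'ADDR:PORT' hex field into (dotted-quad address, port)."""
    addr_hex, port_hex = field.split(":")
    # Assumption: little-endian host, so repack the value as little-endian
    # to recover the address bytes in network order.
    packed = struct.pack("<I", int(addr_hex, 16))
    return socket.inet_ntoa(packed), int(port_hex, 16)


def parse_first_part(text):
    """Parse 'entry: local remote state' from a /proc/net/tcp line."""
    slot, local, remote, state = text.split()
    return {
        "entry": int(slot.rstrip(":")),
        "local": parse_endpoint(local),
        "remote": parse_endpoint(remote),
        "state": int(state, 16),
    }
```

With the sample line ``46: 010310AC:9C4C 030310AC:1770 01`` this yields local ``172.16.3.1`` port 40012, remote ``172.16.3.3`` port 6000, and state 1.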
|
|
@ -1,3 +1,6 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
===========================
|
||||
How to use radiotap headers
|
||||
===========================
|
||||
|
||||
|
@ -5,9 +8,9 @@ Pointer to the radiotap include file
|
|||
------------------------------------
|
||||
|
||||
Radiotap headers are variable-length and extensible; you can get most of the
|
||||
information you need to know on them from:
|
||||
information you need to know on them from::
|
||||
|
||||
./include/net/ieee80211_radiotap.h
|
||||
./include/net/ieee80211_radiotap.h
|
||||
|
||||
This document gives an overview and warns on some corner cases.
|
||||
|
||||
|
@ -21,6 +24,8 @@ of the it_present member of ieee80211_radiotap_header is set, it means that
|
|||
the header for argument index 0 (IEEE80211_RADIOTAP_TSFT) is present in the
|
||||
argument area.
|
||||
|
||||
::
|
||||
|
||||
< 8-byte ieee80211_radiotap_header >
|
||||
[ <possible argument bitmap extensions ... > ]
|
||||
[ <argument> ... ]
|
||||
|
@ -76,6 +81,8 @@ ieee80211_radiotap_header.
|
|||
Example valid radiotap header
|
||||
-----------------------------
|
||||
|
||||
::
|
||||
|
||||
0x00, 0x00, // <-- radiotap version + pad byte
|
||||
0x0b, 0x00, // <- radiotap header length
|
||||
0x04, 0x0c, 0x00, 0x00, // <-- bitmap
|
||||
|
@ -89,64 +96,64 @@ Using the Radiotap Parser
|
|||
|
||||
If you are having to parse a radiotap struct, you can radically simplify the
|
||||
job by using the radiotap parser that lives in net/wireless/radiotap.c and has
|
||||
its prototypes available in include/net/cfg80211.h. You use it like this:
|
||||
its prototypes available in include/net/cfg80211.h. You use it like this::
|
||||
|
||||
#include <net/cfg80211.h>
|
||||
#include <net/cfg80211.h>
|
||||
|
||||
/* buf points to the start of the radiotap header part */
|
||||
/* buf points to the start of the radiotap header part */
|
||||
|
||||
int MyFunction(u8 * buf, int buflen)
|
||||
{
|
||||
int pkt_rate_100kHz = 0, antenna = 0, pwr = 0;
|
||||
struct ieee80211_radiotap_iterator iterator;
|
||||
int ret = ieee80211_radiotap_iterator_init(&iterator, buf, buflen);
|
||||
int MyFunction(u8 * buf, int buflen)
|
||||
{
|
||||
int pkt_rate_100kHz = 0, antenna = 0, pwr = 0;
|
||||
struct ieee80211_radiotap_iterator iterator;
|
||||
int ret = ieee80211_radiotap_iterator_init(&iterator, buf, buflen);
|
||||
|
||||
while (!ret) {
|
||||
while (!ret) {
|
||||
|
||||
ret = ieee80211_radiotap_iterator_next(&iterator);
|
||||
ret = ieee80211_radiotap_iterator_next(&iterator);
|
||||
|
||||
if (ret)
|
||||
continue;
|
||||
if (ret)
|
||||
continue;
|
||||
|
||||
/* see if this argument is something we can use */
|
||||
/* see if this argument is something we can use */
|
||||
|
||||
switch (iterator.this_arg_index) {
|
||||
/*
|
||||
* You must take care when dereferencing iterator.this_arg
|
||||
* for multibyte types... the pointer is not aligned. Use
|
||||
* get_unaligned((type *)iterator.this_arg) to dereference
|
||||
* iterator.this_arg for type "type" safely on all arches.
|
||||
*/
|
||||
case IEEE80211_RADIOTAP_RATE:
|
||||
/* radiotap "rate" u8 is in
|
||||
* 500kbps units, eg, 0x02=1Mbps
|
||||
*/
|
||||
pkt_rate_100kHz = (*iterator.this_arg) * 5;
|
||||
break;
|
||||
switch (iterator.this_arg_index) {
|
||||
/*
|
||||
* You must take care when dereferencing iterator.this_arg
|
||||
* for multibyte types... the pointer is not aligned. Use
|
||||
* get_unaligned((type *)iterator.this_arg) to dereference
|
||||
* iterator.this_arg for type "type" safely on all arches.
|
||||
*/
|
||||
case IEEE80211_RADIOTAP_RATE:
|
||||
/* radiotap "rate" u8 is in
|
||||
* 500kbps units, eg, 0x02=1Mbps
|
||||
*/
|
||||
pkt_rate_100kHz = (*iterator.this_arg) * 5;
|
||||
break;
|
||||
|
||||
case IEEE80211_RADIOTAP_ANTENNA:
|
||||
/* radiotap uses 0 for 1st ant */
|
||||
antenna = *iterator.this_arg;
|
||||
break;
|
||||
case IEEE80211_RADIOTAP_ANTENNA:
|
||||
/* radiotap uses 0 for 1st ant */
|
||||
antenna = *iterator.this_arg;
|
||||
break;
|
||||
|
||||
case IEEE80211_RADIOTAP_DBM_TX_POWER:
|
||||
pwr = *iterator.this_arg;
|
||||
break;
|
||||
case IEEE80211_RADIOTAP_DBM_TX_POWER:
|
||||
pwr = *iterator.this_arg;
|
||||
break;
|
||||
|
||||
default:
|
||||
break;
|
||||
}
|
||||
} /* while more rt headers */
|
||||
default:
|
||||
break;
|
||||
}
|
||||
} /* while more rt headers */
|
||||
|
||||
if (ret != -ENOENT)
|
||||
return TXRX_DROP;
|
||||
if (ret != -ENOENT)
|
||||
return TXRX_DROP;
|
||||
|
||||
/* discard the radiotap header part */
|
||||
buf += iterator.max_length;
|
||||
buflen -= iterator.max_length;
|
||||
/* discard the radiotap header part */
|
||||
buf += iterator.max_length;
|
||||
buflen -= iterator.max_length;
|
||||
|
||||
...
|
||||
...
|
||||
|
||||
}
|
||||
}
|
||||
|
||||
Andy Green <andy@warmcat.com>
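For experimenting outside the kernel, the fixed 8-byte header described above can be decoded in a few lines. This is only a sketch under stated assumptions: the fields are taken to be version (u8), pad (u8), length (little-endian u16) and one present bitmap (little-endian u32), and extended bitmaps are ignored. ``parse_radiotap_header`` is a hypothetical helper and not a replacement for the in-kernel iterator.

```python
import struct


def parse_radiotap_header(buf):
    """Decode the fixed 8-byte radiotap header at the start of buf.

    Returns (version, total header length, list of set bitmap indexes).
    Bitmap extensions are not followed in this sketch.
    """
    if len(buf) < 8:
        raise ValueError("buffer shorter than the 8-byte radiotap header")
    # "<" selects little-endian with no padding: u8, u8, u16, u32 = 8 bytes
    version, _pad, length, present = struct.unpack_from("<BBHI", buf, 0)
    present_bits = [bit for bit in range(32) if present & (1 << bit)]
    return version, length, present_bits
```

Fed the first eight bytes of the example valid radiotap header (``0x00, 0x00, 0x0b, 0x00, 0x04, 0x0c, 0x00, 0x00``), it reports version 0, a header length of 11 bytes, and argument indexes 2, 10 and 11 as present.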
|
|
@ -1,6 +1,14 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
.. include:: <isonum.txt>
|
||||
|
||||
=========================
|
||||
Raylink wireless LAN card
|
||||
=========================
|
||||
|
||||
September 21, 1999
|
||||
|
||||
Copyright (c) 1998 Corey Thomas (corey@world.std.com)
|
||||
Copyright |copy| 1998 Corey Thomas (corey@world.std.com)
|
||||
|
||||
This file is the documentation for the Raylink Wireless LAN card driver for
|
||||
Linux. The Raylink wireless LAN card is a PCMCIA card which provides IEEE
|
||||
|
@ -13,7 +21,7 @@ wireless LAN cards.
|
|||
|
||||
As of kernel 2.3.18, the ray_cs driver is part of the Linux kernel
|
||||
source. My web page for the development of ray_cs is at
|
||||
http://web.ralinktech.com/ralink/Home/Support/Linux.html
|
||||
http://web.ralinktech.com/ralink/Home/Support/Linux.html
|
||||
and I can be emailed at corey@world.std.com
|
||||
|
||||
The kernel driver is based on ray_cs-1.62.tgz
|
||||
|
@ -29,6 +37,7 @@ with nondefault parameters, they can be edited in
|
|||
will find them all.
|
||||
|
||||
Information on card services is available at:
|
||||
|
||||
http://pcmcia-cs.sourceforge.net/
|
||||
|
||||
|
||||
|
@ -39,72 +48,78 @@ the driver.
|
|||
Currently, ray_cs is not part of David Hinds' card services package,
|
||||
so the following magic is required.
|
||||
|
||||
At the end of the /etc/pcmcia/config.opts file, add the line:
|
||||
source ./ray_cs.opts
|
||||
At the end of the /etc/pcmcia/config.opts file, add the line:
|
||||
source ./ray_cs.opts
|
||||
This will make card services read the ray_cs.opts file
|
||||
when starting. Create the file /etc/pcmcia/ray_cs.opts containing the
|
||||
following:
|
||||
following::
|
||||
|
||||
#### start of /etc/pcmcia/ray_cs.opts ###################
|
||||
# Configuration options for Raylink Wireless LAN PCMCIA card
|
||||
device "ray_cs"
|
||||
class "network" module "misc/ray_cs"
|
||||
#### start of /etc/pcmcia/ray_cs.opts ###################
|
||||
# Configuration options for Raylink Wireless LAN PCMCIA card
|
||||
device "ray_cs"
|
||||
class "network" module "misc/ray_cs"
|
||||
|
||||
card "RayLink PC Card WLAN Adapter"
|
||||
manfid 0x01a6, 0x0000
|
||||
bind "ray_cs"
|
||||
card "RayLink PC Card WLAN Adapter"
|
||||
manfid 0x01a6, 0x0000
|
||||
bind "ray_cs"
|
||||
|
||||
module "misc/ray_cs" opts ""
|
||||
#### end of /etc/pcmcia/ray_cs.opts #####################
|
||||
module "misc/ray_cs" opts ""
|
||||
#### end of /etc/pcmcia/ray_cs.opts #####################
|
||||
|
||||
|
||||
To join an existing network with
|
||||
different parameters, contact the network administrator for the
|
||||
different parameters, contact the network administrator for the
|
||||
configuration information, and edit /etc/pcmcia/ray_cs.opts.
|
||||
Add the parameters below between the empty quotes.
|
||||
|
||||
Parameters for ray_cs driver which may be specified in ray_cs.opts:
|
||||
|
||||
bc integer 0 = normal mode (802.11 timing)
|
||||
1 = slow down inter frame timing to allow
|
||||
operation with older breezecom access
|
||||
points.
|
||||
=============== =============== =============================================
|
||||
bc integer 0 = normal mode (802.11 timing),
|
||||
1 = slow down inter frame timing to allow
|
||||
operation with older breezecom access
|
||||
points.
|
||||
|
||||
beacon_period integer beacon period in Kilo-microseconds
|
||||
legal values = must be integer multiple
|
||||
of hop dwell
|
||||
default = 256
|
||||
beacon_period integer beacon period in Kilo-microseconds,
|
||||
|
||||
country integer 1 = USA (default)
|
||||
2 = Europe
|
||||
3 = Japan
|
||||
4 = Korea
|
||||
5 = Spain
|
||||
6 = France
|
||||
7 = Israel
|
||||
8 = Australia
|
||||
legal values = must be integer multiple
|
||||
of hop dwell
|
||||
|
||||
default = 256
|
||||
|
||||
country integer 1 = USA (default),
|
||||
2 = Europe,
|
||||
3 = Japan,
|
||||
4 = Korea,
|
||||
5 = Spain,
|
||||
6 = France,
|
||||
7 = Israel,
|
||||
8 = Australia
|
||||
|
||||
essid string ESS ID - network name to join
|
||||
|
||||
string with maximum length of 32 chars
|
||||
default value = "ADHOC_ESSID"
|
||||
|
||||
hop_dwell integer hop dwell time in Kilo-microseconds
|
||||
hop_dwell integer hop dwell time in Kilo-microseconds
|
||||
|
||||
legal values = 16,32,64,128(default),256
|
||||
|
||||
irq_mask integer linux standard 16 bit value 1bit/IRQ
|
||||
|
||||
lsb is IRQ 0, bit 1 is IRQ 1 etc.
|
||||
Used to restrict the choice of IRQs to use.
|
||||
Recommended method for controlling
|
||||
interrupts is in /etc/pcmcia/config.opts
|
||||
Recommended method for controlling
|
||||
interrupts is in /etc/pcmcia/config.opts
|
||||
|
||||
net_type integer 0 (default) = adhoc network,
|
||||
net_type integer 0 (default) = adhoc network,
|
||||
1 = infrastructure
|
||||
|
||||
phy_addr string string containing new MAC address in
|
||||
hex, must start with x eg
|
||||
x00008f123456
|
||||
|
||||
psm integer 0 = continuously active
|
||||
psm integer 0 = continuously active,
|
||||
1 = power save mode (not useful yet)
|
||||
|
||||
pc_debug integer (0-5) larger values for more verbose
|
||||
|
@ -114,14 +129,14 @@ ray_debug integer Replaced with pc_debug
|
|||
|
||||
ray_mem_speed integer defaults to 500
|
||||
|
||||
sniffer integer 0 = not sniffer (default)
|
||||
1 = sniffer which can be used to record all
|
||||
network traffic using tcpdump or similar,
|
||||
but no normal network use is allowed.
|
||||
sniffer integer 0 = not sniffer (default),
|
||||
1 = sniffer which can be used to record all
|
||||
network traffic using tcpdump or similar,
|
||||
but no normal network use is allowed.
|
||||
|
||||
translate integer 0 = no translation (encapsulate frames)
|
||||
translate integer 0 = no translation (encapsulate frames),
|
||||
1 = translation (RFC1042/802.1)
|
||||
|
||||
=============== =============== =============================================
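The irq_mask parameter in the table above is a 16 bit value with one bit per IRQ, where the lsb stands for IRQ 0. A minimal sketch in C of how such a mask could be built and checked; the helper names ``irq_bit`` and ``irq_allowed`` are hypothetical and are not part of the ray_cs driver:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mask with only the bit for IRQ number irq set.
 * Bit 0 (the lsb) is IRQ 0, bit 1 is IRQ 1, and so on up to IRQ 15. */
static uint16_t irq_bit(unsigned int irq)
{
	assert(irq < 16);	/* a 16 bit mask has room for IRQ 0..15 only */
	return (uint16_t)(1u << irq);
}

/* Hypothetical helper: non-zero if the mask permits the given IRQ. */
static int irq_allowed(uint16_t mask, unsigned int irq)
{
	return irq < 16 && (mask & irq_bit(irq)) != 0;
}
```

For instance, ``irq_bit(5) | irq_bit(10) | irq_bit(11)`` evaluates to 0x0c20, a mask that would restrict the choice to IRQs 5, 10 and 11.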
|
||||
|
||||
More on sniffer mode:
|
||||
|
||||
|
@ -136,7 +151,7 @@ package which parses the 802.11 headers.
|
|||
|
||||
Known Problems and missing features
|
||||
|
||||
Does not work with non x86
|
||||
Does not work with non x86
|
||||
|
||||
Does not work with SMP
|
||||
|
|
@ -1,3 +1,8 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
===
|
||||
RDS
|
||||
===
|
||||
|
||||
Overview
|
||||
========
|
||||
|
@ -24,36 +29,39 @@ as IB.
|
|||
The high-level semantics of RDS from the application's point of view are
|
||||
|
||||
* Addressing
|
||||
RDS uses IPv4 addresses and 16bit port numbers to identify
|
||||
the end point of a connection. All socket operations that involve
|
||||
passing addresses between kernel and user space generally
|
||||
use a struct sockaddr_in.
|
||||
|
||||
The fact that IPv4 addresses are used does not mean the underlying
|
||||
transport has to be IP-based. In fact, RDS over IB uses a
|
||||
reliable IB connection; the IP address is used exclusively to
|
||||
locate the remote node's GID (by ARPing for the given IP).
|
||||
RDS uses IPv4 addresses and 16bit port numbers to identify
|
||||
the end point of a connection. All socket operations that involve
|
||||
passing addresses between kernel and user space generally
|
||||
use a struct sockaddr_in.
|
||||
|
||||
The port space is entirely independent of UDP, TCP or any other
|
||||
protocol.
|
||||
The fact that IPv4 addresses are used does not mean the underlying
|
||||
transport has to be IP-based. In fact, RDS over IB uses a
|
||||
reliable IB connection; the IP address is used exclusively to
|
||||
locate the remote node's GID (by ARPing for the given IP).
|
||||
|
||||
The port space is entirely independent of UDP, TCP or any other
|
||||
protocol.
|
||||
|
||||
* Socket interface
|
||||
RDS sockets work *mostly* as you would expect from a BSD
|
||||
socket. The next section will cover the details. At any rate,
|
||||
all I/O is performed through the standard BSD socket API.
|
||||
Some additions like zerocopy support are implemented through
|
||||
control messages, while other extensions use the getsockopt/
|
||||
setsockopt calls.
|
||||
|
||||
Sockets must be bound before you can send or receive data.
|
||||
This is needed because binding also selects a transport and
|
||||
attaches it to the socket. Once bound, the transport assignment
|
||||
does not change. RDS will tolerate IPs moving around (eg in
|
||||
an active-active HA scenario), but only as long as the address
|
||||
doesn't move to a different transport.
|
||||
RDS sockets work *mostly* as you would expect from a BSD
|
||||
socket. The next section will cover the details. At any rate,
|
||||
all I/O is performed through the standard BSD socket API.
|
||||
Some additions like zerocopy support are implemented through
|
||||
control messages, while other extensions use the getsockopt/
|
||||
setsockopt calls.
|
||||
|
||||
Sockets must be bound before you can send or receive data.
|
||||
This is needed because binding also selects a transport and
|
||||
attaches it to the socket. Once bound, the transport assignment
|
||||
does not change. RDS will tolerate IPs moving around (eg in
|
||||
an active-active HA scenario), but only as long as the address
|
||||
doesn't move to a different transport.
|
||||
|
||||
* sysctls
|
||||
RDS supports a number of sysctls in /proc/sys/net/rds
|
||||
|
||||
RDS supports a number of sysctls in /proc/sys/net/rds
|
||||
|
||||
|
||||
Socket Interface
|
||||
|
@ -66,89 +74,88 @@ Socket Interface
|
|||
options.
|
||||
|
||||
fd = socket(PF_RDS, SOCK_SEQPACKET, 0);
|
||||
This creates a new, unbound RDS socket.
|
||||
This creates a new, unbound RDS socket.
|
||||
|
||||
setsockopt(SOL_SOCKET): send and receive buffer size
|
||||
RDS honors the send and receive buffer size socket options.
|
||||
You are not allowed to queue more than SO_SNDSIZE bytes to
|
||||
a socket. A message is queued when sendmsg is called, and
|
||||
it leaves the queue when the remote system acknowledges
|
||||
its arrival.
|
||||
RDS honors the send and receive buffer size socket options.
|
||||
You are not allowed to queue more than SO_SNDSIZE bytes to
|
||||
a socket. A message is queued when sendmsg is called, and
|
||||
it leaves the queue when the remote system acknowledges
|
||||
its arrival.
|
||||
|
||||
The SO_RCVSIZE option controls the maximum receive queue length.
|
||||
This is a soft limit rather than a hard limit - RDS will
|
||||
continue to accept and queue incoming messages, even if that
|
||||
takes the queue length over the limit. However, it will also
|
||||
mark the port as "congested" and send a congestion update to
|
||||
the source node. The source node is supposed to throttle any
|
||||
processes sending to this congested port.
|
||||
The SO_RCVSIZE option controls the maximum receive queue length.
|
||||
This is a soft limit rather than a hard limit - RDS will
|
||||
continue to accept and queue incoming messages, even if that
|
||||
takes the queue length over the limit. However, it will also
|
||||
mark the port as "congested" and send a congestion update to
|
||||
the source node. The source node is supposed to throttle any
|
||||
processes sending to this congested port.
|
||||
|
||||
bind(fd, &sockaddr_in, ...)
|
||||
This binds the socket to a local IP address and port, and a
|
||||
transport, if one has not already been selected via the
|
||||
This binds the socket to a local IP address and port, and a
|
||||
transport, if one has not already been selected via the
|
||||
SO_RDS_TRANSPORT socket option
|
||||
|
||||
sendmsg(fd, ...)
|
||||
Sends a message to the indicated recipient. The kernel will
|
||||
transparently establish the underlying reliable connection
|
||||
if it isn't up yet.
|
||||
Sends a message to the indicated recipient. The kernel will
|
||||
transparently establish the underlying reliable connection
|
||||
if it isn't up yet.
|
||||
|
||||
An attempt to send a message that exceeds SO_SNDSIZE will
|
||||
return with -EMSGSIZE
|
||||
An attempt to send a message that exceeds SO_SNDSIZE will
|
||||
return with -EMSGSIZE
|
||||
|
||||
An attempt to send a message that would take the total number
|
||||
of queued bytes over the SO_SNDSIZE threshold will return
|
||||
EAGAIN.
|
||||
An attempt to send a message that would take the total number
|
||||
of queued bytes over the SO_SNDSIZE threshold will return
|
||||
EAGAIN.
|
||||
|
||||
An attempt to send a message to a destination that is marked
|
||||
as "congested" will return ENOBUFS.
|
||||
An attempt to send a message to a destination that is marked
|
||||
as "congested" will return ENOBUFS.
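The three sendmsg error cases above can be modelled as a small decision function. This is only an illustrative sketch, not the kernel's code: the function name, the use of positive errno values, and the order in which the conditions are tested are all assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Illustrative model of the sendmsg() checks described above.
 *   msg_len        - size of the message being sent
 *   queued         - bytes already queued on the socket
 *   sndbuf         - the send buffer limit (SO_SNDSIZE in the text)
 *   dest_congested - non-zero if the destination is marked "congested"
 * Returns 0 if the message would be queued, otherwise an errno value.
 */
static int ex_rds_send_check(size_t msg_len, size_t queued,
			     size_t sndbuf, int dest_congested)
{
	/* the text says no more than sndbuf bytes may ever be queued */
	assert(queued <= sndbuf);

	if (msg_len > sndbuf)
		return EMSGSIZE;	/* can never fit, even on an empty queue */
	if (dest_congested)
		return ENOBUFS;		/* receiver asked senders to back off */
	if (msg_len > sndbuf - queued)
		return EAGAIN;		/* would fit once the queue drains */
	return 0;
}
```

An application would typically retry after EAGAIN once poll reports POLLOUT, but has to wait for a congestion notification after ENOBUFS.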
|
||||
|
||||
recvmsg(fd, ...)
|
||||
Receives a message that was queued to this socket. The socket's
|
||||
recv queue accounting is adjusted, and if the queue length
|
||||
drops below SO_RCVSIZE, the port is marked uncongested, and
|
||||
a congestion update is sent to all peers.
|
||||
Receives a message that was queued to this socket. The socket's
|
||||
recv queue accounting is adjusted, and if the queue length
|
||||
drops below SO_RCVSIZE, the port is marked uncongested, and
|
||||
a congestion update is sent to all peers.
|
||||
|
||||
Applications can ask the RDS kernel module to receive
|
||||
notifications via control messages (for instance, there is a
|
||||
notification when a congestion update arrived, or when a RDMA
|
||||
operation completes). These notifications are received through
|
||||
the msg.msg_control buffer of struct msghdr. The format of the
|
||||
messages is described in manpages.
|
||||
Applications can ask the RDS kernel module to receive
|
||||
notifications via control messages (for instance, there is a
|
||||
notification when a congestion update arrived, or when a RDMA
|
||||
operation completes). These notifications are received through
|
||||
the msg.msg_control buffer of struct msghdr. The format of the
|
||||
messages is described in manpages.
|
||||
|
||||
poll(fd)
|
||||
RDS supports the poll interface to allow the application
|
||||
to implement async I/O.
|
||||
RDS supports the poll interface to allow the application
|
||||
to implement async I/O.
|
||||
|
||||
POLLIN handling is pretty straightforward. When there's an
|
||||
incoming message queued to the socket, or a pending notification,
|
||||
we signal POLLIN.
|
||||
POLLIN handling is pretty straightforward. When there's an
|
||||
incoming message queued to the socket, or a pending notification,
|
||||
we signal POLLIN.
|
||||
|
||||
POLLOUT is a little harder. Since you can essentially send
|
||||
to any destination, RDS will always signal POLLOUT as long as
|
||||
there's room on the send queue (ie the number of bytes queued
|
||||
is less than the sendbuf size).
|
||||
POLLOUT is a little harder. Since you can essentially send
|
||||
to any destination, RDS will always signal POLLOUT as long as
|
||||
there's room on the send queue (ie the number of bytes queued
|
||||
is less than the sendbuf size).
|
||||
|
||||
However, the kernel will refuse to accept messages to
|
||||
a destination marked congested - in this case you will loop
|
||||
forever if you rely on poll to tell you what to do.
|
||||
This isn't a trivial problem, but applications can deal with
|
||||
this - by using congestion notifications, and by checking for
|
||||
ENOBUFS errors returned by sendmsg.
|
||||
However, the kernel will refuse to accept messages to
|
||||
a destination marked congested - in this case you will loop
|
||||
forever if you rely on poll to tell you what to do.
|
||||
This isn't a trivial problem, but applications can deal with
|
||||
this - by using congestion notifications, and by checking for
|
||||
ENOBUFS errors returned by sendmsg.
|
||||
|
||||
setsockopt(SOL_RDS, RDS_CANCEL_SENT_TO, &sockaddr_in)
|
||||
This allows the application to discard all messages queued to a
|
||||
specific destination on this particular socket.
|
||||
This allows the application to discard all messages queued to a
|
||||
specific destination on this particular socket.
|
||||
|
||||
This allows the application to cancel outstanding messages if
|
||||
it detects a timeout. For instance, if it tried to send a message,
|
||||
and the remote host is unreachable, RDS will keep trying forever.
|
||||
The application may decide it's not worth it, and cancel the
|
||||
operation. In this case, it would use RDS_CANCEL_SENT_TO to
|
||||
nuke any pending messages.
|
||||
This allows the application to cancel outstanding messages if
|
||||
it detects a timeout. For instance, if it tried to send a message,
|
||||
and the remote host is unreachable, RDS will keep trying forever.
|
||||
The application may decide it's not worth it, and cancel the
|
||||
operation. In this case, it would use RDS_CANCEL_SENT_TO to
|
||||
nuke any pending messages.
|
||||
|
||||
setsockopt(fd, SOL_RDS, SO_RDS_TRANSPORT, (int *)&transport ..)
|
||||
getsockopt(fd, SOL_RDS, SO_RDS_TRANSPORT, (int *)&transport ..)
|
||||
``setsockopt(fd, SOL_RDS, SO_RDS_TRANSPORT, (int *)&transport ..), getsockopt(fd, SOL_RDS, SO_RDS_TRANSPORT, (int *)&transport ..)``
|
||||
Set or read an integer defining the underlying
|
||||
encapsulating transport to be used for RDS packets on the
|
||||
socket. When setting the option, integer argument may be
|
||||
|
@ -180,32 +187,39 @@ RDS Protocol
|
|||
Message header
|
||||
|
||||
The message header is a 'struct rds_header' (see rds.h):
|
||||
|
||||
Fields:
|
||||
|
||||
h_sequence:
|
||||
per-packet sequence number
|
||||
per-packet sequence number
|
||||
h_ack:
|
||||
piggybacked acknowledgment of last packet received
|
||||
piggybacked acknowledgment of last packet received
|
||||
h_len:
|
||||
length of data, not including header
|
||||
length of data, not including header
|
||||
h_sport:
|
||||
source port
|
||||
source port
|
||||
h_dport:
|
||||
destination port
|
||||
destination port
|
||||
h_flags:
|
||||
CONG_BITMAP - this is a congestion update bitmap
|
||||
ACK_REQUIRED - receiver must ack this packet
|
||||
RETRANSMITTED - packet has previously been sent
|
||||
Can be:
|
||||
|
||||
============= ==================================
|
||||
CONG_BITMAP this is a congestion update bitmap
|
||||
ACK_REQUIRED receiver must ack this packet
|
||||
RETRANSMITTED packet has previously been sent
|
||||
============= ==================================
|
||||
|
||||
h_credit:
|
||||
indicate to other end of connection that
|
||||
it has more credits available (i.e. there is
|
||||
more send room)
|
||||
indicate to other end of connection that
|
||||
it has more credits available (i.e. there is
|
||||
more send room)
|
||||
h_padding[4]:
|
||||
unused, for future use
|
||||
unused, for future use
|
||||
h_csum:
|
||||
header checksum
|
||||
header checksum
|
||||
h_exthdr:
|
||||
optional data can be passed here. This is currently used for
|
||||
passing RDMA-related information.
|
||||
optional data can be passed here. This is currently used for
|
||||
passing RDMA-related information.
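The field list above could be mirrored in userspace roughly as follows. This is a sketch for illustration only: the field widths, the numeric flag values and the size of h_exthdr are assumptions, and every ``ex_``/``EX_`` name is hypothetical. The authoritative layout is the 'struct rds_header' in rds.h.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bits for h_flags. Names follow the text; values are assumed. */
#define EX_FLAG_CONG_BITMAP	0x01	/* this is a congestion update bitmap */
#define EX_FLAG_ACK_REQUIRED	0x02	/* receiver must ack this packet */
#define EX_FLAG_RETRANSMITTED	0x04	/* packet has previously been sent */
#define EX_FLAG_ALL		0x07

#define EX_HEADER_EXT_SPACE	16	/* assumed size of h_exthdr */

/* Userspace mirror of the listed fields; widths are assumptions. */
struct ex_rds_header {
	uint64_t h_sequence;	/* per-packet sequence number */
	uint64_t h_ack;		/* piggybacked ack of last packet received */
	uint32_t h_len;		/* length of data, not including header */
	uint16_t h_sport;	/* source port */
	uint16_t h_dport;	/* destination port */
	uint8_t  h_flags;	/* EX_FLAG_* bits */
	uint8_t  h_credit;	/* send credits granted to the peer */
	uint8_t  h_padding[4];	/* unused, for future use */
	uint16_t h_csum;	/* header checksum */
	uint8_t  h_exthdr[EX_HEADER_EXT_SPACE];	/* optional extension data */
};

/* Set one or more flag bits, rejecting bits this sketch does not know. */
static void ex_set_flags(struct ex_rds_header *h, uint8_t flags)
{
	assert((flags & ~EX_FLAG_ALL) == 0);
	h->h_flags |= flags;
}

static int ex_needs_ack(const struct ex_rds_header *h)
{
	return (h->h_flags & EX_FLAG_ACK_REQUIRED) != 0;
}

static int ex_is_cong_update(const struct ex_rds_header *h)
{
	return (h->h_flags & EX_FLAG_CONG_BITMAP) != 0;
}
```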
|
||||
|
||||
ACK and retransmit handling
|
||||
|
||||
|
@ -260,7 +274,7 @@ RDS Protocol
|
|||
|
||||
|
||||
RDS Transport Layer
|
||||
==================
|
||||
===================
|
||||
|
||||
As mentioned above, RDS is not IB-specific. Its code is divided
|
||||
into a general RDS layer and a transport layer.
|
||||
|
@ -281,19 +295,25 @@ RDS Kernel Structures
|
|||
be sent and sets header fields as needed, based on the socket API.
|
||||
This is then queued for the individual connection and sent by the
|
||||
connection's transport.
|
||||
|
||||
struct rds_incoming
|
||||
a generic struct referring to incoming data that can be handed from
|
||||
the transport to the general code and queued by the general code
|
||||
while the socket is awoken. It is then passed back to the transport
|
||||
code to handle the actual copy-to-user.
|
||||
|
||||
struct rds_socket
|
||||
per-socket information
|
||||
|
||||
struct rds_connection
|
||||
per-connection information
|
||||
|
||||
struct rds_transport
|
||||
pointers to transport-specific functions
|
||||
|
||||
struct rds_statistics
|
||||
non-transport-specific statistics
|
||||
|
||||
struct rds_cong_map
|
||||
wraps the raw congestion bitmap, contains rbnode, waitq, etc.
|
||||
|
||||
|
@ -317,53 +337,58 @@ The send path
|
|||
=============
|
||||
|
||||
rds_sendmsg()
|
||||
struct rds_message built from incoming data
|
||||
CMSGs parsed (e.g. RDMA ops)
|
||||
transport connection alloced and connected if not already
|
||||
rds_message placed on send queue
|
||||
send worker awoken
|
||||
- struct rds_message built from incoming data
|
||||
- CMSGs parsed (e.g. RDMA ops)
|
||||
- transport connection alloced and connected if not already
|
||||
- rds_message placed on send queue
|
||||
- send worker awoken
|
||||
|
||||
rds_send_worker()
|
||||
calls rds_send_xmit() until queue is empty
|
||||
- calls rds_send_xmit() until queue is empty
|
||||
|
||||
rds_send_xmit()
|
||||
transmits congestion map if one is pending
|
||||
may set ACK_REQUIRED
|
||||
calls transport to send either non-RDMA or RDMA message
|
||||
(RDMA ops never retransmitted)
|
||||
- transmits congestion map if one is pending
|
||||
- may set ACK_REQUIRED
|
||||
- calls transport to send either non-RDMA or RDMA message
|
||||
(RDMA ops never retransmitted)
|
||||
|
||||
rds_ib_xmit()
|
||||
allocs work requests from send ring
|
||||
adds any new send credits available to peer (h_credits)
|
||||
maps the rds_message's sg list
|
||||
piggybacks ack
|
||||
populates work requests
|
||||
post send to connection's queue pair
|
||||
- allocs work requests from send ring
|
||||
- adds any new send credits available to peer (h_credits)
|
||||
- maps the rds_message's sg list
|
||||
- piggybacks ack
|
||||
- populates work requests
|
||||
- post send to connection's queue pair
|
||||
|
||||
The recv path
|
||||
=============
|
||||
|
||||
rds_ib_recv_cq_comp_handler()
|
||||
looks at write completions
|
||||
unmaps recv buffer from device
|
||||
no errors, call rds_ib_process_recv()
|
||||
refill recv ring
|
||||
- looks at write completions
|
||||
- unmaps recv buffer from device
|
||||
- no errors, call rds_ib_process_recv()
|
||||
- refill recv ring
|
||||
|
||||
rds_ib_process_recv()
|
||||
validate header checksum
|
||||
copy header to rds_ib_incoming struct if start of a new datagram
|
||||
add to ibinc's fraglist
|
||||
if completed datagram:
|
||||
update cong map if datagram was cong update
|
||||
call rds_recv_incoming() otherwise
|
||||
note if ack is required
|
||||
- validate header checksum
|
||||
- copy header to rds_ib_incoming struct if start of a new datagram
|
||||
- add to ibinc's fraglist
|
||||
- if completed datagram:
|
||||
- update cong map if datagram was cong update
|
||||
- call rds_recv_incoming() otherwise
|
||||
- note if ack is required
|
||||
|
||||
rds_recv_incoming()
|
||||
drop duplicate packets
|
||||
respond to pings
|
||||
find the sock associated with this datagram
|
||||
add to sock queue
|
||||
wake up sock
|
||||
do some congestion calculations
|
||||
- drop duplicate packets
|
||||
- respond to pings
|
||||
- find the sock associated with this datagram
|
||||
- add to sock queue
|
||||
- wake up sock
|
||||
- do some congestion calculations
|
||||
rds_recvmsg
|
||||
copy data into user iovec
|
||||
handle CMSGs
|
||||
return to application
|
||||
- copy data into user iovec
|
||||
- handle CMSGs
|
||||
- return to application
|
||||
|
||||
Multipath RDS (mprds)
|
||||
=====================
|
|
@ -1,5 +1,8 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
=======================================
|
||||
Linux wireless regulatory documentation
|
||||
---------------------------------------
|
||||
=======================================
|
||||
|
||||
This document gives a brief review over how the Linux wireless
|
||||
regulatory infrastructure works.
|
||||
|
@ -57,7 +60,7 @@ Users can use iw:
|
|||
|
||||
http://wireless.kernel.org/en/users/Documentation/iw
|
||||
|
||||
An example:
|
||||
An example::
|
||||
|
||||
# set regulatory domain to "Costa Rica"
|
||||
iw reg set CR
|
||||
|
@ -104,9 +107,9 @@ Example code - drivers hinting an alpha2:
|
|||
|
||||
This example comes from the zd1211rw device driver. You can start
|
||||
by having a mapping of your device's EEPROM country/regulatory
|
||||
domain value to a specific alpha2 as follows:
|
||||
domain value to a specific alpha2 as follows::
|
||||
|
||||
static struct zd_reg_alpha2_map reg_alpha2_map[] = {
|
||||
static struct zd_reg_alpha2_map reg_alpha2_map[] = {
|
||||
{ ZD_REGDOMAIN_FCC, "US" },
|
||||
{ ZD_REGDOMAIN_IC, "CA" },
|
||||
{ ZD_REGDOMAIN_ETSI, "DE" }, /* Generic ETSI, use most restrictive */
|
||||
|
@ -116,10 +119,10 @@ static struct zd_reg_alpha2_map reg_alpha2_map[] = {
|
|||
{ ZD_REGDOMAIN_FRANCE, "FR" },
|
||||
|
||||
Then you can define a routine to map your read EEPROM value to an alpha2,
|
||||
as follows:
|
||||
as follows::
|
||||
|
||||
static int zd_reg2alpha2(u8 regdomain, char *alpha2)
|
||||
{
|
||||
static int zd_reg2alpha2(u8 regdomain, char *alpha2)
|
||||
{
|
||||
unsigned int i;
|
||||
struct zd_reg_alpha2_map *reg_map;
|
||||
for (i = 0; i < ARRAY_SIZE(reg_alpha2_map); i++) {
|
||||
|
@ -131,12 +134,14 @@ static int zd_reg2alpha2(u8 regdomain, char *alpha2)
|
|||
}
|
||||
}
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
|
||||
Lastly, you can then hint to the core of your discovered alpha2, if a match
|
||||
was found. You need to do this after you have registered your wiphy. You
|
||||
are expected to do this during initialization.
|
||||
|
||||
::
|
||||
|
||||
r = zd_reg2alpha2(mac->regdomain, alpha2);
|
||||
if (!r)
|
||||
regulatory_hint(hw->wiphy, alpha2);
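The mapping and lookup steps quoted from zd1211rw can be tried outside the kernel with a self-contained sketch such as the one below. The table contents, the numeric codes and all ``ex_`` names are hypothetical stand-ins; only the overall shape (a table scan that returns 0 on a match and 1 otherwise) follows the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical EEPROM regulatory code to alpha2 mapping entry. */
struct ex_reg_alpha2_map {
	uint32_t reg;
	char alpha2[3];		/* two letters plus the terminating NUL */
};

/* Made-up codes, for illustration only. */
static const struct ex_reg_alpha2_map ex_map[] = {
	{ 0x10, "US" },
	{ 0x20, "CA" },
	{ 0x30, "DE" },
};

/*
 * Look up regdomain in the table. On a match, copy the alpha2 string
 * (including the NUL) into alpha2, which must hold at least 3 bytes,
 * and return 0. Return 1 if there is no match.
 */
static int ex_reg2alpha2(uint8_t regdomain, char *alpha2)
{
	size_t i;

	assert(alpha2 != NULL);
	for (i = 0; i < sizeof(ex_map) / sizeof(ex_map[0]); i++) {
		if (ex_map[i].reg == regdomain) {
			memcpy(alpha2, ex_map[i].alpha2,
			       sizeof(ex_map[i].alpha2));
			return 0;
		}
	}
	return 1;
}
```

A driver would pass the resulting alpha2 to regulatory_hint() only when the lookup returned 0, as in the snippet above.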
|
||||
|
@ -156,9 +161,9 @@ call regulatory_hint() with the regulatory domain structure in it.
|
|||
Below is a simple example, with a regulatory domain cached using the stack.
|
||||
Your implementation may vary (read EEPROM cache instead, for example).
|
||||
|
||||
Example cache of some regulatory domain
|
||||
Example cache of some regulatory domain::
|
||||
|
||||
struct ieee80211_regdomain mydriver_jp_regdom = {
|
||||
struct ieee80211_regdomain mydriver_jp_regdom = {
|
||||
.n_reg_rules = 3,
|
||||
.alpha2 = "JP",
|
||||
//.alpha2 = "99", /* If I have no alpha2 to map it to */
|
||||
|
@ -173,9 +178,9 @@ struct ieee80211_regdomain mydriver_jp_regdom = {
|
|||
NL80211_RRF_NO_IR|
|
||||
NL80211_RRF_DFS),
|
||||
}
|
||||
};
|
||||
};
|
||||
|
||||
Then in some part of your code after your wiphy has been registered:
|
||||
Then in some part of your code after your wiphy has been registered::
|
||||
|
||||
struct ieee80211_regdomain *rd;
|
||||
int size_of_regd;
|
|
@ -1,6 +1,8 @@
|
|||
======================
|
||||
RxRPC NETWORK PROTOCOL
|
||||
======================
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
======================
|
||||
RxRPC Network Protocol
|
||||
======================
|
||||
|
||||
The RxRPC protocol driver provides a reliable two-phase transport on top of UDP
|
||||
that can be used to perform RxRPC remote operations. This is done over sockets
|
||||
|
@ -9,36 +11,35 @@ receive data, aborts and errors.
|
|||
|
||||
Contents of this document:
|
||||
|
||||
(*) Overview.
|
||||
(#) Overview.
|
||||
|
||||
(*) RxRPC protocol summary.
|
||||
(#) RxRPC protocol summary.
|
||||
|
||||
(*) AF_RXRPC driver model.
|
||||
(#) AF_RXRPC driver model.
|
||||
|
||||
(*) Control messages.
|
||||
(#) Control messages.
|
||||
|
||||
(*) Socket options.
|
||||
(#) Socket options.
|
||||
|
||||
(*) Security.
|
||||
(#) Security.
|
||||
|
||||
(*) Example client usage.
|
||||
(#) Example client usage.
|
||||
|
||||
(*) Example server usage.
|
||||
(#) Example server usage.
|
||||
|
||||
(*) AF_RXRPC kernel interface.
|
||||
(#) AF_RXRPC kernel interface.
|
||||
|
||||
(*) Configurable parameters.
|
||||
(#) Configurable parameters.
|
||||
|
||||
|
||||
========
|
||||
OVERVIEW
|
||||
Overview
|
||||
========
|
||||
|
||||
RxRPC is a two-layer protocol. There is a session layer which provides
|
||||
reliable virtual connections using UDP over IPv4 (or IPv6) as the transport
|
||||
layer, but implements a real network protocol; and there's the presentation
|
||||
layer which renders structured data to binary blobs and back again using XDR
|
||||
(as does SunRPC):
|
||||
(as does SunRPC)::
|
||||
|
||||
+-------------+
|
||||
| Application |
|
||||
|
@ -85,31 +86,30 @@ The Andrew File System (AFS) is an example of an application that uses this and
|
|||
that has both kernel (filesystem) and userspace (utility) components.
|
||||
|
||||
|
||||
======================
|
||||
RXRPC PROTOCOL SUMMARY
|
||||
RxRPC Protocol Summary
|
||||
======================
|
||||
|
||||
An overview of the RxRPC protocol:
|
||||
|
||||
(*) RxRPC sits on top of another networking protocol (UDP is the only option
|
||||
(#) RxRPC sits on top of another networking protocol (UDP is the only option
|
||||
currently), and uses this to provide network transport. UDP ports, for
|
||||
example, provide transport endpoints.
|
||||
|
||||
(*) RxRPC supports multiple virtual "connections" from any given transport
|
||||
(#) RxRPC supports multiple virtual "connections" from any given transport
|
||||
endpoint, thus allowing the endpoints to be shared, even to the same
|
||||
remote endpoint.
|
||||
|
||||
(*) Each connection goes to a particular "service". A connection may not go
|
||||
(#) Each connection goes to a particular "service". A connection may not go
|
||||
to multiple services. A service may be considered the RxRPC equivalent of
|
||||
a port number. AF_RXRPC permits multiple services to share an endpoint.
|
||||
|
||||
(*) Client-originating packets are marked, thus a transport endpoint can be
|
||||
(#) Client-originating packets are marked, thus a transport endpoint can be
|
||||
shared between client and server connections (connections have a
|
||||
direction).
|
||||
|
||||
(*) Up to a billion connections may be supported concurrently between one
|
||||
(#) Up to a billion connections may be supported concurrently between one
|
||||
local transport endpoint and one service on one remote endpoint. An RxRPC
|
||||
connection is described by seven numbers:
|
||||
connection is described by seven numbers::
|
||||
|
||||
Local address }
|
||||
Local port } Transport (UDP) address
|
||||
|
@ -119,22 +119,22 @@ An overview of the RxRPC protocol:
|
|||
Connection ID
|
||||
Service ID
|
||||
|
||||
(*) Each RxRPC operation is a "call". A connection may make up to four
|
||||
(#) Each RxRPC operation is a "call". A connection may make up to four
|
||||
billion calls, but only up to four calls may be in progress on a
|
||||
connection at any one time.
|
||||
|
||||
(*) Calls are two-phase and asymmetric: the client sends its request data,
|
||||
(#) Calls are two-phase and asymmetric: the client sends its request data,
|
||||
which the service receives; then the service transmits the reply data
|
||||
which the client receives.
|
||||
|
||||
(*) The data blobs are of indefinite size, the end of a phase is marked with a
|
||||
(#) The data blobs are of indefinite size, the end of a phase is marked with a
|
||||
flag in the packet. The number of packets of data making up one blob may
|
||||
not exceed 4 billion, however, as this would cause the sequence number to
|
||||
wrap.
|
||||
|
||||
(*) The first four bytes of the request data are the service operation ID.
|
||||
(#) The first four bytes of the request data are the service operation ID.
|
||||
|
||||
(*) Security is negotiated on a per-connection basis. The connection is
|
||||
(#) Security is negotiated on a per-connection basis. The connection is
|
||||
initiated by the first data packet on it arriving. If security is
|
||||
requested, the server then issues a "challenge" and then the client
|
||||
replies with a "response". If the response is successful, the security is
|
||||
|
@ -143,146 +143,145 @@ An overview of the RxRPC protocol:
|
|||
connection lapse before the client, the security will be renegotiated if
|
||||
the client uses the connection again.
|
||||
|
||||
(*) Calls use ACK packets to handle reliability. Data packets are also
|
||||
(#) Calls use ACK packets to handle reliability. Data packets are also
|
||||
explicitly sequenced per call.
|
||||
|
||||
(*) There are two types of positive acknowledgment: hard-ACKs and soft-ACKs.
|
||||
(#) There are two types of positive acknowledgment: hard-ACKs and soft-ACKs.
|
||||
A hard-ACK indicates to the far side that all the data received to a point
|
||||
has been received and processed; a soft-ACK indicates that the data has
|
||||
been received but may yet be discarded and re-requested. The sender may
|
||||
not discard any transmittable packets until they've been hard-ACK'd.
|
||||
|
||||
(*) Reception of a reply data packet implicitly hard-ACK's all the data
|
||||
(#) Reception of a reply data packet implicitly hard-ACK's all the data
|
||||
packets that make up the request.
|
||||
|
||||
(*) A call is complete when the request has been sent, the reply has been
|
||||
(#) A call is complete when the request has been sent, the reply has been
|
||||
received and the final hard-ACK on the last packet of the reply has
|
||||
reached the server.
|
||||
|
||||
(*) A call may be aborted by either end at any time up to its completion.
|
||||
(#) A call may be aborted by either end at any time up to its completion.
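The difference between hard-ACKs and soft-ACKs in the summary above can be illustrated with a small sender-side model. This is a sketch under stated assumptions, not the AF_RXRPC implementation: the structure, the ``ex_`` function names and the choice of starting sequence numbers at 1 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sender-side state for one phase of a call. */
struct ex_tx_window {
	uint32_t next_seq;	/* sequence number for the next data packet */
	uint32_t hard_acked;	/* highest sequence number hard-ACK'd so far */
};

static void ex_tx_init(struct ex_tx_window *w)
{
	w->next_seq = 1;	/* assumption: sequence numbers start at 1 */
	w->hard_acked = 0;
}

/* Transmit one packet and return the sequence number it was given. */
static uint32_t ex_tx_send(struct ex_tx_window *w)
{
	return w->next_seq++;
}

/* Packets the sender must keep because they are not hard-ACK'd yet. */
static uint32_t ex_tx_retained(const struct ex_tx_window *w)
{
	assert(w->hard_acked < w->next_seq);
	return w->next_seq - 1 - w->hard_acked;
}

/* A soft-ACK frees nothing: the data may yet be discarded and re-requested. */
static void ex_tx_soft_ack(struct ex_tx_window *w, uint32_t seq)
{
	(void)w;
	(void)seq;
}

/* A hard-ACK covers all data up to and including seq. */
static void ex_tx_hard_ack(struct ex_tx_window *w, uint32_t seq)
{
	if (seq > w->hard_acked && seq < w->next_seq)
		w->hard_acked = seq;
}
```

In this model, after three packets are sent a soft-ACK of packet 3 still leaves all three retained, while a hard-ACK of packet 2 lets the sender drop packets 1 and 2.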
|
||||
|
||||
|
||||
=====================
|
||||
AF_RXRPC DRIVER MODEL
|
||||
AF_RXRPC Driver Model
|
||||
=====================
|
||||
|
||||
About the AF_RXRPC driver:
|
||||
|
||||
(*) The AF_RXRPC protocol transparently uses internal sockets of the transport
|
||||
(#) The AF_RXRPC protocol transparently uses internal sockets of the transport
|
||||
protocol to represent transport endpoints.
|
||||
|
||||
(*) AF_RXRPC sockets map onto RxRPC connection bundles. Actual RxRPC
|
||||
(#) AF_RXRPC sockets map onto RxRPC connection bundles. Actual RxRPC
|
||||
connections are handled transparently. One client socket may be used to
|
||||
make multiple simultaneous calls to the same service. One server socket
|
||||
may handle calls from many clients.
|
||||
|
||||
(*) Additional parallel client connections will be initiated to support extra
|
||||
(#) Additional parallel client connections will be initiated to support extra
|
||||
concurrent calls, up to a tunable limit.
|
||||
|
||||
(*) Each connection is retained for a certain amount of time [tunable] after
|
||||
(#) Each connection is retained for a certain amount of time [tunable] after
|
||||
the last call currently using it has completed in case a new call is made
|
||||
that could reuse it.
|
||||
|
||||
(*) Each internal UDP socket is retained [tunable] for a certain amount of
|
||||
(#) Each internal UDP socket is retained [tunable] for a certain amount of
|
||||
time [tunable] after the last connection using it is discarded, in case a new
|
||||
connection is made that could use it.
|
||||
|
||||
(*) A client-side connection is only shared between calls if they have
|
||||
(#) A client-side connection is only shared between calls if they have
|
||||
the same key struct describing their security (and assuming the calls
|
||||
would otherwise share the connection). Non-secured calls would also be
|
||||
able to share connections with each other.
|
||||
|
||||
(*) A server-side connection is shared if the client says it is.
|
||||
(#) A server-side connection is shared if the client says it is.
|
||||
|
||||
(*) ACK'ing is handled by the protocol driver automatically, including ping
|
||||
(#) ACK'ing is handled by the protocol driver automatically, including ping
|
||||
replying.
|
||||
|
||||
(*) SO_KEEPALIVE automatically pings the other side to keep the connection
|
||||
(#) SO_KEEPALIVE automatically pings the other side to keep the connection
|
||||
alive [TODO].
|
||||
|
||||
(*) If an ICMP error is received, all calls affected by that error will be
|
||||
(#) If an ICMP error is received, all calls affected by that error will be
|
||||
aborted with an appropriate network error passed through recvmsg().
|
||||
|
||||
|
||||
Interaction with the user of the RxRPC socket:
|
||||
|
||||
(*) A socket is made into a server socket by binding an address with a
|
||||
(#) A socket is made into a server socket by binding an address with a
|
||||
non-zero service ID.
|
||||
|
||||
(*) In the client, sending a request is achieved with one or more sendmsgs,
|
||||
(#) In the client, sending a request is achieved with one or more sendmsgs,
|
||||
followed by the reply being received with one or more recvmsgs.
|
||||
|
||||
(*) The first sendmsg for a request to be sent from a client contains a tag to
|
||||
(#) The first sendmsg for a request to be sent from a client contains a tag to
|
||||
be used in all other sendmsgs or recvmsgs associated with that call. The
|
||||
tag is carried in the control data.
|
||||
|
||||
(*) connect() is used to supply a default destination address for a client
|
||||
(#) connect() is used to supply a default destination address for a client
|
||||
socket. This may be overridden by supplying an alternate address to the
|
||||
first sendmsg() of a call (struct msghdr::msg_name).
|
||||
|
||||
(*) If connect() is called on an unbound client, a random local port will
|
||||
(#) If connect() is called on an unbound client, a random local port will
|
||||
be bound before the operation takes place.
|
||||
|
||||
(*) A server socket may also be used to make client calls. To do this, the
|
||||
(#) A server socket may also be used to make client calls. To do this, the
|
||||
first sendmsg() of the call must specify the target address. The server's
|
||||
transport endpoint is used to send the packets.
|
||||
|
||||
(*) Once the application has received the last message associated with a call,
|
||||
(#) Once the application has received the last message associated with a call,
|
||||
the tag is guaranteed not to be seen again, and so it can be used to pin
|
||||
client resources. A new call can then be initiated with the same tag
|
||||
without fear of interference.
|
||||
|
||||
(*) In the server, a request is received with one or more recvmsgs, then the
|
||||
(#) In the server, a request is received with one or more recvmsgs, then the
|
||||
reply is transmitted with one or more sendmsgs, and then the final ACK
|
||||
is received with a last recvmsg.
|
||||
|
||||
(*) When sending data for a call, sendmsg is given MSG_MORE if there's more
|
||||
(#) When sending data for a call, sendmsg is given MSG_MORE if there's more
|
||||
data to come on that call.
|
||||
|
||||
(*) When receiving data for a call, recvmsg flags MSG_MORE if there's more
|
||||
(#) When receiving data for a call, recvmsg flags MSG_MORE if there's more
|
||||
data to come for that call.
|
||||
|
||||
(*) When receiving data or messages for a call, MSG_EOR is flagged by recvmsg
|
||||
(#) When receiving data or messages for a call, MSG_EOR is flagged by recvmsg
|
||||
to indicate the terminal message for that call.
|
||||
|
||||
(*) A call may be aborted by adding an abort control message to the control
|
||||
(#) A call may be aborted by adding an abort control message to the control
|
||||
data. Issuing an abort terminates the kernel's use of that call's tag.
|
||||
Any messages waiting in the receive queue for that call will be discarded.
|
||||
|
||||
(*) Aborts, busy notifications and challenge packets are delivered by recvmsg,
|
||||
(#) Aborts, busy notifications and challenge packets are delivered by recvmsg,
|
||||
and control data messages will be set to indicate the context. Receiving
|
||||
an abort or a busy message terminates the kernel's use of that call's tag.
|
||||
|
||||
(*) The control data part of the msghdr struct is used for a number of things:
|
||||
(#) The control data part of the msghdr struct is used for a number of things:
|
||||
|
||||
(*) The tag of the intended or affected call.
|
||||
(#) The tag of the intended or affected call.
|
||||
|
||||
(*) Sending or receiving errors, aborts and busy notifications.
|
||||
(#) Sending or receiving errors, aborts and busy notifications.
|
||||
|
||||
(*) Notifications of incoming calls.
|
||||
(#) Notifications of incoming calls.
|
||||
|
||||
(*) Sending debug requests and receiving debug replies [TODO].
|
||||
(#) Sending debug requests and receiving debug replies [TODO].
|
||||
|
||||
(*) When the kernel has received and set up an incoming call, it sends a
|
||||
(#) When the kernel has received and set up an incoming call, it sends a
|
||||
message to the server application to let it know there's a new call awaiting
|
||||
its acceptance [recvmsg reports a special control message]. The server
|
||||
application then uses sendmsg to assign a tag to the new call. Once that
|
||||
is done, the first part of the request data will be delivered by recvmsg.
|
||||
|
||||
(*) The server application has to provide the server socket with a keyring of
|
||||
(#) The server application has to provide the server socket with a keyring of
|
||||
secret keys corresponding to the security types it permits. When a secure
|
||||
connection is being set up, the kernel looks up the appropriate secret key
|
||||
in the keyring and then sends a challenge packet to the client and
|
||||
receives a response packet. The kernel then checks the authorisation of
|
||||
the packet and either aborts the connection or sets up the security.
|
||||
|
||||
(*) The name of the key a client will use to secure its communications is
|
||||
(#) The name of the key a client will use to secure its communications is
|
||||
nominated by a socket option.
|
||||
|
||||
|
||||
Notes on sendmsg:
|
||||
|
||||
(*) MSG_WAITALL can be set to tell sendmsg to ignore signals if the peer is
|
||||
(#) MSG_WAITALL can be set to tell sendmsg to ignore signals if the peer is
|
||||
making progress at accepting packets within a reasonable time such that we
|
||||
manage to queue up all the data for transmission. This requires the
|
||||
client to accept at least one packet per 2*RTT time period.
|
||||
|
@ -294,7 +293,7 @@ Notes on sendmsg:
|
|||
|
||||
Notes on recvmsg:
|
||||
|
||||
(*) If there's a sequence of data messages belonging to a particular call on
|
||||
(#) If there's a sequence of data messages belonging to a particular call on
|
||||
the receive queue, then recvmsg will keep working through them until:
|
||||
|
||||
(a) it meets the end of that call's received data,
|
||||
|
@ -320,13 +319,13 @@ Notes on recvmsg:
|
|||
flagged.
|
||||
|
||||
|
||||
================
|
||||
CONTROL MESSAGES
|
||||
Control Messages
|
||||
================
|
||||
|
||||
AF_RXRPC makes use of control messages in sendmsg() and recvmsg() to multiplex
|
||||
calls, to invoke certain actions and to report certain conditions. These are:
|
||||
|
||||
======================= === =========== ===============================
|
||||
MESSAGE ID SRT DATA MEANING
|
||||
======================= === =========== ===============================
|
||||
RXRPC_USER_CALL_ID sr- User ID App's call specifier
|
||||
|
@ -340,10 +339,11 @@ calls, to invoke certain actions and to report certain conditions. These are:
|
|||
RXRPC_EXCLUSIVE_CALL s-- n/a Make an exclusive client call
|
||||
RXRPC_UPGRADE_SERVICE s-- n/a Client call can be upgraded
|
||||
RXRPC_TX_LENGTH s-- data len Total length of Tx data
|
||||
======================= === =========== ===============================
|
||||
|
||||
(SRT = usable in Sendmsg / delivered by Recvmsg / Terminal message)
|
||||
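The table above can be read as a dispatch rule for whatever recvmsg() returns. The following is a minimal sketch of walking the control data of a received message, not the kernel's or any library's own code. The `EX_*` constants are assumed copies of the values in `<linux/socket.h>` and `<linux/rxrpc.h>`, repeated only so the fragment compiles on its own, and `struct ex_rx_event` and `ex_parse_ctrl()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Assumed copies of kernel values; real code should include <linux/rxrpc.h>. */
#define EX_SOL_RXRPC          272
#define EX_RXRPC_USER_CALL_ID 1
#define EX_RXRPC_ABORT        2
#define EX_RXRPC_ACK          3
#define EX_RXRPC_NET_ERROR    5
#define EX_RXRPC_BUSY         6
#define EX_RXRPC_LOCAL_ERROR  7

/* Hypothetical summary of what one recvmsg() call reported. */
struct ex_rx_event {
	int has_call_id;        /* non-zero if a call tag was attached */
	unsigned long call_id;  /* the application's tag for the call */
	int type;               /* last non-tag control message type, 0 for plain data */
	int terminal;           /* non-zero if that type ends the call */
};

/* Walk the control messages in *msg and summarise them in *ev. */
static void ex_parse_ctrl(struct msghdr *msg, struct ex_rx_event *ev)
{
	struct cmsghdr *cmsg;

	assert(msg != NULL && ev != NULL);
	memset(ev, 0, sizeof(*ev));

	for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
		if (cmsg->cmsg_level != EX_SOL_RXRPC)
			continue;

		switch (cmsg->cmsg_type) {
		case EX_RXRPC_USER_CALL_ID:
			/* Ignore a tag that is too short to hold an unsigned long. */
			if (cmsg->cmsg_len >= CMSG_LEN(sizeof(ev->call_id))) {
				memcpy(&ev->call_id, CMSG_DATA(cmsg),
				       sizeof(ev->call_id));
				ev->has_call_id = 1;
			}
			break;
		case EX_RXRPC_ABORT:
		case EX_RXRPC_ACK:
		case EX_RXRPC_NET_ERROR:
		case EX_RXRPC_BUSY:
		case EX_RXRPC_LOCAL_ERROR:
			/* These end the kernel's use of the call's tag. */
			ev->type = cmsg->cmsg_type;
			ev->terminal = 1;
			break;
		default:
			ev->type = cmsg->cmsg_type;
			break;
		}
	}
}
```

A receive loop would call `ex_parse_ctrl(&msg, &ev)` after each successful recvmsg() and release its per-call state once `ev.terminal` is set. Checking `msg.msg_flags & MSG_EOR` as well is a reasonable cross-check, since recvmsg flags MSG_EOR on the terminal message of a call.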
|
||||
(*) RXRPC_USER_CALL_ID
|
||||
(#) RXRPC_USER_CALL_ID
|
||||
|
||||
This is used to indicate the application's call ID. It's an unsigned long
|
||||
that the app specifies in the client by attaching it to the first data
|
||||
|
@ -351,7 +351,7 @@ calls, to invoke certain actions and to report certain conditions. These are:
|
|||
message. recvmsg() passes it in conjunction with all messages except
|
||||
those of the RXRPC_NEW_CALL message.
|
||||
|
||||
(*) RXRPC_ABORT
|
||||
(#) RXRPC_ABORT
|
||||
|
||||
This can be used by an application to abort a call by passing it to
|
||||
sendmsg, or it can be delivered by recvmsg to indicate a remote abort was
|
||||
|
@ -359,13 +359,13 @@ calls, to invoke certain actions and to report certain conditions. These are:
|
|||
specify the call affected. If an abort is being sent, then error EBADSLT
|
||||
will be returned if there is no call with that user ID.
|
||||
|
||||
(*) RXRPC_ACK
|
||||
(#) RXRPC_ACK
|
||||
|
||||
This is delivered to a server application to indicate that the final ACK
|
||||
of a call was received from the client. It will be associated with an
|
||||
RXRPC_USER_CALL_ID to indicate the call that's now complete.
|
||||
|
||||
(*) RXRPC_NET_ERROR
|
||||
(#) RXRPC_NET_ERROR
|
||||
|
||||
This is delivered to an application to indicate that an ICMP error message
|
||||
was encountered in the process of trying to talk to the peer. An
|
||||
|
@ -373,13 +373,13 @@ calls, to invoke certain actions and to report certain conditions. These are:
|
|||
indicating the problem, and an RXRPC_USER_CALL_ID will indicate the call
|
||||
affected.
|
||||
|
||||
(*) RXRPC_BUSY
|
||||
(#) RXRPC_BUSY
|
||||
|
||||
This is delivered to a client application to indicate that a call was
|
||||
rejected by the server due to the server being busy. It will be
|
||||
associated with an RXRPC_USER_CALL_ID to indicate the rejected call.
|
||||
|
||||
(*) RXRPC_LOCAL_ERROR
|
||||
(#) RXRPC_LOCAL_ERROR
|
||||
|
||||
This is delivered to an application to indicate that a local error was
|
||||
encountered and that a call has been aborted because of it. An
|
||||
|
@ -387,13 +387,13 @@ calls, to invoke certain actions and to report certain conditions. These are:
|
|||
indicating the problem, and an RXRPC_USER_CALL_ID will indicate the call
|
||||
affected.
|
||||
|
||||
(*) RXRPC_NEW_CALL
|
||||
(#) RXRPC_NEW_CALL
|
||||
|
||||
This is delivered to indicate to a server application that a new call has
|
||||
arrived and is awaiting acceptance. No user ID is associated with this,
|
||||
as a user ID must subsequently be assigned by doing an RXRPC_ACCEPT.
|
||||
|
||||
(*) RXRPC_ACCEPT
|
||||
(#) RXRPC_ACCEPT
|
||||
|
||||
This is used by a server application to attempt to accept a call and
|
||||
assign it a user ID. It should be associated with an RXRPC_USER_CALL_ID
|
||||
|
@ -402,12 +402,12 @@ calls, to invoke certain actions and to report certain conditions. These are:
|
|||
return error ENODATA. If the user ID is already in use by another call,
|
||||
then error EBADSLT will be returned.
|
||||
|
||||
(*) RXRPC_EXCLUSIVE_CALL
|
||||
(#) RXRPC_EXCLUSIVE_CALL
|
||||
|
||||
This is used to indicate that a client call should be made on a one-off
|
||||
connection. The connection is discarded once the call has terminated.
|
||||
|
||||
(*) RXRPC_UPGRADE_SERVICE
|
||||
(#) RXRPC_UPGRADE_SERVICE
|
||||
|
||||
This is used to make a client call to probe if the specified service ID
|
||||
may be upgraded by the server. The caller must check msg_name returned to
|
||||
|
@ -419,7 +419,7 @@ calls, to invoke certain actions and to report certain conditions. These are:
|
|||
future communication to that server and RXRPC_UPGRADE_SERVICE should no
|
||||
longer be set.
|
||||
|
||||
(*) RXRPC_TX_LENGTH
|
||||
(#) RXRPC_TX_LENGTH
|
||||
|
||||
This is used to inform the kernel of the total amount of data that is
|
||||
going to be transmitted by a call (whether in a client request or a
|
||||
|
@ -443,7 +443,7 @@ SOCKET OPTIONS
|
|||
|
||||
AF_RXRPC sockets support a few socket options at the SOL_RXRPC level:
|
||||
|
||||
(*) RXRPC_SECURITY_KEY
|
||||
(#) RXRPC_SECURITY_KEY
|
||||
|
||||
This is used to specify the description of the key to be used. The key is
|
||||
extracted from the calling process's keyrings with request_key() and
|
||||
|
@ -452,17 +452,17 @@ AF_RXRPC sockets support a few socket options at the SOL_RXRPC level:
|
|||
The optval pointer points to the description string, and optlen indicates
|
||||
how long the string is, without the NUL terminator.
|
||||
|
||||
(*) RXRPC_SECURITY_KEYRING
|
||||
(#) RXRPC_SECURITY_KEYRING
|
||||
|
||||
Similar to above but specifies a keyring of server secret keys to use (key
|
||||
type "keyring"). See the "Security" section.
|
||||
|
||||
(*) RXRPC_EXCLUSIVE_CONNECTION
|
||||
(#) RXRPC_EXCLUSIVE_CONNECTION
|
||||
|
||||
This is used to request that new connections should be used for each call
|
||||
made subsequently on this socket. optval should be NULL and optlen 0.
|
||||
|
||||
(*) RXRPC_MIN_SECURITY_LEVEL
|
||||
(#) RXRPC_MIN_SECURITY_LEVEL
|
||||
|
||||
This is used to specify the minimum security level required for calls on
|
||||
this socket. optval must point to an int containing one of the following
|
||||
|
@ -482,14 +482,14 @@ AF_RXRPC sockets support a few socket options at the SOL_RXRPC level:
|
|||
Encrypted checksum plus entire packet padded and encrypted, including
|
||||
actual packet length.
|
||||
|
||||
(*) RXRPC_UPGRADEABLE_SERVICE
|
||||
(#) RXRPC_UPGRADEABLE_SERVICE
|
||||
|
||||
This is used to indicate that a service socket with two bindings may
|
||||
upgrade one bound service to the other if requested by the client. optval
|
||||
must point to an array of two unsigned short ints. The first is the
|
||||
service ID to upgrade from and the second the service ID to upgrade to.
|
||||
|
||||
(*) RXRPC_SUPPORTED_CMSG
|
||||
(#) RXRPC_SUPPORTED_CMSG
|
||||
|
||||
This is a read-only option that writes an int into the buffer indicating
|
||||
the highest control message type supported.
|
||||
|
@ -509,7 +509,7 @@ found at:
|
|||
http://people.redhat.com/~dhowells/rxrpc/klog.c
|
||||
|
||||
The payload provided to add_key() on the client should be of the following
|
||||
form:
|
||||
form::
|
||||
|
||||
struct rxrpc_key_sec2_v1 {
|
||||
uint16_t security_index; /* 2 */
|
||||
|
@ -546,14 +546,14 @@ EXAMPLE CLIENT USAGE
|
|||
|
||||
A client would issue an operation by:
|
||||
|
||||
(1) An RxRPC socket is set up by:
|
||||
(1) An RxRPC socket is set up by::
|
||||
|
||||
client = socket(AF_RXRPC, SOCK_DGRAM, PF_INET);
|
||||
|
||||
Where the third parameter indicates the protocol family of the transport
|
||||
socket used - usually IPv4 but it can also be IPv6 [TODO].
|
||||
|
||||
(2) A local address can optionally be bound:
|
||||
(2) A local address can optionally be bound::
|
||||
|
||||
struct sockaddr_rxrpc srx = {
|
||||
.srx_family = AF_RXRPC,
|
||||
|
@ -570,20 +570,20 @@ A client would issue an operation by:
|
|||
several unrelated RxRPC sockets. Security is handled on a basis of
|
||||
per-RxRPC virtual connection.
|
||||
|
||||
(3) The security is set:
|
||||
(3) The security is set::
|
||||
|
||||
const char *key = "AFS:cambridge.redhat.com";
|
||||
setsockopt(client, SOL_RXRPC, RXRPC_SECURITY_KEY, key, strlen(key));
|
||||
|
||||
This issues a request_key() to get the key representing the security
|
||||
context. The minimum security level can be set:
|
||||
context. The minimum security level can be set::
|
||||
|
||||
unsigned int sec = RXRPC_SECURITY_ENCRYPTED;
|
||||
setsockopt(client, SOL_RXRPC, RXRPC_MIN_SECURITY_LEVEL,
|
||||
&sec, sizeof(sec));
|
||||
|
||||
(4) The server to be contacted can then be specified (alternatively this can
|
||||
be done through sendmsg):
|
||||
be done through sendmsg)::
|
||||
|
||||
struct sockaddr_rxrpc srx = {
|
||||
.srx_family = AF_RXRPC,
|
||||
|
@ -598,7 +598,9 @@ A client would issue an operation by:
|
|||
(5) The request data should then be posted to the server socket using a series
|
||||
of sendmsg() calls, each with the following control message attached:
|
||||
|
||||
RXRPC_USER_CALL_ID - specifies the user ID for this call
|
||||
================== ===================================
|
||||
RXRPC_USER_CALL_ID specifies the user ID for this call
|
||||
================== ===================================
|
||||
|
||||
MSG_MORE should be set in msghdr::msg_flags on all but the last part of
|
||||
the request. Multiple requests may be made simultaneously.
|
||||
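Step (5) can be sketched as a small helper that describes one fragment of request data for sendmsg(). This is an illustration under stated assumptions rather than code from the document: `ex_prepare_fragment()` and `union ex_call_ctrl` are hypothetical names, and the `EX_*` constants are assumed copies of the values in `<linux/socket.h>` and `<linux/rxrpc.h>`, repeated only so the fragment compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Assumed copies of kernel values; real code should include <linux/rxrpc.h>. */
#define EX_SOL_RXRPC          272
#define EX_RXRPC_USER_CALL_ID 1

/* Control buffer sized and aligned for exactly one call-ID control message. */
union ex_call_ctrl {
	struct cmsghdr align;
	char buf[CMSG_SPACE(sizeof(unsigned long))];
};

/*
 * Fill in *msg so that it carries one fragment of request data tagged with
 * the caller's call ID.  Returns the flags to hand to sendmsg(): MSG_MORE
 * for every fragment except the last one of the request.
 */
static int ex_prepare_fragment(struct msghdr *msg, struct iovec *iov,
			       union ex_call_ctrl *ctrl,
			       void *data, size_t len,
			       unsigned long call_id, int last)
{
	struct cmsghdr *cmsg;

	memset(msg, 0, sizeof(*msg));
	memset(ctrl, 0, sizeof(*ctrl));

	iov->iov_base = data;
	iov->iov_len = len;
	msg->msg_iov = iov;
	msg->msg_iovlen = 1;
	msg->msg_control = ctrl->buf;
	msg->msg_controllen = sizeof(ctrl->buf);

	/* The tag travels in the control data, not in the payload. */
	cmsg = CMSG_FIRSTHDR(msg);
	assert(cmsg != NULL);
	cmsg->cmsg_level = EX_SOL_RXRPC;
	cmsg->cmsg_type = EX_RXRPC_USER_CALL_ID;
	cmsg->cmsg_len = CMSG_LEN(sizeof(call_id));
	memcpy(CMSG_DATA(cmsg), &call_id, sizeof(call_id));

	return last ? 0 : MSG_MORE;
}
```

A caller would then issue `sendmsg(client, &msg, flags)` with the returned flags, reusing the same call ID for every fragment of the request.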
|
@ -635,13 +637,12 @@ any more calls (further calls to the same destination will be blocked until the
|
|||
probe is concluded).
|
||||
|
||||
|
||||
====================
|
||||
EXAMPLE SERVER USAGE
|
||||
Example Server Usage
|
||||
====================
|
||||
|
||||
A server would be set up to accept operations in the following manner:
|
||||
|
||||
(1) An RxRPC socket is created by:
|
||||
(1) An RxRPC socket is created by::
|
||||
|
||||
server = socket(AF_RXRPC, SOCK_DGRAM, PF_INET);
|
||||
|
||||
|
@ -649,7 +650,7 @@ A server would be set up to accept operations in the following manner:
|
|||
socket used - usually IPv4.
|
||||
|
||||
(2) Security is set up if desired by giving the socket a keyring with server
|
||||
secret keys in it:
|
||||
secret keys in it::
|
||||
|
||||
keyring = add_key("keyring", "AFSkeys", NULL, 0,
|
||||
KEY_SPEC_PROCESS_KEYRING);
|
||||
|
@ -663,7 +664,7 @@ A server would be set up to accept operations in the following manner:
|
|||
The keyring can be manipulated after it has been given to the socket. This
|
||||
permits the server to add more keys, replace keys, etc. while it is live.
|
||||
|
||||
(3) A local address must then be bound:
|
||||
(3) A local address must then be bound::
|
||||
|
||||
struct sockaddr_rxrpc srx = {
|
||||
.srx_family = AF_RXRPC,
|
||||
|
@ -680,7 +681,7 @@ A server would be set up to accept operations in the following manner:
|
|||
should be called twice.
|
||||
|
||||
(4) If service upgrading is required, first two service IDs must have been
|
||||
bound and then the following option must be set:
|
||||
bound and then the following option must be set::
|
||||
|
||||
unsigned short service_ids[2] = { from_ID, to_ID };
|
||||
setsockopt(server, SOL_RXRPC, RXRPC_UPGRADEABLE_SERVICE,
|
||||
|
@ -690,14 +691,14 @@ A server would be set up to accept operations in the following manner:
|
|||
to_ID if they request it. This will be reflected in msg_name obtained
|
||||
through recvmsg() when the request data is delivered to userspace.
|
||||
|
||||
(5) The server is then set to listen out for incoming calls:
|
||||
(5) The server is then set to listen out for incoming calls::
|
||||
|
||||
listen(server, 100);
|
||||
|
||||
(6) The kernel notifies the server of pending incoming connections by sending
|
||||
it a message for each. This is received with recvmsg() on the server
|
||||
socket. It has no data, and has a single dataless control message
|
||||
attached:
|
||||
attached::
|
||||
|
||||
RXRPC_NEW_CALL
|
||||
|
||||
|
@ -709,8 +710,10 @@ A server would be set up to accept operations in the following manner:
|
|||
(7) The server then accepts the new call by issuing a sendmsg() with two
|
||||
pieces of control data and no actual data:
|
||||
|
||||
RXRPC_ACCEPT - indicate connection acceptance
|
||||
RXRPC_USER_CALL_ID - specify user ID for this call
|
||||
================== ==============================
|
||||
RXRPC_ACCEPT indicate connection acceptance
|
||||
RXRPC_USER_CALL_ID specify user ID for this call
|
||||
================== ==============================
|
||||
|
||||
(8) The first request data packet will then be posted to the server socket for
|
||||
recvmsg() to pick up. At that point, the RxRPC address for the call can
|
||||
|
@ -722,12 +725,17 @@ A server would be set up to accept operations in the following manner:
|
|||
|
||||
All data will be delivered with the following control message attached:
|
||||
|
||||
RXRPC_USER_CALL_ID - specifies the user ID for this call
|
||||
|
||||
================== ===================================
|
||||
RXRPC_USER_CALL_ID specifies the user ID for this call
|
||||
================== ===================================
|
||||
|
||||
(9) The reply data should then be posted to the server socket using a series
|
||||
of sendmsg() calls, each with the following control messages attached:
|
||||
|
||||
RXRPC_USER_CALL_ID - specifies the user ID for this call
|
||||
================== ===================================
|
||||
RXRPC_USER_CALL_ID specifies the user ID for this call
|
||||
================== ===================================
|
||||
|
||||
MSG_MORE should be set in msghdr::msg_flags on all but the last message
|
||||
for a particular call.
|
||||
|
@ -736,8 +744,10 @@ A server would be set up to accept operations in the following manner:
|
|||
when it is received. It will take the form of a dataless message with two
|
||||
control messages attached:
|
||||
|
||||
RXRPC_USER_CALL_ID - specifies the user ID for this call
|
||||
RXRPC_ACK - indicates final ACK (no data)
|
||||
================== ===================================
|
||||
RXRPC_USER_CALL_ID specifies the user ID for this call
|
||||
RXRPC_ACK indicates final ACK (no data)
|
||||
================== ===================================
|
||||
|
||||
MSG_EOR will be flagged to indicate that this is the final message for
|
||||
this call.
|
||||
|
@ -746,8 +756,10 @@ A server would be set up to accept operations in the following manner:
|
|||
aborted by calling sendmsg() with a dataless message with the following
|
||||
control messages attached:
|
||||
|
||||
RXRPC_USER_CALL_ID - specifies the user ID for this call
|
||||
RXRPC_ABORT - indicates abort code (4 byte data)
|
||||
================== ===================================
|
||||
RXRPC_USER_CALL_ID specifies the user ID for this call
|
||||
RXRPC_ABORT indicates abort code (4 byte data)
|
||||
================== ===================================
|
||||
|
||||
Any packets waiting in the socket's receive queue will be discarded if
|
||||
this is issued.
|
||||
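The abort step can be sketched as a dataless message that carries two control messages. This is a hedged illustration, not code from the document: `ex_prepare_abort()` and `union ex_abort_ctrl` are hypothetical names, the `EX_*` constants are assumed copies of the values in `<linux/socket.h>` and `<linux/rxrpc.h>`, and the abort code is assumed to be passed as a 4-byte unsigned integer in host byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Assumed copies of kernel values; real code should include <linux/rxrpc.h>. */
#define EX_SOL_RXRPC          272
#define EX_RXRPC_USER_CALL_ID 1
#define EX_RXRPC_ABORT        2

/* Room for the call tag plus a 4-byte abort code, suitably aligned. */
union ex_abort_ctrl {
	struct cmsghdr align;
	char buf[CMSG_SPACE(sizeof(unsigned long)) +
		 CMSG_SPACE(sizeof(uint32_t))];
};

/* Build a dataless message that aborts the call identified by call_id. */
static void ex_prepare_abort(struct msghdr *msg, union ex_abort_ctrl *ctrl,
			     unsigned long call_id, uint32_t abort_code)
{
	struct cmsghdr *cmsg;

	memset(msg, 0, sizeof(*msg));   /* no iovec: the message has no data */
	memset(ctrl, 0, sizeof(*ctrl)); /* zeroed so CMSG_NXTHDR() steps safely */
	msg->msg_control = ctrl->buf;
	msg->msg_controllen = sizeof(ctrl->buf);

	/* First control message: which call is affected. */
	cmsg = CMSG_FIRSTHDR(msg);
	assert(cmsg != NULL);
	cmsg->cmsg_level = EX_SOL_RXRPC;
	cmsg->cmsg_type = EX_RXRPC_USER_CALL_ID;
	cmsg->cmsg_len = CMSG_LEN(sizeof(call_id));
	memcpy(CMSG_DATA(cmsg), &call_id, sizeof(call_id));

	/* Second control message: the abort code to place in the ABORT. */
	cmsg = CMSG_NXTHDR(msg, cmsg);
	assert(cmsg != NULL);
	cmsg->cmsg_level = EX_SOL_RXRPC;
	cmsg->cmsg_type = EX_RXRPC_ABORT;
	cmsg->cmsg_len = CMSG_LEN(sizeof(abort_code));
	memcpy(CMSG_DATA(cmsg), &abort_code, sizeof(abort_code));
}
```

The prepared message would be passed to `sendmsg(server, &msg, 0)`. After that the application should treat the call ID as finished, since issuing an abort terminates the kernel's use of that call's tag.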
|
@ -757,8 +769,7 @@ the one server socket, using control messages on sendmsg() and recvmsg() to
|
|||
determine the call affected.
|
||||
|
||||
|
||||
=========================
|
||||
AF_RXRPC KERNEL INTERFACE
|
||||
AF_RXRPC Kernel Interface
|
||||
=========================
|
||||
|
||||
The AF_RXRPC module also provides an interface for use by in-kernel utilities
|
||||
|
@ -786,7 +797,7 @@ then it passes this to the kernel interface functions.
|
|||
|
||||
The kernel interface functions are as follows:
|
||||
|
||||
(*) Begin a new client call.
|
||||
(#) Begin a new client call::
|
||||
|
||||
struct rxrpc_call *
|
||||
rxrpc_kernel_begin_call(struct socket *sock,
|
||||
|
@ -837,7 +848,7 @@ The kernel interface functions are as follows:
|
|||
returned. The caller now holds a reference on this and it must be
|
||||
properly ended.
|
||||
|
||||
(*) End a client call.
|
||||
(#) End a client call::
|
||||
|
||||
void rxrpc_kernel_end_call(struct socket *sock,
|
||||
struct rxrpc_call *call);
|
||||
|
@ -846,7 +857,7 @@ The kernel interface functions are as follows:
|
|||
from AF_RXRPC's knowledge and will not be seen again in association with
|
||||
the specified call.
|
||||
|
||||
(*) Send data through a call.
|
||||
(#) Send data through a call::
|
||||
|
||||
typedef void (*rxrpc_notify_end_tx_t)(struct sock *sk,
|
||||
unsigned long user_call_ID,
|
||||
|
@ -872,7 +883,7 @@ The kernel interface functions are as follows:
|
|||
called with the call-state spinlock held to prevent any reply or final ACK
|
||||
from being delivered first.
|
||||
|
||||
(*) Receive data from a call.
|
||||
(#) Receive data from a call::
|
||||
|
||||
int rxrpc_kernel_recv_data(struct socket *sock,
|
||||
struct rxrpc_call *call,
|
||||
|
@ -902,12 +913,14 @@ The kernel interface functions are as follows:
|
|||
more data was available, EMSGSIZE is returned.
|
||||
|
||||
If a remote ABORT is detected, the abort code received will be stored in
|
||||
*_abort and ECONNABORTED will be returned.
|
||||
``*_abort`` and ECONNABORTED will be returned.
|
||||
|
||||
The service ID that the call ended up with is returned into *_service.
|
||||
This can be used to see if a call got a service upgrade.
|
||||
|
||||
(*) Abort a call.
|
||||
(#) Abort a call.
|
||||
|
||||
::
|
||||
|
||||
void rxrpc_kernel_abort_call(struct socket *sock,
|
||||
struct rxrpc_call *call,
|
||||
|
@ -916,7 +929,7 @@ The kernel interface functions are as follows:
|
|||
This is used to abort a call if it's still in an abortable state. The
|
||||
abort code specified will be placed in the ABORT message sent.
|
||||
|
||||
(*) Intercept received RxRPC messages.
|
||||
(#) Intercept received RxRPC messages::
|
||||
|
||||
typedef void (*rxrpc_interceptor_t)(struct sock *sk,
|
||||
unsigned long user_call_ID,
|
||||
|
@ -937,7 +950,8 @@ The kernel interface functions are as follows:
|
|||
|
||||
The skb->mark field indicates the type of message:
|
||||
|
||||
MARK MEANING
|
||||
=============================== =======================================
|
||||
Mark Meaning
|
||||
=============================== =======================================
|
||||
RXRPC_SKB_MARK_DATA Data message
|
||||
RXRPC_SKB_MARK_FINAL_ACK Final ACK received for an incoming call
|
||||
|
@ -946,6 +960,7 @@ The kernel interface functions are as follows:
|
|||
RXRPC_SKB_MARK_NET_ERROR Network error detected
|
||||
RXRPC_SKB_MARK_LOCAL_ERROR Local error encountered
|
||||
RXRPC_SKB_MARK_NEW_CALL New incoming call awaiting acceptance
|
||||
=============================== =======================================
|
||||
|
||||
The remote abort message can be probed with rxrpc_kernel_get_abort_code().
|
||||
The two error messages can be probed with rxrpc_kernel_get_error_number().
|
||||
|
@ -961,7 +976,7 @@ The kernel interface functions are as follows:
|
|||
is possible to get extra refs on all types of message for later freeing,
|
||||
but this may pin the state of a call until the message is finally freed.
|
||||
|
||||
(*) Accept an incoming call.
|
||||
(#) Accept an incoming call::
|
||||
|
||||
struct rxrpc_call *
|
||||
rxrpc_kernel_accept_call(struct socket *sock,
|
||||
|
@ -975,7 +990,7 @@ The kernel interface functions are as follows:
|
|||
returned. The caller now holds a reference on this and it must be
|
||||
properly ended.
|
||||
|
||||
(*) Reject an incoming call.
|
||||
(#) Reject an incoming call::
|
||||
|
||||
int rxrpc_kernel_reject_call(struct socket *sock);
|
||||
|
||||
|
@ -984,21 +999,21 @@ The kernel interface functions are as follows:
|
|||
Other errors may be returned if the call had been aborted (-ECONNABORTED)
|
||||
or had timed out (-ETIME).
|
||||
|
||||
(*) Allocate a null key for doing anonymous security.
|
||||
(#) Allocate a null key for doing anonymous security::
|
||||
|
||||
struct key *rxrpc_get_null_key(const char *keyname);
|
||||
|
||||
This is used to allocate a null RxRPC key that can be used to indicate
|
||||
anonymous security for a particular domain.
|
||||
|
||||
(*) Get the peer address of a call.
|
||||
(#) Get the peer address of a call::
|
||||
|
||||
void rxrpc_kernel_get_peer(struct socket *sock, struct rxrpc_call *call,
|
||||
struct sockaddr_rxrpc *_srx);
|
||||
|
||||
This is used to find the remote peer address of a call.
|
||||
|
||||
(*) Set the total transmit data size on a call.
|
||||
(#) Set the total transmit data size on a call::
|
||||
|
||||
void rxrpc_kernel_set_tx_length(struct socket *sock,
|
||||
struct rxrpc_call *call,
|
||||
|
@ -1009,14 +1024,14 @@ The kernel interface functions are as follows:
|
|||
size should be set when the call is begun. tx_total_len may not be less
|
||||
than zero.
|
||||
|
||||
(*) Get call RTT.
|
||||
(#) Get call RTT::
|
||||
|
||||
u64 rxrpc_kernel_get_rtt(struct socket *sock, struct rxrpc_call *call);
|
||||
|
||||
Get the RTT time to the peer in use by a call. The value returned is in
|
||||
nanoseconds.
|
||||
|
||||
(*) Check call still alive.
|
||||
(#) Check call still alive::
|
||||
|
||||
bool rxrpc_kernel_check_life(struct socket *sock,
|
||||
struct rxrpc_call *call,
|
||||
|
@ -1024,7 +1039,7 @@ The kernel interface functions are as follows:
|
|||
void rxrpc_kernel_probe_life(struct socket *sock,
|
||||
struct rxrpc_call *call);
|
||||
|
||||
The first function passes back in *_life a number that is updated when
|
||||
The first function passes back in ``*_life`` a number that is updated when
|
||||
ACKs are received from the peer (notably including PING RESPONSE ACKs
|
||||
which we can elicit by sending PING ACKs to see if the call still exists
|
||||
on the server). The caller should compare the numbers of two calls to see
|
||||
|
@ -1040,7 +1055,7 @@ The kernel interface functions are as follows:
|
|||
first function to change. Note that this must be called in TASK_RUNNING
|
||||
state.
|
||||
|
||||
(*) Get reply timestamp.
|
||||
(#) Get reply timestamp::
|
||||
|
||||
bool rxrpc_kernel_get_reply_time(struct socket *sock,
|
||||
struct rxrpc_call *call,
|
||||
|
@ -1048,10 +1063,10 @@ The kernel interface functions are as follows:
|
|||
|
||||
This allows the timestamp on the first DATA packet of the reply of a
|
||||
client call to be queried, provided that it is still in the Rx ring. If
|
||||
successful, the timestamp will be stored into *_ts and true will be
|
||||
successful, the timestamp will be stored into ``*_ts`` and true will be
|
||||
returned; false will be returned otherwise.
|
||||
|
||||
(*) Get remote client epoch.
|
||||
(#) Get remote client epoch::
|
||||
|
||||
u32 rxrpc_kernel_get_epoch(struct socket *sock,
|
||||
struct rxrpc_call *call)
|
||||
|
@ -1065,7 +1080,7 @@ The kernel interface functions are as follows:
|
|||
This value can be used to determine if the remote client has been
|
||||
restarted as it shouldn't change otherwise.
|
||||
|
||||
(*) Set the maximum lifespan on a call.
|
||||
(#) Set the maximum lifespan on a call::
|
||||
|
||||
void rxrpc_kernel_set_max_life(struct socket *sock,
|
||||
struct rxrpc_call *call,
|
||||
|
@ -1076,14 +1091,13 @@ The kernel interface functions are as follows:
|
|||
aborted and -ETIME or -ETIMEDOUT will be returned.
|
||||
|
||||
|
||||
=======================
|
||||
CONFIGURABLE PARAMETERS
|
||||
Configurable Parameters
|
||||
=======================
|
||||
|
||||
The RxRPC protocol driver has a number of configurable parameters that can be
|
||||
adjusted through sysctls in /proc/net/rxrpc/:
|
||||
|
||||
(*) req_ack_delay
|
||||
(#) req_ack_delay
|
||||
|
||||
The amount of time in milliseconds after receiving a packet with the
|
||||
request-ack flag set before we honour the flag and actually send the
|
||||
|
@ -1093,60 +1107,60 @@ adjusted through sysctls in /proc/net/rxrpc/:
|
|||
reception window is full (to a maximum of 255 packets), so delaying the
|
||||
ACK permits several packets to be ACK'd in one go.
|
||||
|
||||
(*) soft_ack_delay
|
||||
(#) soft_ack_delay
|
||||
|
||||
The amount of time in milliseconds after receiving a new packet before we
|
||||
generate a soft-ACK to tell the sender that it doesn't need to resend.
|
||||
|
||||
(*) idle_ack_delay
|
||||
(#) idle_ack_delay
|
||||
|
||||
The amount of time in milliseconds after all the packets currently in the
|
||||
received queue have been consumed before we generate a hard-ACK to tell
|
||||
the sender it can free its buffers, assuming no other reason occurs that
|
||||
we would send an ACK.
|
||||
|
||||
(*) resend_timeout
|
||||
(#) resend_timeout
|
||||
|
||||
The amount of time in milliseconds after transmitting a packet before we
|
||||
transmit it again, assuming no ACK is received from the receiver telling
|
||||
us they got it.
|
||||
|
||||
(*) max_call_lifetime
|
||||
(#) max_call_lifetime
|
||||
|
||||
The maximum amount of time in seconds that a call may be in progress
|
||||
before we preemptively kill it.
|
||||
|
||||
(*) dead_call_expiry
|
||||
(#) dead_call_expiry
|
||||
|
||||
The amount of time in seconds before we remove a dead call from the call
|
||||
list. Dead calls are kept around for a little while for the purpose of
|
||||
repeating ACK and ABORT packets.
|
||||
|
||||
(*) connection_expiry
|
||||
(#) connection_expiry
|
||||
|
||||
The amount of time in seconds after a connection was last used before we
|
||||
remove it from the connection list. While a connection is in existence,
|
||||
it serves as a placeholder for negotiated security; when it is deleted,
|
||||
the security must be renegotiated.
|
||||
|
||||
(*) transport_expiry
|
||||
(#) transport_expiry
|
||||
|
||||
The amount of time in seconds after a transport was last used before we
|
||||
remove it from the transport list. While a transport is in existence, it
|
||||
serves to anchor the peer data and keeps the connection ID counter.
|
||||
|
||||
(*) rxrpc_rx_window_size
|
||||
(#) rxrpc_rx_window_size
|
||||
|
||||
The size of the receive window in packets. This is the maximum number of
|
||||
unconsumed received packets we're willing to hold in memory for any
|
||||
particular call.
|
||||
|
||||
(*) rxrpc_rx_mtu
|
||||
(#) rxrpc_rx_mtu
|
||||
|
||||
The maximum packet MTU size that we're willing to receive in bytes. This
|
||||
indicates to the peer whether we're willing to accept jumbo packets.
|
||||
|
||||
(*) rxrpc_rx_jumbo_max
|
||||
(#) rxrpc_rx_jumbo_max
|
||||
|
||||
The maximum number of packets that we're willing to accept in a jumbo
|
||||
packet. Non-terminal packets in a jumbo packet must contain a four byte
|
|
@ -1,35 +1,42 @@
|
|||
Linux Kernel SCTP
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
=================
|
||||
Linux Kernel SCTP
|
||||
=================
|
||||
|
||||
This is the current BETA release of the Linux Kernel SCTP reference
|
||||
implementation.
|
||||
implementation.
|
||||
|
||||
SCTP (Stream Control Transmission Protocol) is an IP-based, message-oriented,
|
||||
reliable transport protocol, with congestion control, support for
|
||||
transparent multi-homing, and multiple ordered streams of messages.
|
||||
RFC2960 defines the core protocol. The IETF SIGTRAN working group originally
|
||||
developed the SCTP protocol and later handed the protocol over to the
|
||||
Transport Area (TSVWG) working group for the continued evolvement of SCTP as a
|
||||
general purpose transport.
|
||||
developed the SCTP protocol and later handed the protocol over to the
|
||||
Transport Area (TSVWG) working group for the continued evolvement of SCTP as a
|
||||
general purpose transport.
|
||||
|
||||
See the IETF website (http://www.ietf.org) for further documents on SCTP.
|
||||
See http://www.ietf.org/rfc/rfc2960.txt
|
||||
See the IETF website (http://www.ietf.org) for further documents on SCTP.
|
||||
See http://www.ietf.org/rfc/rfc2960.txt
|
||||
|
||||
The initial project goal is to create a Linux kernel reference implementation
|
||||
of SCTP that is RFC 2960 compliant and provides a programming interface
|
||||
referred to as the UDP-style API of the Sockets Extensions for SCTP, as
|
||||
proposed in IETF Internet-Drafts.
|
||||
of SCTP that is RFC 2960 compliant and provides a programming interface
|
||||
referred to as the UDP-style API of the Sockets Extensions for SCTP, as
|
||||
proposed in IETF Internet-Drafts.
|
||||
|
||||
Caveats:
|
||||
Caveats
|
||||
=======
|
||||
|
||||
-lksctp can be built statically or as a module. However, be aware that
|
||||
module removal of lksctp is not yet a safe activity.
|
||||
- lksctp can be built statically or as a module. However, be aware that
|
||||
module removal of lksctp is not yet a safe activity.
|
||||
|
||||
-There is tentative support for IPv6, but most work has gone towards
|
||||
implementation and testing lksctp on IPv4.
|
||||
- There is tentative support for IPv6, but most work has gone towards
|
||||
implementation and testing lksctp on IPv4.
|
||||
|
||||
|
||||
For more information, please visit the lksctp project website:
|
||||
|
||||
http://www.sf.net/projects/lksctp
|
||||
|
||||
Or contact the lksctp developers through the mailing list:
|
||||
|
||||
<linux-sctp@vger.kernel.org>
|
|
@ -1,3 +1,9 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
=================
|
||||
LSM/SeLinux secid
|
||||
=================
|
||||
|
||||
flowi structure:
|
||||
|
||||
The secid member in the flow structure is used in LSMs (e.g. SELinux) to indicate
|
|
@ -0,0 +1,26 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0

====================
Seg6 Sysfs variables
====================


/proc/sys/net/ipv6/conf/<iface>/seg6_* variables:
=================================================

seg6_enabled - BOOL
        Accept or drop SR-enabled IPv6 packets on this interface.

        Relevant packets are those with SRH present and DA = local.

        * 0 - disabled (default)
        * not 0 - enabled

seg6_require_hmac - INTEGER
        Define HMAC policy for ingress SR-enabled packets on this interface.

        * -1 - Ignore HMAC field
        * 0 - Accept SR packets without HMAC, validate SR packets with HMAC
        * 1 - Drop SR packets without HMAC, validate SR packets with HMAC

        Default is 0.
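
The three seg6_require_hmac values can be read as a small decision function.
The following is only a userspace sketch of that reading, not kernel code:
the enum names and seg6_hmac_accept() are hypothetical, and it assumes that
"validate" means a packet whose HMAC fails validation is dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the documented seg6_require_hmac values. */
enum demo_seg6_hmac_policy {
	DEMO_SEG6_HMAC_IGNORE   = -1,	/* ignore the HMAC field */
	DEMO_SEG6_HMAC_OPTIONAL = 0,	/* accept packets without HMAC */
	DEMO_SEG6_HMAC_REQUIRED = 1,	/* drop packets without HMAC */
};

/*
 * Return true if an ingress SR packet would be accepted.
 *
 * has_hmac:   the packet carries an HMAC
 * hmac_valid: outcome of validating that HMAC (unused if has_hmac is false)
 *
 * Assumption: a packet whose HMAC fails validation is dropped.
 */
bool seg6_hmac_accept(int policy, bool has_hmac, bool hmac_valid)
{
	assert(policy >= DEMO_SEG6_HMAC_IGNORE &&
	       policy <= DEMO_SEG6_HMAC_REQUIRED);

	if (policy == DEMO_SEG6_HMAC_IGNORE)
		return true;
	if (!has_hmac)
		return policy == DEMO_SEG6_HMAC_OPTIONAL;
	return hmac_valid;
}
```

With the default policy of 0, a packet without an HMAC passes, while a packet
carrying an HMAC that does not validate is rejected under this reading.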
|
|
@ -1,18 +0,0 @@
|
|||
/proc/sys/net/conf/<iface>/seg6_* variables:
|
||||
|
||||
seg6_enabled - BOOL
|
||||
Accept or drop SR-enabled IPv6 packets on this interface.
|
||||
|
||||
Relevant packets are those with SRH present and DA = local.
|
||||
|
||||
0 - disabled (default)
|
||||
not 0 - enabled
|
||||
|
||||
seg6_require_hmac - INTEGER
|
||||
Define HMAC policy for ingress SR-enabled packets on this interface.
|
||||
|
||||
-1 - Ignore HMAC field
|
||||
0 - Accept SR packets without HMAC, validate SR packets with HMAC
|
||||
1 - Drop SR packets without HMAC, validate SR packets with HMAC
|
||||
|
||||
Default is 0.
|
|
@ -1,35 +1,41 @@
|
|||
(C)Copyright 1998-2000 SysKonnect,
|
||||
===========================================================================
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
.. include:: <isonum.txt>
|
||||
|
||||
========================
|
||||
SysKonnect driver - SKFP
|
||||
========================
|
||||
|
||||
|copy| Copyright 1998-2000 SysKonnect,
|
||||
|
||||
skfp.txt created 11-May-2000
|
||||
|
||||
Readme File for skfp.o v2.06
|
||||
|
||||
|
||||
This file contains
|
||||
(1) OVERVIEW
|
||||
(2) SUPPORTED ADAPTERS
|
||||
(3) GENERAL INFORMATION
|
||||
(4) INSTALLATION
|
||||
(5) INCLUSION OF THE ADAPTER IN SYSTEM START
|
||||
(6) TROUBLESHOOTING
|
||||
(7) FUNCTION OF THE ADAPTER LEDS
|
||||
(8) HISTORY
|
||||
.. This file contains
|
||||
|
||||
===========================================================================
|
||||
(1) OVERVIEW
|
||||
(2) SUPPORTED ADAPTERS
|
||||
(3) GENERAL INFORMATION
|
||||
(4) INSTALLATION
|
||||
(5) INCLUSION OF THE ADAPTER IN SYSTEM START
|
||||
(6) TROUBLESHOOTING
|
||||
(7) FUNCTION OF THE ADAPTER LEDS
|
||||
(8) HISTORY
|
||||
|
||||
|
||||
|
||||
(1) OVERVIEW
|
||||
============
|
||||
1. Overview
|
||||
===========
|
||||
|
||||
This README explains how to use the driver 'skfp' for Linux with your
|
||||
network adapter.
|
||||
|
||||
Chapter 2: Contains a list of all network adapters that are supported by
|
||||
this driver.
|
||||
this driver.
|
||||
|
||||
Chapter 3: Gives some general information.
|
||||
Chapter 3:
|
||||
Gives some general information.
|
||||
|
||||
Chapter 4: Describes common problems and solutions.
|
||||
|
||||
|
@ -37,14 +43,13 @@ Chapter 5: Shows the changed functionality of the adapter LEDs.
|
|||
|
||||
Chapter 6: History of development.
|
||||
|
||||
***
|
||||
|
||||
|
||||
(2) SUPPORTED ADAPTERS
|
||||
======================
|
||||
2. Supported adapters
|
||||
=====================
|
||||
|
||||
The network driver 'skfp' supports the following network adapters:
|
||||
SysKonnect adapters:
|
||||
|
||||
- SK-5521 (SK-NET FDDI-UP)
|
||||
- SK-5522 (SK-NET FDDI-UP DAS)
|
||||
- SK-5541 (SK-NET FDDI-FP)
|
||||
|
@ -55,157 +60,187 @@ SysKonnect adapters:
|
|||
- SK-5841 (SK-NET FDDI-FP64)
|
||||
- SK-5843 (SK-NET FDDI-LP64)
|
||||
- SK-5844 (SK-NET FDDI-LP64 DAS)
|
||||
|
||||
Compaq adapters (not tested):
|
||||
|
||||
- Netelligent 100 FDDI DAS Fibre SC
|
||||
- Netelligent 100 FDDI SAS Fibre SC
|
||||
- Netelligent 100 FDDI DAS UTP
|
||||
- Netelligent 100 FDDI SAS UTP
|
||||
- Netelligent 100 FDDI SAS Fibre MIC
|
||||
***
|
||||
|
||||
|
||||
(3) GENERAL INFORMATION
|
||||
=======================
|
||||
3. General Information
|
||||
======================
|
||||
|
||||
From v2.01 on, the driver is integrated in the linux kernel sources.
|
||||
Therefore, the installation is the same as for any other adapter
|
||||
supported by the kernel.
|
||||
|
||||
Refer to the manual of your distribution about the installation
|
||||
of network adapters.
|
||||
|
||||
Makes my life much easier :-)
|
||||
***
|
||||
|
||||
|
||||
(4) TROUBLESHOOTING
|
||||
===================
|
||||
4. Troubleshooting
|
||||
==================
|
||||
|
||||
If you run into problems during installation, check the following items:
|
||||
|
||||
Problem: The FDDI adapter cannot be found by the driver.
|
||||
Reason: Look in /proc/pci for the following entry:
|
||||
'FDDI network controller: SysKonnect SK-FDDI-PCI ...'
|
||||
Problem:
|
||||
The FDDI adapter cannot be found by the driver.
|
||||
|
||||
Reason:
|
||||
Look in /proc/pci for the following entry:
|
||||
|
||||
'FDDI network controller: SysKonnect SK-FDDI-PCI ...'
|
||||
|
||||
If this entry exists, then the FDDI adapter has been
|
||||
found by the system and should be able to be used.
|
||||
|
||||
If this entry does not exist or if the file '/proc/pci'
|
||||
is not there, then you may have a hardware problem or PCI
|
||||
support may not be enabled in your kernel.
|
||||
|
||||
The adapter can be checked using the diagnostic program
|
||||
which is available from the SysKonnect web site:
|
||||
|
||||
www.syskonnect.de
|
||||
|
||||
Some COMPAQ machines have a problem with PCI under
|
||||
Linux. This is described in the 'PCI howto' document
|
||||
(included in some distributions or available from the
|
||||
www, e.g. at 'www.linux.org') and no workaround is available.
|
||||
|
||||
Problem: You want to use your computer as a router between
|
||||
multiple IP subnetworks (using multiple adapters), but
|
||||
Problem:
|
||||
You want to use your computer as a router between
|
||||
multiple IP subnetworks (using multiple adapters), but
|
||||
you cannot reach computers in other subnetworks.
|
||||
Reason: Either the router's kernel is not configured for IP
|
||||
|
||||
Reason:
|
||||
Either the router's kernel is not configured for IP
|
||||
forwarding or there is a problem with the routing table
|
||||
and gateway configuration in at least one of the
|
||||
computers.
|
||||
|
||||
If your problem is not listed here, please contact our
|
||||
technical support for help.
|
||||
You can send email to:
|
||||
linux@syskonnect.de
|
||||
technical support for help.
|
||||
|
||||
You can send email to: linux@syskonnect.de
|
||||
|
||||
When contacting our technical support,
|
||||
please ensure that the following information is available:
|
||||
|
||||
- System Manufacturer and Model
|
||||
- Boards in your system
|
||||
- Distribution
|
||||
- Kernel version
|
||||
|
||||
***
|
||||
|
||||
5. Function of the Adapter LEDs
|
||||
===============================
|
||||
|
||||
The functionality of the LEDs on the FDDI network adapters was
|
||||
changed in SMT version v2.82. With this new SMT version, the yellow
|
||||
LED works as a ring operational indicator. An active yellow LED
|
||||
indicates that the ring is down. The green LED on the adapter now
|
||||
works as a link indicator where an active GREEN LED indicates that
|
||||
the respective port has a physical connection.
|
||||
|
||||
With versions of SMT prior to v2.82 a ring up was indicated if the
|
||||
yellow LED was off while the green LED(s) showed the connection
|
||||
status of the adapter. During a ring down the green LED was off and
|
||||
the yellow LED was on.
|
||||
|
||||
All implementations indicate that a driver is not loaded if
|
||||
all LEDs are off.
|
||||
|
||||
|
||||
(5) FUNCTION OF THE ADAPTER LEDS
|
||||
================================
|
||||
|
||||
The functionality of the LEDs on the FDDI network adapters was
|
||||
changed in SMT version v2.82. With this new SMT version, the yellow
|
||||
LED works as a ring operational indicator. An active yellow LED
|
||||
indicates that the ring is down. The green LED on the adapter now
|
||||
works as a link indicator where an active GREEN LED indicates that
|
||||
the respective port has a physical connection.
|
||||
|
||||
With versions of SMT prior to v2.82 a ring up was indicated if the
|
||||
yellow LED was off while the green LED(s) showed the connection
|
||||
status of the adapter. During a ring down the green LED was off and
|
||||
the yellow LED was on.
|
||||
|
||||
All implementations indicate that a driver is not loaded if
|
||||
all LEDs are off.
|
||||
|
||||
***
|
||||
|
||||
|
||||
(6) HISTORY
|
||||
===========
|
||||
6. History
|
||||
==========
|
||||
|
||||
v2.06 (20000511) (In-Kernel version)
|
||||
New features:
|
||||
|
||||
- 64 bit support
|
||||
- new pci dma interface
|
||||
- in kernel 2.3.99
|
||||
|
||||
v2.05 (20000217) (In-Kernel version)
|
||||
New features:
|
||||
|
||||
- Changes for 2.3.45 kernel
|
||||
|
||||
v2.04 (20000207) (Standalone version)
|
||||
New features:
|
||||
|
||||
- Added rx/tx byte counter
|
||||
|
||||
v2.03 (20000111) (Standalone version)
|
||||
Problems fixed:
|
||||
|
||||
- Fixed printk statements from v2.02
|
||||
|
||||
v2.02 (991215) (Standalone version)
|
||||
Problems fixed:
|
||||
|
||||
- Removed unnecessary output
|
||||
- Fixed path for "printver.sh" in makefile
|
||||
|
||||
v2.01 (991122) (In-Kernel version)
|
||||
New features:
|
||||
|
||||
- Integration in Linux kernel sources
|
||||
- Support for memory mapped I/O.
|
||||
|
||||
v2.00 (991112)
|
||||
New features:
|
||||
|
||||
- Full source released under GPL
|
||||
|
||||
v1.05 (991023)
|
||||
Problems fixed:
|
||||
|
||||
- Compilation with kernel version 2.2.13 failed
|
||||
|
||||
v1.04 (990427)
|
||||
Changes:
|
||||
|
||||
- New SMT module included, changing LED functionality
|
||||
|
||||
Problems fixed:
|
||||
|
||||
- Synchronization on SMP machines was buggy
|
||||
|
||||
v1.03 (990325)
|
||||
Problems fixed:
|
||||
|
||||
- Interrupt routing on SMP machines could be incorrect
|
||||
|
||||
v1.02 (990310)
|
||||
New features:
|
||||
|
||||
- Support for kernel versions 2.2.x added
|
||||
- Kernel patch instead of private duplicate of kernel functions
|
||||
|
||||
v1.01 (980812)
|
||||
Problems fixed:
|
||||
|
||||
Connection hangup with telnet
|
||||
Slow telnet connection
|
||||
|
||||
v1.00 beta 01 (980507)
|
||||
New features:
|
||||
|
||||
None.
|
||||
|
||||
Problems fixed:
|
||||
|
||||
None.
|
||||
|
||||
Known limitations:
|
||||
- tar archive instead of standard package format (rpm).
|
||||
|
||||
- tar archive instead of standard package format (rpm).
|
||||
- FDDI statistic is empty.
|
||||
- not tested with 2.1.xx kernels
|
||||
- integration in kernel not tested
|
||||
|
@ -216,5 +251,3 @@ v1.00 beta 01 (980507)
|
|||
- does not work on some COMPAQ machines. See the PCI howto
|
||||
document for details about this problem.
|
||||
- data corruption with kernel versions below 2.0.33.
|
||||
|
||||
*** End of information file ***
|
|
@ -1,4 +1,8 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
=========================
|
||||
Stream Parser (strparser)
|
||||
=========================
|
||||
|
||||
Introduction
|
||||
============
|
||||
|
@ -34,8 +38,10 @@ that is called when a full message has been completed.
|
|||
Functions
|
||||
=========
|
||||
|
||||
strp_init(struct strparser *strp, struct sock *sk,
|
||||
const struct strp_callbacks *cb)
|
||||
::
|
||||
|
||||
strp_init(struct strparser *strp, struct sock *sk,
|
||||
const struct strp_callbacks *cb)
|
||||
|
||||
Called to initialize a stream parser. strp is a struct of type
|
||||
strparser that is allocated by the upper layer. sk is the TCP
|
||||
|
@ -43,31 +49,41 @@ strp_init(struct strparser *strp, struct sock *sk,
|
|||
callback mode; in general mode this is set to NULL. Callbacks
|
||||
are called by the stream parser (the callbacks are listed below).
|
||||
|
||||
void strp_pause(struct strparser *strp)
|
||||
::
|
||||
|
||||
void strp_pause(struct strparser *strp)
|
||||
|
||||
Temporarily pause a stream parser. Message parsing is suspended
|
||||
and no new messages are delivered to the upper layer.
|
||||
|
||||
void strp_unpause(struct strparser *strp)
|
||||
::
|
||||
|
||||
void strp_unpause(struct strparser *strp)
|
||||
|
||||
Unpause a paused stream parser.
|
||||
|
||||
void strp_stop(struct strparser *strp);
|
||||
::
|
||||
|
||||
void strp_stop(struct strparser *strp);
|
||||
|
||||
strp_stop is called to completely stop stream parser operations.
|
||||
This is called internally when the stream parser encounters an
|
||||
error, and it is called from the upper layer to stop parsing
|
||||
operations.
|
||||
|
||||
void strp_done(struct strparser *strp);
|
||||
::
|
||||
|
||||
void strp_done(struct strparser *strp);
|
||||
|
||||
strp_done is called to release any resources held by the stream
|
||||
parser instance. This must be called after the stream processor
|
||||
has been stopped.
|
||||
|
||||
int strp_process(struct strparser *strp, struct sk_buff *orig_skb,
|
||||
unsigned int orig_offset, size_t orig_len,
|
||||
size_t max_msg_size, long timeo)
|
||||
::
|
||||
|
||||
int strp_process(struct strparser *strp, struct sk_buff *orig_skb,
|
||||
unsigned int orig_offset, size_t orig_len,
|
||||
size_t max_msg_size, long timeo)
|
||||
|
||||
strp_process is called in general mode for a stream parser to
|
||||
parse an sk_buff. The number of bytes processed or a negative
|
||||
|
@ -75,7 +91,9 @@ int strp_process(struct strparser *strp, struct sk_buff *orig_skb,
|
|||
consume the sk_buff. max_msg_size is the maximum size the stream
parser will parse. timeo is the timeout for completing a message.
|
||||
|
||||
void strp_data_ready(struct strparser *strp);
|
||||
::
|
||||
|
||||
void strp_data_ready(struct strparser *strp);
|
||||
|
||||
The upper layer calls strp_data_ready when data is ready on
|
||||
the lower socket for strparser to process. This should be called
|
||||
|
@ -83,7 +101,9 @@ void strp_data_ready(struct strparser *strp);
|
|||
maximum message size is the limit of the receive socket
|
||||
buffer and message timeout is the receive timeout for the socket.
|
||||
|
||||
void strp_check_rcv(struct strparser *strp);
|
||||
::
|
||||
|
||||
void strp_check_rcv(struct strparser *strp);
|
||||
|
||||
strp_check_rcv is called to check for new messages on the socket.
|
||||
This is normally called at initialization of a stream parser
|
||||
|
@ -94,7 +114,9 @@ Callbacks
|
|||
|
||||
There are six callbacks:
|
||||
|
||||
int (*parse_msg)(struct strparser *strp, struct sk_buff *skb);
|
||||
::
|
||||
|
||||
int (*parse_msg)(struct strparser *strp, struct sk_buff *skb);
|
||||
|
||||
parse_msg is called to determine the length of the next message
|
||||
in the stream. The upper layer must implement this function. It
|
||||
|
@ -107,14 +129,16 @@ int (*parse_msg)(struct strparser *strp, struct sk_buff *skb);
|
|||
|
||||
The return values of this function are:
|
||||
|
||||
    =========    ===========================================================
    >0           indicates length of successfully parsed message
    0            indicates more data must be received to parse the message
    -ESTRPIPE    current message should not be processed by the
                 kernel, return control of the socket to userspace which
                 can proceed to read the messages itself
    other < 0    Error in parsing, give control back to userspace
                 assuming that synchronization is lost and the stream
                 is unrecoverable (application expected to close TCP socket)
    =========    ===========================================================
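
As an illustration of the return convention listed above, here is a minimal
userspace sketch of the decision a parse_msg implementation has to make. It
is not the kernel callback itself: demo_parse_msg() works on a plain byte
buffer rather than an sk_buff, and the framing (a 2-byte big-endian length
that includes the header, with 0xFFFF meaning "hand the stream to userspace")
is invented purely for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#ifndef ESTRPIPE
#define ESTRPIPE 86	/* Linux value; only needed on non-Linux hosts */
#endif

#define DEMO_HDR_LEN	2	/* hypothetical header: 16-bit length */
#define DEMO_MAX_MSG	4096	/* hypothetical upper bound on a message */

/*
 * Hypothetical stand-in for a parse_msg callback.
 *
 * Returns:
 *   >0         total length of the message that starts at buf
 *   0          not enough bytes yet to read the length header
 *   -ESTRPIPE  leave this message to userspace
 *   other <0   framing error, stream considered unrecoverable
 */
int demo_parse_msg(const uint8_t *buf, size_t avail)
{
	size_t len;

	assert(buf != NULL);

	if (avail < DEMO_HDR_LEN)
		return 0;

	len = ((size_t)buf[0] << 8) | buf[1];

	if (len == 0xFFFF)
		return -ESTRPIPE;
	if (len < DEMO_HDR_LEN || len > DEMO_MAX_MSG)
		return -EBADMSG;

	return (int)len;
}
```

Note that a positive return only reports the message length; it does not
require the whole message to have arrived yet.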
|
||||
|
||||
In the case that an error is returned (return value is less than
|
||||
zero) and the parser is in receive callback mode, then it will set
|
||||
|
@ -123,7 +147,9 @@ int (*parse_msg)(struct strparser *strp, struct sk_buff *skb);
|
|||
the current message, then the error set on the attached socket is
|
||||
ENODATA since the stream is unrecoverable in that case.
|
||||
|
||||
void (*lock)(struct strparser *strp)
|
||||
::
|
||||
|
||||
void (*lock)(struct strparser *strp)
|
||||
|
||||
The lock callback is called to lock the strp structure when
|
||||
the strparser is performing an asynchronous operation (such as
|
||||
|
@ -131,14 +157,18 @@ void (*lock)(struct strparser *strp)
|
|||
function is to lock_sock for the associated socket. In general
|
||||
mode the callback must be set appropriately.
|
||||
|
||||
void (*unlock)(struct strparser *strp)
|
||||
::
|
||||
|
||||
void (*unlock)(struct strparser *strp)
|
||||
|
||||
The unlock callback is called to release the lock obtained
|
||||
by the lock callback. In receive callback mode the default
|
||||
function is release_sock for the associated socket. In general
|
||||
mode the callback must be set appropriately.
|
||||
|
||||
void (*rcv_msg)(struct strparser *strp, struct sk_buff *skb);
|
||||
::
|
||||
|
||||
void (*rcv_msg)(struct strparser *strp, struct sk_buff *skb);
|
||||
|
||||
rcv_msg is called when a full message has been received and
|
||||
is queued. The callee must consume the sk_buff; it can
|
||||
|
@ -152,7 +182,9 @@ void (*rcv_msg)(struct strparser *strp, struct sk_buff *skb);
|
|||
the length of the message. skb->len - offset may be greater
|
||||
than full_len since strparser does not trim the skb.
|
||||
|
||||
int (*read_sock_done)(struct strparser *strp, int err);
|
||||
::
|
||||
|
||||
int (*read_sock_done)(struct strparser *strp, int err);
|
||||
|
||||
read_sock_done is called when the stream parser is done reading
|
||||
the TCP socket in receive callback mode. The stream parser may
|
||||
|
@ -160,7 +192,9 @@ int (*read_sock_done)(struct strparser *strp, int err);
|
|||
to occur when exiting the loop. If the callback is not set (NULL
|
||||
in strp_init) a default function is used.
|
||||
|
||||
void (*abort_parser)(struct strparser *strp, int err);
|
||||
::
|
||||
|
||||
void (*abort_parser)(struct strparser *strp, int err);
|
||||
|
||||
This function is called when the stream parser encounters an error
|
||||
in parsing. The default function stops the stream parser and
|
||||
|
@ -204,4 +238,3 @@ Author
|
|||
======
|
||||
|
||||
Tom Herbert (tom@quantonium.net)
|
||||
|
|
@ -1,7 +1,13 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
.. include:: <isonum.txt>
|
||||
|
||||
===============================================
|
||||
Ethernet switch device driver model (switchdev)
|
||||
===============================================
|
||||
Copyright (c) 2014 Jiri Pirko <jiri@resnulli.us>
|
||||
Copyright (c) 2014-2015 Scott Feldman <sfeldma@gmail.com>
|
||||
|
||||
Copyright |copy| 2014 Jiri Pirko <jiri@resnulli.us>
|
||||
|
||||
Copyright |copy| 2014-2015 Scott Feldman <sfeldma@gmail.com>
|
||||
|
||||
|
||||
The Ethernet switch device driver model (switchdev) is an in-kernel driver
|
||||
|
@ -12,53 +18,57 @@ Figure 1 is a block diagram showing the components of the switchdev model for
|
|||
an example setup using a data-center-class switch ASIC chip. Other setups
|
||||
with SR-IOV or soft switches, such as OVS, are possible.
|
||||
|
||||
::

                                 User-space tools

           user space                   |
          +-------------------------------------------------------------------+
           kernel                       | Netlink
                                        |
                         +--------------+-------------------------------+
                         |         Network stack                        |
                         |           (Linux)                            |
                         |                                              |
                         +----------------------------------------------+

                               sw1p2     sw1p4     sw1p6
                          sw1p1  +  sw1p3  +  sw1p5  +          eth1
                            +    |    +    |    +    |            +
                            |    |    |    |    |    |            |
                         +--+----+----+----+----+----+---+  +-----+-----+
                         |         Switch driver         |  |    mgmt   |
                         |        (this document)        |  |   driver  |
                         |                               |  |           |
                         +--------------+----------------+  +-----------+
                                        |
           kernel                       | HW bus (eg PCI)
          +-------------------------------------------------------------------+
           hardware                     |
                         +--------------+----------------+
                         |         Switch device (sw1)   |
                         |  +----+                       +--------+
                         |  |    v offloaded data path   | mgmt port
                         |  |    |                       |
                         +--|----|----+----+----+----+---+
                            |    |    |    |    |    |
                            +    +    +    +    +    +
                           p1   p2   p3   p4   p5   p6

                                 front-panel ports


                                        Fig 1.
|
||||
|
||||
|
||||
Include Files
|
||||
-------------
|
||||
|
||||
#include <linux/netdevice.h>
|
||||
#include <net/switchdev.h>
|
||||
::
|
||||
|
||||
#include <linux/netdevice.h>
|
||||
#include <net/switchdev.h>
|
||||
|
||||
|
||||
Configuration
|
||||
|
@ -114,10 +124,10 @@ Using port PHYS name (ndo_get_phys_port_name) for the key is particularly
|
|||
useful for dynamically-named ports where the device names its ports based on
|
||||
external configuration. For example, if a physical 40G port is split logically
|
||||
into 4 10G ports, resulting in 4 port netdevs, the device can give a unique
|
||||
name for each port using port PHYS name. The udev rule would be:
|
||||
name for each port using port PHYS name. The udev rule would be::
|
||||
|
||||
SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="<phys_switch_id>", \
|
||||
ATTR{phys_port_name}!="", NAME="swX$attr{phys_port_name}"
|
||||
SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="<phys_switch_id>", \
|
||||
ATTR{phys_port_name}!="", NAME="swX$attr{phys_port_name}"
|
||||
|
||||
Suggested naming convention is "swXpYsZ", where X is the switch name or ID, Y
|
||||
is the port name or ID, and Z is the sub-port name or ID. For example, sw1p1s0
|
||||
|
@ -173,7 +183,7 @@ Static FDB Entries
|
|||
|
||||
The switchdev driver should implement ndo_fdb_add, ndo_fdb_del and ndo_fdb_dump
|
||||
to support static FDB entries installed to the device. Static bridge FDB
|
||||
entries are installed, for example, using iproute2 bridge cmd:
|
||||
entries are installed, for example, using iproute2 bridge cmd::
|
||||
|
||||
bridge fdb add ADDR dev DEV [vlan VID] [self]
|
||||
|
||||
|
@ -185,7 +195,7 @@ XXX: what should be done if offloading this rule to hardware fails (for
|
|||
example, due to full capacity in hardware tables) ?
|
||||
|
||||
Note: by default, the bridge does not filter on VLAN and only bridges untagged
|
||||
traffic. To enable VLAN support, turn on VLAN filtering:
|
||||
traffic. To enable VLAN support, turn on VLAN filtering::
|
||||
|
||||
echo 1 >/sys/class/net/<bridge>/bridge/vlan_filtering
|
||||
|
||||
|
@ -194,7 +204,7 @@ Notification of Learned/Forgotten Source MAC/VLANs
|
|||
|
||||
The switch device will learn/forget source MAC address/VLAN on ingress packets
|
||||
and notify the switch driver of the mac/vlan/port tuples. The switch driver,
|
||||
in turn, will notify the bridge driver using the switchdev notifier call:
|
||||
in turn, will notify the bridge driver using the switchdev notifier call::
|
||||
|
||||
err = call_switchdev_notifiers(val, dev, info, extack);
|
||||
|
||||
|
@ -202,7 +212,7 @@ Where val is SWITCHDEV_FDB_ADD when learning and SWITCHDEV_FDB_DEL when
|
|||
forgetting, and info points to a struct switchdev_notifier_fdb_info. On
|
||||
SWITCHDEV_FDB_ADD, the bridge driver will install the FDB entry into the
|
||||
bridge's FDB and mark the entry as NTF_EXT_LEARNED. The iproute2 bridge
|
||||
command will label these entries "offload":
|
||||
command will label these entries "offload"::
|
||||
|
||||
$ bridge fdb
|
||||
52:54:00:12:35:01 dev sw1p1 master br0 permanent
|
||||
|
@ -219,11 +229,11 @@ command will label these entries "offload":
|
|||
01:00:5e:00:00:01 dev br0 self permanent
|
||||
33:33:ff:12:35:01 dev br0 self permanent
|
||||
|
||||
Learning on the port should be disabled on the bridge using the bridge command:
|
||||
Learning on the port should be disabled on the bridge using the bridge command::
|
||||
|
||||
bridge link set dev DEV learning off
|
||||
|
||||
Learning on the device port should be enabled, as well as learning_sync:
|
||||
Learning on the device port should be enabled, as well as learning_sync::
|
||||
|
||||
bridge link set dev DEV learning on self
|
||||
bridge link set dev DEV learning_sync on self
|
||||
|
@ -314,12 +324,16 @@ forwards the packet to the matching FIB entry's nexthop(s) egress ports.
|
|||
|
||||
To program the device, the driver has to register a FIB notifier handler
|
||||
using register_fib_notifier. The following events are available:
|
||||
  ===================  ===================================================
  FIB_EVENT_ENTRY_ADD  used for both adding a new FIB entry to the device,
                       or modifying an existing entry on the device.
  FIB_EVENT_ENTRY_DEL  used for removing a FIB entry
  FIB_EVENT_RULE_ADD,
  FIB_EVENT_RULE_DEL   used to propagate FIB rule changes
  ===================  ===================================================
|
||||
|
||||
FIB_EVENT_ENTRY_ADD and FIB_EVENT_ENTRY_DEL events pass::
|
||||
|
||||
struct fib_entry_notifier_info {
|
||||
struct fib_notifier_info info; /* must be first */
|
||||
|
@ -332,12 +346,12 @@ FIB_EVENT_ENTRY_ADD and FIB_EVENT_ENTRY_DEL events pass:
|
|||
u32 nlflags;
|
||||
};
|
||||
|
||||
to add/modify/delete IPv4 dst/dest_len prefix on table tb_id. The *fi
|
||||
structure holds details on the route and route's nexthops. *dev is one of the
|
||||
port netdevs mentioned in the route's next hop list.
|
||||
to add/modify/delete IPv4 dst/dest_len prefix on table tb_id. The ``*fi``
|
||||
structure holds details on the route and route's nexthops. ``*dev`` is one
|
||||
of the port netdevs mentioned in the route's next hop list.
|
||||
|
||||
Routes offloaded to the device are labeled with "offload" in the ip route
|
||||
listing:
|
||||
listing::
|
||||
|
||||
$ ip route show
|
||||
default via 192.168.0.2 dev eth0
|
|
@ -0,0 +1,29 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
================================
|
||||
TC Actions - Environmental Rules
|
||||
================================
|
||||
|
||||
|
||||
The "environmental" rules for authors of any new tc actions are:
|
||||
|
||||
1) If you stealeth or borroweth any packet thou shalt be branching
|
||||
from the righteous path and thou shalt cloneth.
|
||||
|
||||
For example if your action queues a packet to be processed later,
|
||||
or intentionally branches by redirecting a packet, then you need to
|
||||
clone the packet.
|
||||
|
||||
2) If you munge any packet thou shalt call pskb_expand_head in the case
|
||||
someone else is referencing the skb. After that you "own" the skb.
|
||||
|
||||
3) Dropping packets you don't own is a no-no. You simply return
|
||||
TC_ACT_SHOT to the caller and they will drop it.
|
||||
|
||||
The "environmental" rules for callers of actions (qdiscs etc) are:
|
||||
|
||||
#) Thou art responsible for freeing anything returned as being
|
||||
TC_ACT_SHOT/STOLEN/QUEUED. If none of TC_ACT_SHOT/STOLEN/QUEUED is
|
||||
returned, then all is great and you don't need to do anything.
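
The caller-side rule above can be restated as a tiny helper. This is a
userspace sketch only: the DEMO_TC_ACT_* names and demo_caller_must_free()
are hypothetical, and the numeric values are copied from the TC_ACT_*
definitions in <linux/pkt_cls.h> so that the example stays self-contained.

```c
#include <assert.h>
#include <stdbool.h>

/* Values mirror TC_ACT_* in <linux/pkt_cls.h>; names are illustrative. */
enum demo_tc_act {
	DEMO_TC_ACT_OK     = 0,
	DEMO_TC_ACT_SHOT   = 2,
	DEMO_TC_ACT_PIPE   = 3,
	DEMO_TC_ACT_STOLEN = 4,
	DEMO_TC_ACT_QUEUED = 5,
};

/*
 * Return true if, per the rule above, the caller of an action (a qdisc,
 * for example) is responsible for freeing the packet after this verdict.
 */
bool demo_caller_must_free(int verdict)
{
	assert(verdict >= 0);

	switch (verdict) {
	case DEMO_TC_ACT_SHOT:
	case DEMO_TC_ACT_STOLEN:
	case DEMO_TC_ACT_QUEUED:
		return true;
	default:
		return false;	/* nothing to do for the caller */
	}
}
```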
|
||||
|
||||
Post on netdev if something is unclear.
|
|
@ -1,24 +0,0 @@
|
|||
|
||||
The "environmental" rules for authors of any new tc actions are:
|
||||
|
||||
1) If you stealeth or borroweth any packet thou shalt be branching
|
||||
from the righteous path and thou shalt cloneth.
|
||||
|
||||
For example if your action queues a packet to be processed later,
|
||||
or intentionally branches by redirecting a packet, then you need to
|
||||
clone the packet.
|
||||
|
||||
2) If you munge any packet thou shalt call pskb_expand_head in the case
|
||||
someone else is referencing the skb. After that you "own" the skb.
|
||||
|
||||
3) Dropping packets you don't own is a no-no. You simply return
|
||||
TC_ACT_SHOT to the caller and they will drop it.
|
||||
|
||||
The "environmental" rules for callers of actions (qdiscs etc) are:
|
||||
|
||||
*) Thou art responsible for freeing anything returned as being
|
||||
TC_ACT_SHOT/STOLEN/QUEUED. If none of TC_ACT_SHOT/STOLEN/QUEUED is
|
||||
returned, then all is great and you don't need to do anything.
|
||||
|
||||
Post on netdev if something is unclear.
|
||||
|
|
@ -1,5 +1,9 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
====================
|
||||
Thin-streams and TCP
|
||||
====================
|
||||
|
||||
A wide range of Internet-based services that use reliable transport
|
||||
protocols display what we call thin-stream properties. This means
|
||||
that the application sends data with such a low rate that the
|
||||
|
@ -42,6 +46,7 @@ References
|
|||
==========
|
||||
More information on the modifications, as well as a wide range of
|
||||
experimental data can be found here:
|
||||
|
||||
"Improving latency for interactive, thin-stream applications over
|
||||
reliable transport"
|
||||
http://simula.no/research/nd/publications/Simula.nd.477/simula_pdf_file
|
|
@ -1,2 +1,8 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
====
|
||||
Team
|
||||
====
|
||||
|
||||
Team devices are driven from userspace via libteam library which is here:
|
||||
https://github.com/jpirko/libteam
|
|
@ -1,9 +1,16 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
============
|
||||
Timestamping
|
||||
============
|
||||
|
||||
|
||||
1. Control Interfaces
|
||||
=====================
|
||||
|
||||
The interfaces for receiving network packet timestamps are:
|
||||
|
||||
* SO_TIMESTAMP
|
||||
SO_TIMESTAMP
|
||||
Generates a timestamp for each incoming packet in (not necessarily
|
||||
monotonic) system time. Reports the timestamp via recvmsg() in a
|
||||
control message in usec resolution.
|
||||
|
@ -13,7 +20,7 @@ The interfaces for receiving network packages timestamps are:
|
|||
SO_TIMESTAMP_OLD and in struct __kernel_sock_timeval for
|
||||
SO_TIMESTAMP_NEW options respectively.
|
||||
|
||||
* SO_TIMESTAMPNS
|
||||
SO_TIMESTAMPNS
|
||||
Same timestamping mechanism as SO_TIMESTAMP, but reports the
|
||||
timestamp as struct timespec in nsec resolution.
|
||||
SO_TIMESTAMPNS is defined as SO_TIMESTAMPNS_NEW or SO_TIMESTAMPNS_OLD
|
||||
|
@ -22,17 +29,18 @@ The interfaces for receiving network packages timestamps are:
|
|||
and in struct __kernel_timespec for SO_TIMESTAMPNS_NEW options
|
||||
respectively.
|
||||
|
||||
* IP_MULTICAST_LOOP + SO_TIMESTAMP[NS]
|
||||
IP_MULTICAST_LOOP + SO_TIMESTAMP[NS]
|
||||
Only for multicast: approximate transmit timestamp obtained by
|
||||
reading the looped packet receive timestamp.
|
||||
|
||||
* SO_TIMESTAMPING
|
||||
SO_TIMESTAMPING
|
||||
Generates timestamps on reception, transmission or both. Supports
|
||||
multiple timestamp sources, including hardware. Supports generating
|
||||
timestamps for stream sockets.
|
||||
|
||||
|
||||
1.1 SO_TIMESTAMP (also SO_TIMESTAMP_OLD and SO_TIMESTAMP_NEW):
|
||||
1.1 SO_TIMESTAMP (also SO_TIMESTAMP_OLD and SO_TIMESTAMP_NEW)
|
||||
-------------------------------------------------------------
|
||||
|
||||
This socket option enables timestamping of datagrams on the reception
|
||||
path. Because the destination socket, if any, is not known early in
|
||||
|
@ -59,10 +67,11 @@ struct __kernel_timespec format.
|
|||
SO_TIMESTAMPNS_OLD returns incorrect timestamps after the year 2038
|
||||
on 32 bit machines.
|
||||
|
||||
1.3 SO_TIMESTAMPING (also SO_TIMESTAMPING_OLD and SO_TIMESTAMPING_NEW):
|
||||
1.3 SO_TIMESTAMPING (also SO_TIMESTAMPING_OLD and SO_TIMESTAMPING_NEW)
|
||||
----------------------------------------------------------------------
|
||||
|
||||
Supports multiple types of timestamp requests. As a result, this
|
||||
socket option takes a bitmap of flags, not a boolean. In
|
||||
socket option takes a bitmap of flags, not a boolean. In::
|
||||
|
||||
err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val));
|
||||
|
||||
|
@ -76,6 +85,7 @@ be enabled for individual sendmsg calls using cmsg (1.3.4).
|
|||
|
||||
|
||||
1.3.1 Timestamp Generation
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Some bits are requests to the stack to try to generate timestamps. Any
|
||||
combination of them is valid. Changes to these bits apply to newly
|
||||
|
@ -106,7 +116,6 @@ SOF_TIMESTAMPING_TX_SOFTWARE:
|
|||
require driver support and may not be available for all devices.
|
||||
This flag can be enabled via both socket options and control messages.
|
||||
|
||||
|
||||
SOF_TIMESTAMPING_TX_SCHED:
|
||||
Request tx timestamps prior to entering the packet scheduler. Kernel
|
||||
transmit latency is, if long, often dominated by queuing delay. The
|
||||
|
@ -132,6 +141,7 @@ SOF_TIMESTAMPING_TX_ACK:
|
|||
|
||||
|
||||
1.3.2 Timestamp Reporting
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
The other three bits control which timestamps will be reported in a
|
||||
generated control message. Changes to the bits take immediate
|
||||
|
@ -151,11 +161,11 @@ SOF_TIMESTAMPING_RAW_HARDWARE:
|
|||
|
||||
|
||||
1.3.3 Timestamp Options
|
||||
^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
The interface supports the options
|
||||
|
||||
SOF_TIMESTAMPING_OPT_ID:
|
||||
|
||||
Generate a unique identifier along with each packet. A process can
|
||||
have multiple concurrent timestamping requests outstanding. Packets
|
||||
can be reordered in the transmit path, for instance in the packet
|
||||
|
@ -183,7 +193,6 @@ SOF_TIMESTAMPING_OPT_ID:
|
|||
|
||||
|
||||
SOF_TIMESTAMPING_OPT_CMSG:
|
||||
|
||||
Support recv() cmsg for all timestamped packets. Control messages
|
||||
are already supported unconditionally on all packets with receive
|
||||
timestamps and on IPv6 packets with transmit timestamp. This option
|
||||
|
@ -193,7 +202,6 @@ SOF_TIMESTAMPING_OPT_CMSG:
|
|||
|
||||
|
||||
SOF_TIMESTAMPING_OPT_TSONLY:
|
||||
|
||||
Applies to transmit timestamps only. Makes the kernel return the
|
||||
timestamp as a cmsg alongside an empty packet, as opposed to
|
||||
alongside the original packet. This reduces the amount of memory
|
||||
|
@ -202,7 +210,6 @@ SOF_TIMESTAMPING_OPT_TSONLY:
|
|||
This option disables SOF_TIMESTAMPING_OPT_CMSG.
|
||||
|
||||
SOF_TIMESTAMPING_OPT_STATS:
|
||||
|
||||
Optional stats that are obtained along with the transmit timestamps.
|
||||
It must be used together with SOF_TIMESTAMPING_OPT_TSONLY. When the
|
||||
transmit timestamp is available, the stats are available in a
|
||||
|
@ -213,7 +220,6 @@ SOF_TIMESTAMPING_OPT_STATS:
|
|||
data was limited by peer's receiver window.
|
||||
|
||||
SOF_TIMESTAMPING_OPT_PKTINFO:
|
||||
|
||||
Enable the SCM_TIMESTAMPING_PKTINFO control message for incoming
|
||||
packets with hardware timestamps. The message contains struct
|
||||
scm_ts_pktinfo, which supplies the index of the real interface which
|
||||
|
@ -223,7 +229,6 @@ SOF_TIMESTAMPING_OPT_PKTINFO:
|
|||
other fields, but they are reserved and undefined.
|
||||
|
||||
SOF_TIMESTAMPING_OPT_TX_SWHW:
|
||||
|
||||
Request both hardware and software timestamps for outgoing packets
|
||||
when SOF_TIMESTAMPING_TX_HARDWARE and SOF_TIMESTAMPING_TX_SOFTWARE
|
||||
are enabled at the same time. If both timestamps are generated,
|
||||
|
@ -242,12 +247,13 @@ combined with SOF_TIMESTAMPING_OPT_TSONLY.
|
|||
|
||||
|
||||
1.3.4. Enabling timestamps via control messages
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
In addition to socket options, timestamp generation can be requested
|
||||
per write via cmsg, only for SOF_TIMESTAMPING_TX_* (see Section 1.3.1).
|
||||
Using this feature, applications can sample timestamps per sendmsg()
|
||||
without paying the overhead of enabling and disabling timestamps via
|
||||
setsockopt:
|
||||
setsockopt::
|
||||
|
||||
struct msghdr *msg;
|
||||
...
|
||||
|
@ -264,7 +270,7 @@ The SOF_TIMESTAMPING_TX_* flags set via cmsg will override
|
|||
the SOF_TIMESTAMPING_TX_* flags set via setsockopt.
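
As a minimal sketch of the per-write request described in this section, the
helper below attaches a single SO_TIMESTAMPING control message to a msghdr
before it is handed to sendmsg(). The function name request_tx_timestamp() and
its caller-supplied buffer are assumptions made for illustration; they are not
part of the kernel interface. Only the cmsg level, type and 32-bit flag payload
come from the text above.

```c
#include <linux/net_tstamp.h>   /* SOF_TIMESTAMPING_TX_* flags */
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>         /* struct msghdr, CMSG_* macros, SO_TIMESTAMPING */

/*
 * Hypothetical helper: attach one SO_TIMESTAMPING control message carrying
 * tx_flags to msg. buf is caller-owned storage of buflen bytes; it must be
 * suitably aligned for struct cmsghdr and stay valid until sendmsg() returns.
 * Returns 0 on success, -1 if buf is too small.
 */
int request_tx_timestamp(struct msghdr *msg, void *buf, size_t buflen,
			 uint32_t tx_flags)
{
	struct cmsghdr *cmsg;

	if (buflen < CMSG_SPACE(sizeof(tx_flags)))
		return -1;

	memset(buf, 0, buflen);
	msg->msg_control = buf;
	msg->msg_controllen = CMSG_SPACE(sizeof(tx_flags));

	cmsg = CMSG_FIRSTHDR(msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SO_TIMESTAMPING;
	cmsg->cmsg_len = CMSG_LEN(sizeof(tx_flags));
	/* memcpy avoids alignment assumptions about CMSG_DATA() */
	memcpy(CMSG_DATA(cmsg), &tx_flags, sizeof(tx_flags));

	return 0;
}
```

A caller would fill in msg_iov and msg_name as usual, call the helper with,
for example, SOF_TIMESTAMPING_TX_SOFTWARE, and then pass the msghdr to
sendmsg(). Reporting still has to be enabled separately via setsockopt.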
|
||||
|
||||
Moreover, applications must still enable timestamp reporting via
|
||||
setsockopt to receive timestamps:
|
||||
setsockopt to receive timestamps::
|
||||
|
||||
__u32 val = SOF_TIMESTAMPING_SOFTWARE |
|
||||
SOF_TIMESTAMPING_OPT_ID /* or any other flag */;
|
||||
|
@ -272,6 +278,7 @@ setsockopt to receive timestamps:
|
|||
|
||||
|
||||
1.4 Bytestream Timestamps
|
||||
-------------------------
|
||||
|
||||
The SO_TIMESTAMPING interface supports timestamping of bytes in a
|
||||
bytestream. Each request is interpreted as a request for when the
|
||||
|
@ -331,6 +338,7 @@ unusual.
|
|||
|
||||
|
||||
2 Data Interfaces
|
||||
==================
|
||||
|
||||
Timestamps are read using the ancillary data feature of recvmsg().
|
||||
See `man 3 cmsg` for details of this interface. The socket manual
|
||||
|
@ -339,20 +347,21 @@ SO_TIMESTAMP and SO_TIMESTAMPNS records can be retrieved.
|
|||
|
||||
|
||||
2.1 SCM_TIMESTAMPING records
|
||||
----------------------------
|
||||
|
||||
These timestamps are returned in a control message with cmsg_level
|
||||
SOL_SOCKET, cmsg_type SCM_TIMESTAMPING, and payload of type
|
||||
|
||||
For SO_TIMESTAMPING_OLD:
|
||||
For SO_TIMESTAMPING_OLD::
|
||||
|
||||
struct scm_timestamping {
|
||||
struct timespec ts[3];
|
||||
};
|
||||
struct scm_timestamping {
|
||||
struct timespec ts[3];
|
||||
};
|
||||
|
||||
For SO_TIMESTAMPING_NEW:
|
||||
For SO_TIMESTAMPING_NEW::
|
||||
|
||||
struct scm_timestamping64 {
|
||||
struct __kernel_timespec ts[3];
|
||||
struct scm_timestamping64 {
|
||||
struct __kernel_timespec ts[3];
|
||||
|
||||
Always use SO_TIMESTAMPING_NEW to get timestamps in
|
||||
struct scm_timestamping64 format.
|
||||
|
@ -377,6 +386,7 @@ in ts[0] when a real software timestamp is missing. This happens also
|
|||
on hardware transmit timestamps.
|
||||
|
||||
2.1.1 Transmit timestamps with MSG_ERRQUEUE
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
For transmit timestamps the outgoing packet is looped back to the
|
||||
socket's error queue with the send timestamp(s) attached. A process
|
||||
|
@ -393,6 +403,7 @@ embeds the struct scm_timestamping.
|
|||
|
||||
|
||||
2.1.1.2 Timestamp types
|
||||
~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The semantics of the three struct timespec are defined by field
|
||||
ee_info in the extended error structure. It contains a value of
|
||||
|
@ -408,6 +419,7 @@ case the timestamp is stored in ts[0].
|
|||
|
||||
|
||||
2.1.1.3 Fragmentation
|
||||
~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Fragmentation of outgoing datagrams is rare, but is possible, e.g., by
|
||||
explicitly disabling PMTU discovery. If an outgoing packet is fragmented,
|
||||
|
@ -416,6 +428,7 @@ socket.
|
|||
|
||||
|
||||
2.1.1.4 Packet Payload
|
||||
~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The calling application is often not interested in receiving the whole
|
||||
packet payload that it passed to the stack originally: the socket
|
||||
|
@ -427,6 +440,7 @@ however, the full packet is queued, taking up budget from SO_RCVBUF.
|
|||
|
||||
|
||||
2.1.1.5 Blocking Read
|
||||
~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Reading from the error queue is always a non-blocking operation. To
|
||||
block waiting on a timestamp, use poll or select. poll() will return
|
||||
|
@ -436,6 +450,7 @@ ignored on request. See also `man 2 poll`.
|
|||
|
||||
|
||||
2.1.2 Receive timestamps
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
On reception, there is no reason to read from the socket error queue.
|
||||
The SCM_TIMESTAMPING ancillary data is sent along with the packet data
|
||||
|
@ -447,16 +462,17 @@ is again deprecated and ts[2] holds a hardware timestamp if set.
|
|||
|
||||
|
||||
3. Hardware Timestamping configuration: SIOCSHWTSTAMP and SIOCGHWTSTAMP
|
||||
=======================================================================
|
||||
|
||||
Hardware time stamping must also be initialized for each device driver
|
||||
that is expected to do hardware time stamping. The parameter is defined in
|
||||
include/uapi/linux/net_tstamp.h as:
|
||||
include/uapi/linux/net_tstamp.h as::
|
||||
|
||||
struct hwtstamp_config {
|
||||
int flags; /* no flags defined right now, must be zero */
|
||||
int tx_type; /* HWTSTAMP_TX_* */
|
||||
int rx_filter; /* HWTSTAMP_FILTER_* */
|
||||
};
|
||||
struct hwtstamp_config {
|
||||
int flags; /* no flags defined right now, must be zero */
|
||||
int tx_type; /* HWTSTAMP_TX_* */
|
||||
int rx_filter; /* HWTSTAMP_FILTER_* */
|
||||
};
|
||||
|
||||
Desired behavior is passed into the kernel and to a specific device by
|
||||
calling ioctl(SIOCSHWTSTAMP) with a pointer to a struct ifreq whose
|
||||
|
@ -487,44 +503,47 @@ Any process can read the actual configuration by passing this
|
|||
structure to ioctl(SIOCGHWTSTAMP) in the same way. However, this has
|
||||
not been implemented in all drivers.
|
||||
|
||||
/* possible values for hwtstamp_config->tx_type */
|
||||
enum {
|
||||
/*
|
||||
* no outgoing packet will need hardware time stamping;
|
||||
* should a packet arrive which asks for it, no hardware
|
||||
* time stamping will be done
|
||||
*/
|
||||
HWTSTAMP_TX_OFF,
|
||||
::
|
||||
|
||||
/*
|
||||
* enables hardware time stamping for outgoing packets;
|
||||
* the sender of the packet decides which are to be
|
||||
* time stamped by setting SOF_TIMESTAMPING_TX_SOFTWARE
|
||||
* before sending the packet
|
||||
*/
|
||||
HWTSTAMP_TX_ON,
|
||||
};
|
||||
/* possible values for hwtstamp_config->tx_type */
|
||||
enum {
|
||||
/*
|
||||
* no outgoing packet will need hardware time stamping;
|
||||
* should a packet arrive which asks for it, no hardware
|
||||
* time stamping will be done
|
||||
*/
|
||||
HWTSTAMP_TX_OFF,
|
||||
|
||||
/* possible values for hwtstamp_config->rx_filter */
|
||||
enum {
|
||||
/* time stamp no incoming packet at all */
|
||||
HWTSTAMP_FILTER_NONE,
|
||||
/*
|
||||
* enables hardware time stamping for outgoing packets;
|
||||
* the sender of the packet decides which are to be
|
||||
* time stamped by setting SOF_TIMESTAMPING_TX_SOFTWARE
|
||||
* before sending the packet
|
||||
*/
|
||||
HWTSTAMP_TX_ON,
|
||||
};
|
||||
|
||||
/* time stamp any incoming packet */
|
||||
HWTSTAMP_FILTER_ALL,
|
||||
/* possible values for hwtstamp_config->rx_filter */
|
||||
enum {
|
||||
/* time stamp no incoming packet at all */
|
||||
HWTSTAMP_FILTER_NONE,
|
||||
|
||||
/* return value: time stamp all packets requested plus some others */
|
||||
HWTSTAMP_FILTER_SOME,
|
||||
/* time stamp any incoming packet */
|
||||
HWTSTAMP_FILTER_ALL,
|
||||
|
||||
/* PTP v1, UDP, any kind of event packet */
|
||||
HWTSTAMP_FILTER_PTP_V1_L4_EVENT,
|
||||
/* return value: time stamp all packets requested plus some others */
|
||||
HWTSTAMP_FILTER_SOME,
|
||||
|
||||
/* for the complete list of values, please check
|
||||
* the include file include/uapi/linux/net_tstamp.h
|
||||
*/
|
||||
};
|
||||
/* PTP v1, UDP, any kind of event packet */
|
||||
HWTSTAMP_FILTER_PTP_V1_L4_EVENT,
|
||||
|
||||
/* for the complete list of values, please check
|
||||
* the include file include/uapi/linux/net_tstamp.h
|
||||
*/
|
||||
};
|
||||
|
||||
3.1 Hardware Timestamping Implementation: Device Drivers
|
||||
--------------------------------------------------------
|
||||
|
||||
A driver which supports hardware time stamping must support the
|
||||
SIOCSHWTSTAMP ioctl and update the supplied struct hwtstamp_config with
|
||||
|
@ -533,22 +552,23 @@ should also support SIOCGHWTSTAMP.
|
|||
|
||||
Time stamps for received packets must be stored in the skb. To get a pointer
|
||||
to the shared time stamp structure of the skb call skb_hwtstamps(). Then
|
||||
set the time stamps in the structure:
|
||||
set the time stamps in the structure::
|
||||
|
||||
struct skb_shared_hwtstamps {
|
||||
/* hardware time stamp transformed into duration
|
||||
* since arbitrary point in time
|
||||
*/
|
||||
ktime_t hwtstamp;
|
||||
};
|
||||
struct skb_shared_hwtstamps {
|
||||
/* hardware time stamp transformed into duration
|
||||
* since arbitrary point in time
|
||||
*/
|
||||
ktime_t hwtstamp;
|
||||
};
|
||||
|
||||
Time stamps for outgoing packets are to be generated as follows:
|
||||
|
||||
- In hard_start_xmit(), check if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)
|
||||
is set non-zero. If yes, then the driver is expected to do hardware time
|
||||
stamping.
|
||||
- If this is possible for the skb and requested, then declare
|
||||
that the driver is doing the time stamping by setting the flag
|
||||
SKBTX_IN_PROGRESS in skb_shinfo(skb)->tx_flags , e.g. with
|
||||
SKBTX_IN_PROGRESS in skb_shinfo(skb)->tx_flags , e.g. with::
|
||||
|
||||
skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
|
||||
|
|
@ -1,3 +1,6 @@
|
|||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
=========================
|
||||
Transparent proxy support
|
||||
=========================
|
||||
|
||||
|
@ -11,39 +14,39 @@ From Linux 4.18 transparent proxy support is also available in nf_tables.
|
|||
================================
|
||||
|
||||
The idea is that you identify packets with destination address matching a local
|
||||
socket on your box, set the packet mark to a certain value:
|
||||
socket on your box, set the packet mark to a certain value::
|
||||
|
||||
# iptables -t mangle -N DIVERT
|
||||
# iptables -t mangle -A PREROUTING -p tcp -m socket -j DIVERT
|
||||
# iptables -t mangle -A DIVERT -j MARK --set-mark 1
|
||||
# iptables -t mangle -A DIVERT -j ACCEPT
|
||||
# iptables -t mangle -N DIVERT
|
||||
# iptables -t mangle -A PREROUTING -p tcp -m socket -j DIVERT
|
||||
# iptables -t mangle -A DIVERT -j MARK --set-mark 1
|
||||
# iptables -t mangle -A DIVERT -j ACCEPT
|
||||
|
||||
Alternatively you can do this in nft with the following commands:
|
||||
Alternatively you can do this in nft with the following commands::
|
||||
|
||||
# nft add table filter
|
||||
# nft add chain filter divert "{ type filter hook prerouting priority -150; }"
|
||||
# nft add rule filter divert meta l4proto tcp socket transparent 1 meta mark set 1 accept
|
||||
# nft add table filter
|
||||
# nft add chain filter divert "{ type filter hook prerouting priority -150; }"
|
||||
# nft add rule filter divert meta l4proto tcp socket transparent 1 meta mark set 1 accept
|
||||
|
||||
And then match on that value using policy routing to have those packets
|
||||
delivered locally:
|
||||
delivered locally::
|
||||
|
||||
# ip rule add fwmark 1 lookup 100
|
||||
# ip route add local 0.0.0.0/0 dev lo table 100
|
||||
# ip rule add fwmark 1 lookup 100
|
||||
# ip route add local 0.0.0.0/0 dev lo table 100
|
||||
|
||||
Because of certain restrictions in the IPv4 routing output code you'll have to
|
||||
modify your application to allow it to send datagrams _from_ non-local IP
|
||||
addresses. All you have to do is enable the (SOL_IP, IP_TRANSPARENT) socket
|
||||
option before calling bind:
|
||||
option before calling bind::
|
||||
|
||||
fd = socket(AF_INET, SOCK_STREAM, 0);
|
||||
/* - 8< -*/
|
||||
int value = 1;
|
||||
setsockopt(fd, SOL_IP, IP_TRANSPARENT, &value, sizeof(value));
|
||||
/* - 8< -*/
|
||||
name.sin_family = AF_INET;
|
||||
name.sin_port = htons(0xCAFE);
|
||||
name.sin_addr.s_addr = htonl(0xDEADBEEF);
|
||||
bind(fd, &name, sizeof(name));
|
||||
fd = socket(AF_INET, SOCK_STREAM, 0);
|
||||
/* - 8< -*/
|
||||
int value = 1;
|
||||
setsockopt(fd, SOL_IP, IP_TRANSPARENT, &value, sizeof(value));
|
||||
/* - 8< -*/
|
||||
name.sin_family = AF_INET;
|
||||
name.sin_port = htons(0xCAFE);
|
||||
name.sin_addr.s_addr = htonl(0xDEADBEEF);
|
||||
bind(fd, &name, sizeof(name));
|
||||
|
||||
A trivial patch for netcat is available here:
|
||||
http://people.netfilter.org/hidden/tproxy/netcat-ip_transparent-support.patch
|
||||
|
@ -61,10 +64,10 @@ be able to find out the original destination address. Even in case of TCP
|
|||
getting the original destination address is racy.)
|
||||
|
||||
The 'TPROXY' target provides similar functionality without relying on NAT. Simply
|
||||
add rules like this to the iptables ruleset above:
|
||||
add rules like this to the iptables ruleset above::
|
||||
|
||||
# iptables -t mangle -A PREROUTING -p tcp --dport 80 -j TPROXY \
|
||||
--tproxy-mark 0x1/0x1 --on-port 50080
|
||||
# iptables -t mangle -A PREROUTING -p tcp --dport 80 -j TPROXY \
|
||||
--tproxy-mark 0x1/0x1 --on-port 50080
|
||||
|
||||
Or the following rule to nft:
|
||||
|
||||
|
@ -82,10 +85,12 @@ nf_tables implementation.
|
|||
====================================
|
||||
|
||||
To use tproxy you'll need to have the following modules compiled for iptables:
|
||||
|
||||
- NETFILTER_XT_MATCH_SOCKET
|
||||
- NETFILTER_XT_TARGET_TPROXY
|
||||
|
||||
Or the following modules for nf_tables:
|
||||
|
||||
- NFT_SOCKET
|
||||
- NFT_TPROXY
|
||||
|
14
MAINTAINERS
|
@ -193,7 +193,7 @@ W: https://wireless.wiki.kernel.org/
|
|||
T: git git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211.git
|
||||
T: git git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next.git
|
||||
F: Documentation/driver-api/80211/cfg80211.rst
|
||||
F: Documentation/networking/regulatory.txt
|
||||
F: Documentation/networking/regulatory.rst
|
||||
F: include/linux/ieee80211.h
|
||||
F: include/net/cfg80211.h
|
||||
F: include/net/ieee80211_radiotap.h
|
||||
|
@ -9515,7 +9515,7 @@ F: drivers/soc/lantiq
|
|||
LAPB module
|
||||
L: linux-x25@vger.kernel.org
|
||||
S: Orphan
|
||||
F: Documentation/networking/lapb-module.txt
|
||||
F: Documentation/networking/lapb-module.rst
|
||||
F: include/*/lapb.h
|
||||
F: net/lapb/
|
||||
|
||||
|
@ -10079,7 +10079,7 @@ S: Maintained
|
|||
W: https://wireless.wiki.kernel.org/
|
||||
T: git git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211.git
|
||||
T: git git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next.git
|
||||
F: Documentation/networking/mac80211-injection.txt
|
||||
F: Documentation/networking/mac80211-injection.rst
|
||||
F: Documentation/networking/mac80211_hwsim/mac80211_hwsim.rst
|
||||
F: drivers/net/wireless/mac80211_hwsim.[ch]
|
||||
F: include/net/mac80211.h
|
||||
|
@ -13262,7 +13262,7 @@ F: drivers/input/joystick/pxrc.c
|
|||
PHONET PROTOCOL
|
||||
M: Remi Denis-Courmont <courmisch@gmail.com>
|
||||
S: Supported
|
||||
F: Documentation/networking/phonet.txt
|
||||
F: Documentation/networking/phonet.rst
|
||||
F: include/linux/phonet.h
|
||||
F: include/net/phonet/
|
||||
F: include/uapi/linux/phonet.h
|
||||
|
@ -14219,7 +14219,7 @@ L: linux-rdma@vger.kernel.org
|
|||
L: rds-devel@oss.oracle.com (moderated for non-subscribers)
|
||||
S: Supported
|
||||
W: https://oss.oracle.com/projects/rds/
|
||||
F: Documentation/networking/rds.txt
|
||||
F: Documentation/networking/rds.rst
|
||||
F: net/rds/
|
||||
|
||||
RDT - RESOURCE ALLOCATION
|
||||
|
@ -14593,7 +14593,7 @@ M: David Howells <dhowells@redhat.com>
|
|||
L: linux-afs@lists.infradead.org
|
||||
S: Supported
|
||||
W: https://www.infradead.org/~dhowells/kafs/
|
||||
F: Documentation/networking/rxrpc.txt
|
||||
F: Documentation/networking/rxrpc.rst
|
||||
F: include/keys/rxrpc-type.h
|
||||
F: include/net/af_rxrpc.h
|
||||
F: include/trace/events/rxrpc.h
|
||||
|
@ -14999,7 +14999,7 @@ M: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
|
|||
L: linux-sctp@vger.kernel.org
|
||||
S: Maintained
|
||||
W: http://lksctp.sourceforge.net
|
||||
F: Documentation/networking/sctp.txt
|
||||
F: Documentation/networking/sctp.rst
|
||||
F: include/linux/sctp.h
|
||||
F: include/net/sctp/
|
||||
F: include/uapi/linux/sctp.h
|
||||
|
|
|
@ -302,7 +302,7 @@ config NETCONSOLE
|
|||
tristate "Network console logging support"
|
||||
---help---
|
||||
If you want to log kernel messages over the network, enable this.
|
||||
See <file:Documentation/networking/netconsole.txt> for details.
|
||||
See <file:Documentation/networking/netconsole.rst> for details.
|
||||
|
||||
config NETCONSOLE_DYNAMIC
|
||||
bool "Dynamic reconfiguration of logging targets"
|
||||
|
@ -312,7 +312,7 @@ config NETCONSOLE_DYNAMIC
|
|||
This option enables the ability to dynamically reconfigure target
|
||||
parameters (interface, IP addresses, port numbers, MAC addresses)
|
||||
at runtime through a userspace interface exported using configfs.
|
||||
See <file:Documentation/networking/netconsole.txt> for details.
|
||||
See <file:Documentation/networking/netconsole.rst> for details.
|
||||
|
||||
config NETPOLL
|
||||
def_bool NETCONSOLE
|
||||
|
|
|
@ -48,7 +48,7 @@ config LTPC
|
|||
If you are in doubt, this card is the one with the 65C02 chip on it.
|
||||
You also need version 1.3.3 or later of the netatalk package.
|
||||
This driver is experimental, which means that it may not work.
|
||||
See the file <file:Documentation/networking/ltpc.txt>.
|
||||
See the file <file:Documentation/networking/ltpc.rst>.
|
||||
|
||||
config COPS
|
||||
tristate "COPS LocalTalk PC support"
|
||||
|
|
|
@ -1150,7 +1150,7 @@ static irqreturn_t gelic_card_interrupt(int irq, void *ptr)
|
|||
* gelic_net_poll_controller - artificial interrupt for netconsole etc.
|
||||
* @netdev: interface device structure
|
||||
*
|
||||
* see Documentation/networking/netconsole.txt
|
||||
* see Documentation/networking/netconsole.rst
|
||||
*/
|
||||
void gelic_net_poll_controller(struct net_device *netdev)
|
||||
{
|
||||
|
|
|
@ -1615,7 +1615,7 @@ spider_net_interrupt(int irq, void *ptr)
|
|||
* spider_net_poll_controller - artificial interrupt for netconsole etc.
|
||||
* @netdev: interface device structure
|
||||
*
|
||||
* see Documentation/networking/netconsole.txt
|
||||
* see Documentation/networking/netconsole.rst
|
||||
*/
|
||||
static void
|
||||
spider_net_poll_controller(struct net_device *netdev)
|
||||
|
|
|
@ -77,7 +77,7 @@ config SKFP
|
|||
- Netelligent 100 FDDI SAS UTP
|
||||
- Netelligent 100 FDDI SAS Fibre MIC
|
||||
|
||||
Read <file:Documentation/networking/skfp.txt> for information about
|
||||
Read <file:Documentation/networking/skfp.rst> for information about
|
||||
the driver.
|
||||
|
||||
Questions concerning this driver can be addressed to:
|
||||
|
|
|
@ -21,7 +21,7 @@ config PLIP
|
|||
bits at a time (mode 0) or with special PLIP cables, to be used on
|
||||
bidirectional parallel ports only, which can transmit 8 bits at a
|
||||
time (mode 1); you can find the wiring of these cables in
|
||||
<file:Documentation/networking/PLIP.txt>. The cables can be up to
|
||||
<file:Documentation/networking/plip.rst>. The cables can be up to
|
||||
15m long. Mode 0 works also if one of the machines runs DOS/Windows
|
||||
and has some PLIP software installed, e.g. the Crynwr PLIP packet
|
||||
driver (<http://oak.oakland.edu/simtel.net/msdos/pktdrvr-pre.html>)
|
||||
|
|
|
@ -57,7 +57,7 @@ config PCMCIA_RAYCS
|
|||
---help---
|
||||
Say Y here if you intend to attach an Aviator/Raytheon PCMCIA
|
||||
(PC-card) wireless Ethernet networking card to your computer.
|
||||
Please read the file <file:Documentation/networking/ray_cs.txt> for
|
||||
Please read the file <file:Documentation/networking/ray_cs.rst> for
|
||||
details.
|
||||
|
||||
To compile this driver as a module, choose M here: the module will be
|
||||
|
|
|
@ -79,7 +79,7 @@ The DPSW can have ports connected to DPNIs or to PHYs via DPMACs.
|
|||
|
||||
For a more detailed description of the Ethernet switch device driver model
|
||||
see:
|
||||
Documentation/networking/switchdev.txt
|
||||
Documentation/networking/switchdev.rst
|
||||
|
||||
Creating an Ethernet Switch
|
||||
===========================
|
||||
|
|
|
@ -89,7 +89,7 @@ enum {
|
|||
* Add your fresh new feature above and remember to update
|
||||
* netdev_features_strings[] in net/core/ethtool.c and maybe
|
||||
* some feature mask #defines below. Please also describe it
|
||||
* in Documentation/networking/netdev-features.txt.
|
||||
* in Documentation/networking/netdev-features.rst.
|
||||
*/
|
||||
|
||||
/**/NETDEV_FEATURE_COUNT
|
||||
|
|
|
@ -5211,7 +5211,7 @@ u32 ieee80211_mandatory_rates(struct ieee80211_supported_band *sband,
|
|||
* Radiotap parsing functions -- for controlled injection support
|
||||
*
|
||||
* Implemented in net/wireless/radiotap.c
|
||||
* Documentation in Documentation/networking/radiotap-headers.txt
|
||||
* Documentation in Documentation/networking/radiotap-headers.rst
|
||||
*/
|
||||
|
||||
struct radiotap_align_size {
|
||||
|
|
|
@ -36,7 +36,7 @@ struct sock_extended_err {
|
|||
*
|
||||
* The timestamping interfaces SO_TIMESTAMPING, MSG_TSTAMP_*
|
||||
* communicate network timestamps by passing this struct in a cmsg with
|
||||
* recvmsg(). See Documentation/networking/timestamping.txt for details.
|
||||
* recvmsg(). See Documentation/networking/timestamping.rst for details.
|
||||
* User space sees a timespec definition that matches either
|
||||
* __kernel_timespec or __kernel_old_timespec, in the kernel we
|
||||
* require two structure definitions to provide both.
|
||||
|
|
|
@ -344,7 +344,7 @@ config NET_PKTGEN
|
|||
what was just said, you don't need it: say N.
|
||||
|
||||
Documentation on how to use the packet generator can be found
|
||||
at <file:Documentation/networking/pktgen.txt>.
|
||||
at <file:Documentation/networking/pktgen.rst>.
|
||||
|
||||
To compile this code as a module, choose M here: the
|
||||
module will be called pktgen.
|
||||
|
|
|
@ -56,7 +56,7 @@
|
|||
* Integrated to 2.5.x 021029 --Lucio Maciel (luciomaciel@zipmail.com.br)
|
||||
*
|
||||
* 021124 Finished major redesign and rewrite for new functionality.
|
||||
* See Documentation/networking/pktgen.txt for how to use this.
|
||||
* See Documentation/networking/pktgen.rst for how to use this.
|
||||
*
|
||||
* The new operation:
|
||||
* For each CPU one thread/process is created at start. This process checks
|
||||
|
|
|
@ -15,7 +15,7 @@ config LAPB
|
|||
currently supports LAPB only over Ethernet connections. If you want
|
||||
to use LAPB connections over Ethernet, say Y here and to "LAPB over
|
||||
Ethernet driver" below. Read
|
||||
<file:Documentation/networking/lapb-module.txt> for technical
|
||||
<file:Documentation/networking/lapb-module.rst> for technical
|
||||
details.
|
||||
|
||||
To compile this driver as a module, choose M here: the
|
||||
|
|
|
@ -2144,7 +2144,7 @@ static bool ieee80211_parse_tx_radiotap(struct ieee80211_local *local,
|
|||
|
||||
/*
|
||||
* Please update the file
|
||||
* Documentation/networking/mac80211-injection.txt
|
||||
* Documentation/networking/mac80211-injection.rst
|
||||
* when parsing new fields here.
|
||||
*/
|
||||
|
||||
|
|
|
@ -1043,7 +1043,7 @@ config NETFILTER_XT_TARGET_TPROXY
|
|||
on Netfilter connection tracking and NAT, unlike REDIRECT.
|
||||
For it to work you will have to configure certain iptables rules
|
||||
and use policy routing. For more information on how to set it up
|
||||
see Documentation/networking/tproxy.txt.
|
||||
see Documentation/networking/tproxy.rst.
|
||||
|
||||
To compile it as a module, choose M here. If unsure, say N.
|
||||
|
||||
|
|
|
@ -18,7 +18,7 @@ config AF_RXRPC
|
|||
This module at the moment only supports client operations and is
|
||||
currently incomplete.
|
||||
|
||||
See Documentation/networking/rxrpc.txt.
|
||||
See Documentation/networking/rxrpc.rst.
|
||||
|
||||
config AF_RXRPC_IPV6
|
||||
bool "IPv6 support for RxRPC"
|
||||
|
@ -41,7 +41,7 @@ config AF_RXRPC_DEBUG
|
|||
help
|
||||
Say Y here to make runtime controllable debugging messages appear.
|
||||
|
||||
See Documentation/networking/rxrpc.txt.
|
||||
See Documentation/networking/rxrpc.rst.
|
||||
|
||||
|
||||
config RXKAD
|
||||
|
@ -56,4 +56,4 @@ config RXKAD
|
|||
Provide kerberos 4 and AFS kaserver security handling for AF_RXRPC
|
||||
through the use of the key retention service.
|
||||
|
||||
See Documentation/networking/rxrpc.txt.
|
||||
See Documentation/networking/rxrpc.rst.
|
||||
|
|
|
@ -21,7 +21,7 @@ static const unsigned long max_jiffies = MAX_JIFFY_OFFSET;
|
|||
/*
|
||||
* RxRPC operating parameters.
|
||||
*
|
||||
* See Documentation/networking/rxrpc.txt and the variable definitions for more
|
||||
* See Documentation/networking/rxrpc.rst and the variable definitions for more
|
||||
* information on the individual parameters.
|
||||
*/
|
||||
static struct ctl_table rxrpc_sysctl_table[] = {
|
||||
|
|
|
@ -90,7 +90,7 @@ static const struct ieee80211_radiotap_namespace radiotap_ns = {
|
|||
* iterator.this_arg for type "type" safely on all arches.
|
||||
*
|
||||
* Example code:
|
||||
* See Documentation/networking/radiotap-headers.txt
|
||||
* See Documentation/networking/radiotap-headers.rst
|
||||
*/
|
||||
|
||||
int ieee80211_radiotap_iterator_init(
|
||||
|
|
|
@ -3,7 +3,7 @@ Sample and benchmark scripts for pktgen (packet generator)
|
|||
This directory contains some pktgen sample and benchmark scripts, that
|
||||
can easily be copied and adjusted for your own use-case.
|
||||
|
||||
General doc is located in kernel: Documentation/networking/pktgen.txt
|
||||
General doc is located in kernel: Documentation/networking/pktgen.rst
|
||||
|
||||
Helper include files
|
||||
====================
|
||||
|
|