linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Alexei Starovoitov	39f19ebbf5	bpf: rename ARG_PTR_TO_STACK since ARG_PTR_TO_STACK is no longer just pointer to stack rename it to ARG_PTR_TO_MEM and adjust comment. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:56:27 -05:00
Gianluca Borello	06c1c04972	bpf: allow helpers access to variable memory Currently, helpers that read and write from/to the stack can do so using a pair of arguments of type ARG_PTR_TO_STACK and ARG_CONST_STACK_SIZE. ARG_CONST_STACK_SIZE accepts a constant register of type CONST_IMM, so that the verifier can safely check the memory access. However, requiring the argument to be a constant can be limiting in some circumstances. Since the current logic keeps track of the minimum and maximum value of a register throughout the simulated execution, ARG_CONST_STACK_SIZE can be changed to also accept an UNKNOWN_VALUE register in case its boundaries have been set and the range doesn't cause invalid memory accesses. One common situation when this is useful: int len; char buf[BUFSIZE]; /* BUFSIZE is 128 */ if (some_condition) len = 42; else len = 84; some_helper(..., buf, len & (BUFSIZE - 1)); The compiler can often decide to assign the constant values 42 or 84 into a variable on the stack, instead of keeping it in a register. When the variable is then read back from stack into the register in order to be passed to the helper, the verifier will not be able to recognize the register as constant (the verifier is not currently tracking all constant writes into memory), and the program won't be valid. However, by allowing the helper to accept an UNKNOWN_VALUE register, this program will work because the bitwise AND operation will set the range of possible values for the UNKNOWN_VALUE register to [0, BUFSIZE), so the verifier can guarantee the helper call will be safe (assuming the argument is of type ARG_CONST_STACK_SIZE_OR_ZERO, otherwise one more check against 0 would be needed). Custom ranges can be set not only with ALU operations, but also by explicitly comparing the UNKNOWN_VALUE register with constants. Another very common example happens when intercepting system call arguments and accessing user-provided data of variable size using bpf_probe_read(). 
One can load at runtime the user-provided length in an UNKNOWN_VALUE register, and then read that exact amount of data up to a compile-time determined limit in order to fit into the proper local storage allocated on the stack, without having to guess a suboptimal access size at compile time. Also, in case the helpers accepting the UNKNOWN_VALUE register operate in raw mode, disable the raw mode so that the program is required to initialize all memory, since there is no guarantee the helper will fill it completely, leaving possibilities for data leak (just relevant when the memory used by the helper is the stack, not when using a pointer to map element value or packet). In other words, ARG_PTR_TO_RAW_STACK will be treated as ARG_PTR_TO_STACK. Signed-off-by: Gianluca Borello <g.borello@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:56:27 -05:00
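The range argument in 06c1c04972 can be illustrated outside the kernel. The sketch below is a plain userspace analogy, not verifier or BPF code: `copy_masked()` and `copy_clamped()` are hypothetical stand-ins for a helper that takes a buffer and a runtime length, and `BUFSIZE` is 128 as in the commit message. It shows the two ways the message mentions for bounding an unknown length: a bitwise AND, and an explicit comparison with a constant.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUFSIZE 128 /* power of two, as in the commit message */

/* Hypothetical stand-in for a helper taking (buffer, length).
 * Masking with BUFSIZE - 1 bounds n to [0, BUFSIZE), whatever len was. */
static size_t copy_masked(char *dst, const char *src, size_t len)
{
	size_t n = len & (BUFSIZE - 1);

	assert(n < BUFSIZE);
	memcpy(dst, src, n);
	return n;
}

/* Same idea with an explicit comparison: n ends up in [0, BUFSIZE]. */
static size_t copy_clamped(char *dst, const char *src, size_t len)
{
	size_t n = len > BUFSIZE ? BUFSIZE : len;

	memcpy(dst, src, n);
	return n;
}
```

Note that masking bounds the value rather than clamping it: a length of 200 becomes 72, not 128. Both source and destination are assumed to hold at least `BUFSIZE` bytes.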
Gianluca Borello	f0318d01b6	bpf: allow adjusted map element values to spill commit `484611357c` ("bpf: allow access into map value arrays") introduces the ability to do pointer math inside a map element value via the PTR_TO_MAP_VALUE_ADJ register type. The current support doesn't handle the case where a PTR_TO_MAP_VALUE_ADJ is spilled into the stack, limiting several use cases, especially when generating bpf code from a compiler. Handle this case by explicitly enabling the register type PTR_TO_MAP_VALUE_ADJ to be spilled. Also, make sure that min_value and max_value are reset just for BPF_LDX operations that don't result in a restore of a spilled register from stack. Signed-off-by: Gianluca Borello <g.borello@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:56:27 -05:00
Gianluca Borello	5722569bb9	bpf: allow helpers access to map element values Enable helpers to directly access a map element value by passing a register type PTR_TO_MAP_VALUE (or PTR_TO_MAP_VALUE_ADJ) to helper arguments ARG_PTR_TO_STACK or ARG_PTR_TO_RAW_STACK. This enables several use cases. For example, a typical tracing program might want to capture pathnames passed to sys_open() with: struct trace_data { char pathname[PATHLEN]; }; SEC("kprobe/sys_open") void bpf_sys_open(struct pt_regs *ctx) { struct trace_data data; bpf_probe_read(data.pathname, sizeof(data.pathname), ctx->di); /* consume data.pathname, for example via bpf_trace_printk() or bpf_perf_event_output() */ } Such a program could easily hit the stack limit in case PATHLEN needs to be large or more local variables need to exist, both of which are quite common scenarios. Allowing direct helper access to map element values, one could do: struct bpf_map_def SEC("maps") scratch_map = { .type = BPF_MAP_TYPE_PERCPU_ARRAY, .key_size = sizeof(u32), .value_size = sizeof(struct trace_data), .max_entries = 1, }; SEC("kprobe/sys_open") int bpf_sys_open(struct pt_regs *ctx) { int id = 0; struct trace_data *p = bpf_map_lookup_elem(&scratch_map, &id); if (!p) return 0; bpf_probe_read(p->pathname, sizeof(p->pathname), ctx->di); /* consume p->pathname, for example via bpf_trace_printk() or bpf_perf_event_output() */ return 0; } And wouldn't risk exhausting the stack. Code changes are loosely modeled after commit `6841de8b0d` ("bpf: allow helpers access the packet directly"). Unlike with PTR_TO_PACKET, these changes just work with ARG_PTR_TO_STACK and ARG_PTR_TO_RAW_STACK (not ARG_PTR_TO_MAP_KEY, ARG_PTR_TO_MAP_VALUE, ...): adding those would be trivial, but since there is not currently a use case for that, it's reasonable to limit the set of changes. Also, add new tests to make sure accesses to map element values from helpers never go out of boundary, even when adjusted. 
Signed-off-by: Gianluca Borello <g.borello@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:56:26 -05:00
Gianluca Borello	dbcfe5f76d	bpf: split check_mem_access logic for map values Move the logic to check memory accesses to a PTR_TO_MAP_VALUE_ADJ from check_mem_access() to a separate helper check_map_access_adj(). This enables to use those checks in other parts of the verifier as well, where boundaries on PTR_TO_MAP_VALUE_ADJ might need to be checked, for example when checking helper function arguments. The same thing is already happening for other types such as PTR_TO_PACKET and its check_packet_access() helper. The code has been copied verbatim, with the only difference of removing the "off += reg->max_value" statement and moving the sum into the call statement to check_map_access(), as that was only needed due to the earlier common check_map_access() call. Signed-off-by: Gianluca Borello <g.borello@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:56:26 -05:00
Christoph Hellwig	84a4620cfe	xfs: don't print warnings when xfs_log_force fails There are only two reasons for xfs_log_force / xfs_log_force_lsn to fail: one is an I/O error, for which xlog_bdstrat already logs a warning, and the second is an already shutdown log due to a previous I/O error. In the latter case we'll already have a previous indication for the actual error, but the large stream of misleading warnings from xfs_log_force will probably scroll it out of the message buffer. Simply removing the warnings thus makes the XFS log reporting significantly better. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2017-01-09 13:45:01 -08:00
Christoph Hellwig	12ef830198	xfs: don't rely on ->total in xfs_alloc_space_available ->total is a bit of an odd parameter passed down to the low-level allocator all the way from the high-level callers. It's supposed to contain the maximum number of blocks to be allocated for the whole transaction [1]. But in xfs_iomap_write_allocate we only convert existing delayed allocations and thus only have a minimal block reservation for the current transaction, so xfs_alloc_space_available can't use it for the allocation decisions. Use the maximum of args->total and the calculated block requirement to make a decision. We probably should get rid of args->total eventually and instead apply ->minleft more broadly, but that will require some extensive changes all over. [1] which creates lots of confusion as most callers don't decrement it once doing a first allocation. But that's for a separate series. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2017-01-09 13:45:01 -08:00
Christoph Hellwig	54fee133ad	xfs: adjust allocation length in xfs_alloc_space_available We must decide in xfs_alloc_fix_freelist whether an allocation from a given AG is possible based on the available space, and should not fail the allocation past that point on a healthy file system. But currently we have two additional places that second-guess xfs_alloc_fix_freelist: xfs_alloc_ag_vextent tries to adjust the maxlen parameter to remove the reservation before doing the allocation (but ignores the various minimum freespace requirements), and xfs_alloc_fix_minleft tries to fix up the allocated length after we've found an extent, but ignores the reservations and also doesn't take the AGFL into account (and thus fails allocations for not matching minlen in some cases). Remove all these later fixups and just correct the maxlen argument inside xfs_alloc_fix_freelist once we have the AGF buffer locked. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2017-01-09 13:37:44 -08:00
Christoph Hellwig	255c516278	xfs: fix bogus minleft manipulations We can't just set minleft to 0 when we're low on space - that's exactly what we need minleft for: to protect space in the AG for btree block allocations when we are low on free space. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2017-01-09 13:36:36 -08:00
Christoph Hellwig	5149fd327f	xfs: bump up reserved blocks in xfs_alloc_set_aside Setting aside 4 blocks globally for bmbt splits isn't all that useful, as different threads can allocate space in parallel. Bump it to 4 blocks per AG to allow each thread that is currently doing an allocation to dip into it separately. Without that we may not have enough reserved blocks if there are enough parallel transactions in an almost out of space file system that all run into bmap btree splits. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2017-01-09 13:35:00 -08:00
Eric Dumazet	6bb629db5e	tcp: do not export tcp_peer_is_proven() After commit `1fb6f159fd` ("tcp: add tcp_conn_request"), tcp_peer_is_proven() no longer needs to be exported. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:34:39 -05:00
Jean Delvare	2ebae8bd60	net: phy: Add Meson GXL PHY hardware dependency As I understand it the Meson GXL PHY driver is only useful on one architecture so only make it visible on that architecture. Signed-off-by: Jean Delvare <jdelvare@suse.de> Fixes: `7334b3e47a` ("net: phy: Add Meson GXL Internal PHY driver") Cc: Neil Armstrong <narmstrong@baylibre.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Andrew Lunn <andrew@lunn.ch> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:34:39 -05:00
Vlad Tsyrklevich	ce7e40c432	net/appletalk: Fix kernel memory disclosure ipddp_route structs contain alignment padding so kernel heap memory is leaked when they are copied to user space in ipddp_ioctl(SIOCFINDIPDDPRT). Change kmalloc() to kzalloc() to clear that memory. Signed-off-by: Vlad Tsyrklevich <vlad@tsyrklevich.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:34:39 -05:00
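The padding problem fixed in ce7e40c432 can be sketched in userspace. This is an illustration only: `struct route_like` is a hypothetical layout, not the real `struct ipddp_route`, and `calloc()` stands in for `kzalloc()`. On common ABIs a one-byte field followed by a wider field leaves padding bytes that belong to no member, so they keep whatever the allocator returned unless the whole object is zeroed first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical layout, not the real struct ipddp_route: a small field
 * followed by a wider one leaves padding bytes on common ABIs. */
struct route_like {
	unsigned char flags;
	unsigned long addr;
};

/* Bytes of the struct that belong to no field. */
static size_t padding_bytes(void)
{
	return sizeof(struct route_like) -
	       (sizeof(unsigned char) + sizeof(unsigned long));
}

/* calloc() plays the role of kzalloc(): the whole object, padding
 * included, is zeroed before the fields are filled in. */
static struct route_like *route_alloc_zeroed(unsigned char flags,
					     unsigned long addr)
{
	struct route_like *r = calloc(1, sizeof(*r));

	if (!r)
		return NULL;
	r->flags = flags;
	r->addr = addr;
	return r;
}

/* Return 1 if every byte between flags and addr is zero. */
static int gap_is_zero(const struct route_like *r)
{
	const unsigned char *p = (const unsigned char *)r;
	size_t i;

	assert(r != NULL);
	for (i = sizeof(r->flags); i < offsetof(struct route_like, addr); i++)
		if (p[i] != 0)
			return 0;
	return 1;
}
```

With a non-zeroing allocator the same check gives no guarantee, which is the disclosure the commit closes when the struct is copied to user space.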
Pavel Tikhomirov	b007f09072	ipv4: make tcp_notsent_lowat sysctl knob behave as true unsigned int > cat /proc/sys/net/ipv4/tcp_notsent_lowat -1 > echo 4294967295 > /proc/sys/net/ipv4/tcp_notsent_lowat -bash: echo: write error: Invalid argument > echo -2147483648 > /proc/sys/net/ipv4/tcp_notsent_lowat > cat /proc/sys/net/ipv4/tcp_notsent_lowat -2147483648 but in documentation we have "tcp_notsent_lowat - UNSIGNED INTEGER" v2: simplify to just proc_douintvec Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:34:38 -05:00
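The shell transcript in b007f09072 comes down to a range question: 4294967295 does not fit a signed int, while -2147483648 does. The sketch below reproduces that distinction in userspace with two hypothetical checkers, `fits_int()` and `fits_uint()`; it does not reproduce the kernel's sysctl handlers, only the value ranges involved.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a decimal string into a long long; return 0 on any parse error. */
static int parse_ll(const char *s, long long *out)
{
	char *end;

	assert(s != NULL && out != NULL);
	errno = 0;
	*out = strtoll(s, &end, 10);
	return errno == 0 && end != s && *end == '\0';
}

/* Would the value be accepted by a signed int knob? */
static int fits_int(const char *s)
{
	long long v;

	return parse_ll(s, &v) && v >= INT_MIN && v <= INT_MAX;
}

/* Would the value be accepted by an unsigned int knob? */
static int fits_uint(const char *s)
{
	long long v;

	return parse_ll(s, &v) && v >= 0 && v <= (long long)UINT_MAX;
}
```

Under the signed interpretation the documented maximum is rejected and a negative value is accepted, which is the mismatch with "UNSIGNED INTEGER" that the commit points out.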
Alexander Alemayhu	67c408cfa8	ipv6: fix typos o s/approriate/appropriate o s/discouvery/discovery Signed-off-by: Alexander Alemayhu <alexander@alemayhu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:34:15 -05:00
David S. Miller	f3a3e248f3	Merge branch 'net-smc' Ursula Braun says: ==================== net/smc: Shared Memory Communications - RDMA here is now V4 of the SMC-R patches having processed your feedback from end of November. The most important change is the replacement of sysfs by a generic netlink solution in patch 04. And I tried to get rid of the __packed attributes. There are still a few usages left due to SMC-R protocol defined structures. V4 changes: The order of patches 03 and 04 for pnet table management and SMC IB-client establishing has been exchanged, since pnet table management is now built on top of smc_ib_devices. Patch 01: Use EXPORT_SYMBOL_GPL(). Patch 02: Define "use_fallback" as bool. Get rid of useless smc_sock fields clearing in smc_sock_alloc(), since sk_alloc() clears out the memory. Patch 03: Postpone smc_ib_remember_port_attr() call till ib_device is mentioned in the pnet table. Patch 04: Replace sysfs-usage by a generic netlink approach for pnet table configuration. Change layout of pnet table entries to reference net_device and ib_device instead of dealing with names of net_devices and ib_devices. Patch 05: Adapt "use_fallback" usages to new type bool. Get rid of useless smc_sock fields clearing in smc_sock_alloc() Avoid __packed where possible. Check if clc responses are not too big. Patch 09: Postpone smc_setup_per_ibdev till the first connection with this ib_device is really created. Patch 11: Get rid of __packed usage. V3 changes: Patch 05: Remove unneeded DEFINE_WAIT Patch 06: Improve synchronization of link group creation Patch 07: Rename peer_rmbe_len into peer_rmbe_size to be more consistent Patch 09: Avoid calls of ib_get_memory_region with IB_ACCESS_LOCAL_WRITE, use new default local_dma_lkey from protection domain as lkey instead. Remove no longer needed function smc_ib_dereg_memory_region(). Patch 14: Switch to state ACTIVE only if still in state INIT. Return 0 for recvmsg invoked in a socket closing state. 
Allow getname call in state APPCLOSEWAIT1. Do not trigger destruction of a socket-in-error queued in accept queue. During cleanup of accept queue, make sure sockets are destructed, and sockets in fallback mode are handled appropriately. When freeing sndbufs/rmbs, remove them from their list and free the entry. Use add_wait_queue() and remove_wait_queue() in close wait functions. If actively closing a socket in state PEERFINCLOSEWAIT, keep this state. If passively closing a socket while bytes are to be received, move to state APPCLOSEWAIT1. If actively aborting a socket, skip sending the close_abort flag, since RDMA communication is no longer possible. When terminating a link group, do not schedule link group freeing a 2nd time, since already done when unregistering the last remaining connection. Patch 15: Introduce smc_diag module for monitoring SMC protocol sockets. This replaces the old patch 0015 dealing with procfs. V2 changes: Patch 0002: Add SMC versions for family key strings in net/core/sock.c. Patch 0006: initialize rb_tree. Patch 0007: Get rid of unneeded use of xchg() in smc_sndbuf_unuse() and smc_rmb_unuse(). Patch 0008: Correct error checking logic for ib_function calls. Define struct smc_link field wr_tx_id as atomic_long_t. Use "do_div" instead of "%" to be architecture-independent. Patch 0009: Correct error checking logic for ib_function calls. Patch 0011: Remove xchg() calls in cursor handling. Use atomic64_t for cursor overlays on 64-bit architectures. If not available, use plain u64 and add locking for cursor reading and writing. Implement smc_curs_add() without modulo operator "%". Patch 0012: Remove xchg() calls in cursor handling. Implement smc_tx_rdma_writes() without modulo operator "%". Patch 0013: Remove xchg() calls in cursor handling. Patch 0014: Return type bool in smc_wr_tx_has_pending(). Remove unneeded semicolon in smc_close_shutdown_write(). Call smc_close_active() in non-fallback case only. 
Get rid of duplicate schedule of sock_put_work(). Take nested sock_lock in smc_listen_work(). Start close stream_wait in case of prepared sends only. Patch 0015: Remove unneeded socket ref_count in smc_proc_seq_show(). Take lock before list_empty check in smc_proc_sock_list_del(). These patches are the initial part of the implementation of the "Shared Memory Communications-RDMA" (SMC-R) protocol as defined in RFC7609 [1]. While SMC-R does not aim to replace TCP, it taps a wealth of existing data center TCP socket applications to become more efficient without the need for rewriting them. SMC-R uses RDMA over Converged Ethernet (RoCE) to save CPU consumption. For instance, when running 10 parallel connections with uperf, we measured a decrease of 60% in CPU consumption with SMC-R compared to TCP/IP (with throughput and latency comparable; measured on x86_64 with the same RoCE card and port). SMC-R does not require an RDMA communication manager (RDMA CM). SMC-R inherits TCP qualities such as reliable connections, host-based firewall packet filtering (on connection establishment) and unmodified application of communication encryption such as TLS (transport layer security) or SSL (secure sockets layer). Since original TCP is used to establish SMC-R connections, load balancers and packet inspection based on TCP/IP connection establishment continue to work for SMC-R. 
On the other hand, using SMC-R implies: - either involving a preload library when invoking the unchanged TCP-application or slightly modifying the source by simply changing the socket family in the socket() call - accepting extra overhead and latency in connection establishment due to SMC Connection Layer Control (CLC) handshake - explicit coupling of RoCE ports with Ethernet ports - not routable as currently built on RoCE V1 - bypassing of packet-based networking features - filtering (netfilter) - sniffing (libpcap, packet sockets, (E)BPF) - traffic control (scheduling, shaping) - bypassing of IP-header based socket options - bypassing of memory buffer (pressure) management - unusable together with IPsec Overview of the SMC-R Protocol described in informational RFC 7609 SMC-R is an open protocol that provides RDMA capabilities over RoCE transparently for applications exploiting TCP sockets. A new socket protocol family PF_SMC is introduced. There are no changes required to applications using the sockets API for TCP stream sockets other than the specification of the new socket family AF_SMC. Unmodified applications can be used by means of a dynamic preload shared library which rewrites the socket API call socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) into socket(AF_SMC, SOCK_STREAM, IPPROTO_TCP). SMC-R re-uses the address family AF_INET for all addressing purposes around struct sockaddr. 
SMC-R system architecture layers:

+=============================================================================+
|                                      | unmodified TCP application           |
| native SMC application               +--------------------------------------+
|                                      | dynamic preload shared library       |
+=============================================================================+
|                                 SMC socket                                  |
+-----------------------------------------------------------------------------+
|                    | TCP socket (for connection establishment and fallback) |
| IB verbs           +--------------------------------------------------------+
|                    | IP                                                     |
+--------------------+--------------------------------------------------------+
| RoCE device driver | some network device driver                             |
+=============================================================================+

Terms: A link group is determined by an ordered peer pair of TCP client and TCP server (IP addresses and subnet). Reversed client server roles cause an own link group. A link is a logical point-to-point connection based on an infiniband reliable connected queue pair (RC-QP) between two RoCE ports (MACs and GIDs) of a peer pair. A link group can have 1..8 links for failover and load balancing. This initial Linux implementation always has 1 link per link group. Each link group on a peer can have 1..255 remote memory buffers (RMBs). If more RMBs are needed, a peer can open another link group (this initial Linux implementation) or fall back to TCP. Each RMB has its own particular size and its own (R)DMA mapping and credentials (rtoken consisting of rkey and RDMA "virtual address"). This initial Linux implementation uses physically contiguous memory for RMBs but we are working towards scattered memory because of memory fragmentation. Each RMB has 1..255 RMB elements (RMBEs) of equal size to provide multiplexing of connections within an RMB. 
An RMBE is the RDMA Write destination organized as wrapping ring buffer for data transmit of a particular connection in one direction (duplex by means of mirror symmetry as with TCP). This initial Linux implementation always has 1 RMBE per RMB and thus an individual RMB for each connection.

SMC-R connection establishment with subsequent data transfer:

CLIENT                                                    SERVER

TCP three-way handshake:
  regular TCP SYN
  -------------------------------------------------------->
  regular TCP SYN ACK
  <--------------------------------------------------------
  regular TCP ACK
  -------------------------------------------------------->

SMC Connection Layer Control (CLC) handshake exchanges RDMA credentials between peers:
  via above TCP connection: SMC CLC Proposal
  -------------------------------------------------------->
  via above TCP connection: SMC CLC Accept
  <--------------------------------------------------------
  via above TCP connection: SMC CLC Confirm
  -------------------------------------------------------->

SMC Link Layer Control (LLC) (only once per link, i.e. 1st conn. of link group):
  RoCE RC-QP: SMC LLC Confirm Link
  <========================================================
  RoCE RC-QP: SMC LLC Confirm Link response
  ========================================================>

SMC data transmission (incl. SMC Connection Data Control (CDC) message):
  RoCE RC-QP: RDMA Write
  ========================================================>
  RoCE RC-QP: SMC CDC message (flow control)
  ========================================================>
  ...
  RoCE RC-QP: RDMA Write
  <========================================================
  RoCE RC-QP: SMC CDC message (flow control)
  <========================================================
  ...

Data flow within an established connection: +---------------------------------------------------------------------------- \| SENDER \| sendmsg() \| \| \| \| produces into sndbuf [sender's process context] \| v \| +--------+ \| \| sndbuf \| [ring buffer] \| +--------+ \| \| \| \| consumes from sndbuf and produces into receiver's RMBE [any context] \| \| by sending RDMA Write followed by SMC CDC message over RoCE RC-QP \| \| +----\|----------------------------------------------------------------------- \| +----\|----------------------------------------------------------------------- \| v RECEIVER \| +------+ \| \| RMBE \| [ring buffer, can have size different from sender's sndbuf] \| \| \| [RMBE represents rcvbuf, no further de-coupling as on sender side] \| +------+ \| \| \| \| consumes from RMBE [receiver's process context] \| v \| recvmsg() +---------------------------------------------------------------------------- Flow control ("cursor" updates) by means of SMC CDC messages: SENDER RECEIVER sends updates via CDC-------------+ sends updates via CDC on consuming from sndbuf \| on consuming from RMBE and producing into RMBE \| by means of recvmsg() \| \| \| \| +-----------------------------------\|------------+ \| \| +--v-------------------------+ +--v-----------------------+ \| receiver's consumer cursor \| \| sender's producer cursor----+ +----------------\|-----------+ +--------------------------+ \| \| \| \| receiver's RMBE \| \| +--------------------------+ \| \| \| \| \| +--------------------------------+ \| \| \| \| \| \| \| v \| \| \| +------------\| \| \|-------------+////////////\| \| \|//RDMA data written by////\| \| \|////sender that is////////\| \| \|/available to be consumed/\| \| \|///////// +---------------\| \| \|----------+^ \| \| \| \| \| \| \| +-----------------+ \| \| +--------------------------+ Sending updates of the producer cursor is immediate for low latency; something like Nagle's algorithm (absence of TCP_NODELAY) is optional and 
currently not part of this initial Linux implementation. Sending updates of the consumer cursor is conditional to avoid the silly window syndrome. Normal connection termination: Normal connection termination starts transitioning from socket state ACTIVE via either "Active Close" or "Passive Close". shutdown rdwr +-----------------+ or close, +-------------->\| INIT / CLOSED \|<-------------+ send PeerCon\|nClosed +-----------------+ \| PeerConnClosed \| \| \| received \| connection \| established \| \| V \| +----------------+ +-----------------+ +----------------+ \|AppFinCloseWait \| \| ACTIVE \| \|PeerFinCloseWait\| +----------------+ +-----------------+ +----------------+ \| \| \| \| \| Active Close: \| \|Passive Close: \| \| close or \| \|PeerConnClosed or \| \| shutdown wr or\| \|PeerDoneWriting \| \| shutdown rdwr \| \|received \| \| V V \| PeerConnClo\|sed +--------------+ +-------------+ \| close or received +--<----\|PeerCloseWait1\| \|AppCloseWait1\|--->----+ shutdown rdwr, \| +--------------+ +-------------+ \| send \| PeerDoneWri\|ting \| shutdown wr, \| PeerConnClosed \| received \| send Pee\|rDoneWriting \| \| V V \| \| +--------------+ +-------------+ \| +--<----\|PeerCloseWait2\| \|AppCloseWait2\|--->----+ +--------------+ +-------------+ In state CLOSED, the socket can be destructed only, once the application has issued a close(). 
Abnormal connection termination: +-----------------+ +-------------->\| INIT / CLOSED \|<-------------+ \| +-----------------+ \| \| \| \| +-----------------------+ \| \| \| Any state \| \| PeerConnAbo\|rt \| (before setting \| \| send received \| \| PeerConnClosed \| \| PeerConnAbort \| \| indicator in \| \| \| \| peer's RMBE) \| \| \| +-----------------------+ \| \| \| \| \| \| Active Abort: \| \| Passive Abort: \| \| problem, \| \| PeerConnAbort \| \| send \| \| received, \| \| PeerConnAbort,\| \| ECONNRESET \| \| ECONNABORTED \| \| \| \| V V \| \| +--------------+ +--------------+ \| +-------\|PeerAbortWait \| \| ProcessAbort \|------+ +--------------+ +--------------+ Implementation notes beyond RFC 7609: A PNET table in sysfs provides the mapping between network device names and RoCE Infiniband device names for the transparent switch of data communication. A PNET table can contain an arbitrary number of PNETIDs. Each PNETID contains exactly one (Ethernet) network device name and one or more RoCE Infiniband device names. Each device name can only exist in at most one PNETID (no overlapping). This initial Linux implementation allows at most one RoCE Infiniband device name per PNETID. After a new TCP connection is established, the network device name used for egress traffic with the TCP connection's local source IP address is used as key to lookup the unique PNETID, and the RoCE Infiniband device of this PNETID is used to switch data communication from TCP to RDMA during SMC CLC handshake. Problem determination: A protocol dissector is available with upstream wireshark for formatting SMC-R related RoCE LAN traffic. 
[https://code.wireshark.org/review/gitweb?p=wireshark.git;a=blob;f=epan/dissectors/packet-smcr.c] We are working on enhancing the Linux implementation to cover: - Improve default socket closing asynchronicity - Address corner cases with many parallel connections - Tracing - Integrated load balancing and fail-over within a link group - Splice and sendpage support - IPv6 addressing support - Keepalive, Cork - Namespaces support - Urgent data - More socket options - Diagnostics - Statistics support - SNMP support References: [1] SMC-R Informational RFC: http://www.rfc-editor.org/info/rfc7609 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:07:41 -05:00
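The cover letter above says unmodified applications can use SMC-R through a preload library that rewrites `socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)` into `socket(AF_SMC, SOCK_STREAM, IPPROTO_TCP)`. A minimal sketch of just that decision follows. The function name `smc_rewrite_family()` is hypothetical, the actual interposition of `socket()` is left out, and the fallback value 43 for `AF_SMC` is an assumption used only when the libc headers do not define it.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef AF_SMC
#define AF_SMC 43 /* assumed fallback when the libc headers lack AF_SMC */
#endif

/* Hypothetical decision function for a preload shim: return the address
 * family that should be passed on to the real socket() call. Only the
 * exact triple named in the cover letter is rewritten; every other
 * combination is left untouched. */
static int smc_rewrite_family(int domain, int type, int protocol)
{
	if (domain == AF_INET && type == SOCK_STREAM &&
	    protocol == IPPROTO_TCP)
		return AF_SMC;
	return domain;
}
```

Because SMC-R re-uses `AF_INET` addressing around `struct sockaddr`, only the family given to `socket()` changes; the addresses the application passes to `bind()` or `connect()` stay as they are.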
Ursula Braun	f16a7dd5cf	smc: netlink interface for SMC sockets Support for SMC socket monitoring via netlink sockets of protocol NETLINK_SOCK_DIAG. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:07:41 -05:00
Ursula Braun	b38d732477	smc: socket closing and linkgroup cleanup smc_shutdown() and smc_release() handling delayed linkgroup cleanup for linkgroups without connections Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:07:40 -05:00
Ursula Braun	952310ccf2	smc: receive data from RMBE move RMBE data into user space buffer and update managing cursors Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:07:40 -05:00
Ursula Braun	e6727f3900	smc: send data (through RDMA) copy data to kernel send buffer, and trigger RDMA write Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:07:40 -05:00
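The send and receive commits above move data through wrapping ring buffers tracked by producer and consumer cursors, and the series changelog mentions cursor arithmetic done without the modulo operator. The sketch below shows one way to do that. It is not the SMC code: `struct ring_cursor`, `ring_curs_add()` and `ring_bytes_ready()` are hypothetical names, and it assumes the producer is never more than one full ring ahead of the consumer.

```c
#include <assert.h>

/* Hypothetical cursor: an offset into the ring plus a wrap counter. */
struct ring_cursor {
	unsigned int wrap;  /* how many times the cursor passed the end */
	unsigned int count; /* offset within the ring, 0 <= count < size */
};

/* Advance a cursor by value bytes without using "%". */
static void ring_curs_add(unsigned int size, struct ring_cursor *c,
			  unsigned int value)
{
	assert(value <= size && c->count < size);
	c->count += value;
	if (c->count >= size) {
		c->wrap++;
		c->count -= size;
	}
}

/* Bytes produced but not yet consumed. Assumes the producer is at most
 * one full ring ahead of the consumer. */
static unsigned int ring_bytes_ready(unsigned int size,
				     const struct ring_cursor *cons,
				     const struct ring_cursor *prod)
{
	if (prod->wrap != cons->wrap)
		return size - cons->count + prod->count;
	return prod->count - cons->count;
}
```

The wrap counter is what distinguishes an empty ring from a full one when both cursors hold the same offset.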
Ursula Braun	5f08318f61	smc: connection data control (CDC) send and receive CDC messages (via IB message send and CQE) Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:07:40 -05:00
Ursula Braun	9bf9abead2	smc: link layer control (LLC) send and receive LLC messages CONFIRM_LINK (via IB message send and CQE) Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:07:40 -05:00
Ursula Braun	bd4ad57718	smc: initialize IB transport incl. PD, MR, QP, CQ, event, WR Prepare the link for RDMA transport: Create a queue pair (QP) and move it into the state Ready-To-Receive (RTR). Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:07:39 -05:00
Ursula Braun	f38ba179c6	smc: work request (WR) base for use by LLC and CDC The base containers for RDMA transport are work requests and completion queue entries processed through Infiniband verbs: * allocate and initialize these areas * map these areas to DMA * implement the basic communication consisting of work request posting and receival of completion queue events Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:07:39 -05:00
Ursula Braun	cd6851f303	smc: remote memory buffers (RMBs) * allocate data RMB memory for sending and receiving * size depends on the maximum socket send and receive buffers * allocated RMBs are kept during life time of the owning link group * map the allocated RMBs to DMA Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:07:39 -05:00
Ursula Braun	0cfdd8f92c	smc: connection and link group creation * create smc_connection for SMC-sockets * determine suitable link group for a connection * create a new link group if necessary Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:07:39 -05:00
Ursula Braun	a046d57da1	smc: CLC handshake (incl. preparation steps) * CLC (Connection Layer Control) handshake Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:07:39 -05:00
Thomas Richter	6812baabf2	smc: establish pnet table management Connection creation with SMC-R starts through an internal TCP-connection. The Ethernet interface for this TCP-connection is not restricted to the Ethernet interface of a RoCE device. Any existing Ethernet interface belonging to the same physical net can be used, as long as there is a defined relation between the Ethernet interface and some RoCE devices. This relation is defined with the help of an identification string called "Physical Net ID" or short "pnet ID". Information about defined pnet IDs and their related Ethernet interfaces and RoCE devices is stored in the SMC-R pnet table. A pnet table entry consists of the identifying pnet ID and the associated network and IB device. This patch adds pnet table configuration support using the generic netlink message interface referring to network and IB device by their names. Commands exist to add, delete, and display pnet table entries, and to flush or display the entire pnet table. There are cross-checks to verify whether the ethernet interfaces or infiniband devices really exist in the system. If either device is not available, the pnet ID entry is not created. Loss of network devices and IB devices is also monitored; a pnet ID entry is removed when an associated network or IB device is removed. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:07:38 -05:00
Ursula Braun	a4cf0443c4	smc: introduce SMC as an IB-client * create a list of SMC IB-devices Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:07:38 -05:00
Ursula Braun	ac7138746e	smc: establish new socket family * enable smc module loading and unloading * register new socket family * basic smc socket creation and deletion * use backing TCP socket to run CLC (Connection Layer Control) handshake of SMC protocol * Setup for infiniband traffic is implemented in follow-on patches. For now fallback to TCP socket is always used. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: Utz Bacher <utz.bacher@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:07:38 -05:00
Ursula Braun	4b9d07a440	net: introduce keepalive function in struct proto Direct call of tcp_set_keepalive() function from protocol-agnostic sock_setsockopt() function in net/core/sock.c violates network layering. And newly introduced protocol (SMC-R) will need its own keepalive function. Therefore, add "keepalive" function pointer to "struct proto", and call it from sock_setsockopt() via this pointer. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: Utz Bacher <utz.bacher@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 16:07:37 -05:00
David S. Miller	c8584b3fdf	Merge branch 'sh_eth-wol' Niklas Söderlund says: ==================== sh_eth: add wake-on-lan support via magic packet This series adds support for Wake-on-Lan using Magic Packet for a few models of the sh_eth driver. Patch 1/6 fixes a naming error, patch 2/6 adds generic support to control and support WoL while patches 3/6 - 6/6 enable different models. Based on top of net-next master. Changes since v2. - Fix bookkeeping for "active_count" and "event_count" reported in /sys/kernel/debug/wakeup_sources. Thanks Geert for noticing this. - Add new patch 1/6 which corrects the name of ECMR_MPDE bit, suggested by Sergei. - s/sh7743/sh7734/ in patch 5/6. Thanks Geert for spotting this. - Spelling improvements suggested by Sergei and Geert. - Add Tested-by to 3/6 and 4/6. Changes since v1. - Split generic WoL functionality and device enablement to different patches. - Enable more devices than Gen2 after feedback from Geert and datasheets. - Do not set mdp->irq_enabled = false and remove specific MagicPacket interrupt clearing, instead let sh_eth_error() clear the interrupt as for other EMAC interrupts, thanks Sergei for the suggestion. - Use the original return logic in sh_eth_resume(). - Moved sh_eth_private variable *clk to top of data structure to avoid possible gaps due to alignment restrictions. - Make wol_enabled in sh_eth_private part of the already existing bitfield instead of a bool. - Do not initiate mdp->wol_enabled to 0, the struct is kzalloc'ed so it's already set to 0. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 15:55:18 -05:00
Niklas Söderlund	267e1d5c74	sh_eth: enable wake-on-lan for sh7763 This is based on public datasheet for sh7763 which shows it has the same behavior and registers for WoL as other versions of sh_eth. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 15:55:08 -05:00
Niklas Söderlund	159c2a9044	sh_eth: enable wake-on-lan for sh7734 This is based on public datasheet for sh7734 which shows it has the same behavior and registers for WoL as other versions of sh_eth. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 15:55:08 -05:00
Niklas Söderlund	33017e240f	sh_eth: enable wake-on-lan for r8a7740/armadillo Geert Uytterhoeven reported WoL worked on his Armadillo board. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 15:54:37 -05:00
Linus Torvalds	bd5d7428f5	amdgpu, radeon, msm, meson, tilcdc, drm fixes. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJYctBEAAoJEAx081l5xIa+M1kP/Rgu+pt4e+gU95jqXrYvIdQQ ffe0R/d1+Y11KVk67YL3nDZ2lJ+8RouspklRNTbqRntvqqI2UfJLTc5W4ZiRZ2hS 6W5T8x8djm70Az+aoNq+MFSBdyG37+Of1PWSvAsoMdeluFm+Psnt21E7SO/ZItNs UU2/sXoHCeIEc3fHHfwrvZMIfpp0K3087MV25EeWnSXuguXxy7j8kJBy7QwmT7IM lvvd5lVStgOwjumBxeVJtUxhN76NCORlIOS5W5VwF7IY8NpU5vqBaV2GYoLwbv9Y pejkDbvDCF6rEzegosRtzP0MPCngsFwwiIAFuMoVOWc03TZf5GX5yXbo45CBrh0c x+6cztV8IBYkvDiC/gsPgwz0PNfdbGC3HjSO8R25ZnvZU50Dp7j32dZL6Lb0e729 VjWaqXpHucrVi+ugND/HqfCTXJJDbYZcuSwwX8b9DHRNxKNbIBS1tOWLPXkhSpNx 1hHL086cQGQPd7gD7rvLe94jq5DJfBmUp6q9n44Z30cKB9V8HBwaVM5dAV7aGpVB 80t0Y2CMYqYLSZEFRYsWwwqvrr4nIvlkIa4PZjfJ2giWTo/OyeAufB+oDgookwRF hrZBk+VwgnHR1mY6Nfn30DISYMzBRy1I1IHfo71SnC7ZV8UohlsLwrlZnPNifaVD /qPzUeCNGFLK1IVxy3br =21go -----END PGP SIGNATURE----- Merge tag 'drm-fixes-for-v4.10-rc4' of git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "amdgpu, radeon, msm, meson, tilcdc, drm fixes. Just back online for a couple of days, gathered up the remaining fixes pull requests. This contains fixes for a few ARM platforms (msm, tilcdc, meson), and one core atomic fix. The AMD pull has some new hardware support (Polaris12) in it, but this is pretty limited to just hw enablement and shouldn't cause any problems" * tag 'drm-fixes-for-v4.10-rc4' of git://people.freedesktop.org/~airlied/linux: drm/amdgpu: drop verde dpm quirks drm/radeon: drop verde dpm quirks drm/radeon: update smc firmware selection for SI drm/amdgpu: update si kicker smc firmware drm/amd/powerplay: extend smu's response timeout time. 
drm/amdgpu: remove static integer for uvd pp state drm/amd/amdgpu: add Polaris12 PCI ID drm/amdgpu/powerplay: add Polaris12 support drm/amd/amdgpu: add Polaris12 support (v3) MAINTAINERS: Update mailing list for radeon and amdgpu drm/meson: Fix CVBS VDAC disable drm/meson: Fix CVBS initialization when HDMI is configured by bootloader drm: Clean up planes in atomic commit helper failure path drm: tilcdc: simplify the recovery from sync lost error on rev1 drm/meson: Fix plane atomic check when no crtc for the plane drm/msm: Verify that MSM_SUBMIT_BO_FLAGS are set drm/msm: Put back the vaddr in submit_reloc() drm/msm: Ensure that the hardware write pointer is valid	2017-01-09 12:54:20 -08:00
Niklas Söderlund	e410d86d4a	sh_eth: enable wake-on-lan for R-Car Gen2 devices Tested on Gen2 r8a7791/Koelsch. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 15:54:00 -05:00
Niklas Söderlund	d8981d029d	sh_eth: add generic wake-on-lan support via magic packet Add generic functionality to support Wake-on-LAN using MagicPacket, which is supported by at least a few versions of sh_eth. Only add functionality for WoL, no specific sh_eth versions are marked to support WoL yet. WoL is enabled in the suspend callback by setting MagicPacket detection and disabling all interrupts except MagicPacket. In the resume path the driver needs to reset the hardware to rearm the WoL logic; this prevents the driver from simply restoring the registers and taking advantage of the fact that sh_eth was not suspended to reduce resume time. To reset the hardware the driver closes and reopens the device just like it would do in a normal suspend/resume scenario without WoL enabled, but it both closes and opens the device in the resume callback since the device needs to be open for WoL to work. One quirk needed for WoL is that the module clock needs to be prevented from being switched off by Runtime PM. To keep the clock alive the suspend callback needs to call clk_enable() directly to increase the usage count of the clock. Then when Runtime PM decreases the clock usage count it won't reach 0 and be switched off. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 15:54:00 -05:00
Niklas Söderlund	6dcf45e514	sh_eth: use correct name for ECMR_MPDE bit This bit was wrongly named due to a typo, Sergei checked the SH7734/63 manuals and this bit should be named MPDE. Suggested-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 15:53:45 -05:00
Linus Torvalds	756a7334f2	GPIO fixes for the v4.10 series: - Move free:ing of GPIO hogs to after free:ing the device to get rid of a warning state. - A small comile warning fix. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJYc5BKAAoJEEEQszewGV1zLSkP/0DzvkhrTWwyc59eJoDPa8YD FHauEokwOLmTgzaErafeAmbY7BYkNyDLNSupKyQEXg39DTQZZYOeA++VPcZGlnSC Pc5Xb38dy9hbdLvas/QNl6Ft1yqB5q4R3Uamv1LH8D+VRlJxpmm85CBl1LA5R69M O902jrnplxFvF5qux0z3xBwrbFuafgaO9cvOaZyUS7NdwK1QSD45lekczLZmETM+ 9VGZCaAeNJC7oLu+7h99TxVI0ClqciwS7jX+H+G4GwgnwNf/jYpmLxoULKZQK7iY OSx/X5A10MEQESRmsqYHpBlJPvPMVL3YTRWFjpLcCmU+hSTWTja7FzSZlkP0p7Yw 3RtKivYVZM0eW4rpWcVgyydEDx4OVCwkVPpxTGiqc3RRbSm9vakWnKKTa6OvigXH FtRCWSZ2fYnc+SBHsW7qORFmEBEQnXuIoNLhnKD3tthCTiZeG0S+4STXYwJIxic8 RhX30D0nMNY8lzrtQ9Pys8T7PSabu6boE0usZ+1sUxQmabUgsoRRjzyQE8Qd4BYj 4QyJBAdnZ+u/ArAkyeB375g71azebHQuRMw1z36MYNv9l+sM5gZ/f93bOouFwQnF e0ZTNWoTqVdgM38qj9Y7m8UjYVCpCcPFB6uGeD9xurdH61/XgEtvGSlMr6/SQQyO e12eb7CIJjaiokXUCzUN =NGjC -----END PGP SIGNATURE----- Merge tag 'gpio-v4.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: - move freeing of GPIO hogs to after freeing the device to get rid of a warning state. - a small compile warning fix * tag 'gpio-v4.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: Move freeing of GPIO hogs before numbing of the device gpio: mxs: remove __init annotation	2017-01-09 12:50:33 -08:00
David S. Miller	9f2f27a9a5	Merge branch 'icmp-reply-optimize' Jesper Dangaard Brouer says: ==================== net: optimize ICMP-reply code path This patchset is optimizing the ICMP-reply code path, for ICMP packets that get rate limited. A remote party can easily trigger this code path by sending packets to a port number with no listening service. Generally the patchset moves the sysctl_icmp_msgs_per_sec ratelimit checking to earlier in the code path and removes an allocation. Use-case: The specific case where I experienced this being a bottleneck is sending UDP packets to a port with no listener, which results in the kernel replying with ICMP Destination Unreachable (type:3), Port Unreachable (code:3), which causes the bottleneck. After Eric and Paolo optimized the UDP socket code, the kernel's PPS processing capability is lower for no-listen ports than for normal UDP sockets. This is bad for capacity planning when restarting a service. UDP no-listen benchmark 8xCPUs using pktgen_sample04_many_flows.sh: Baseline: 6.6 Mpps Patch: 14.7 Mpps Driver mlx5 at 50Gbit/s. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 15:49:13 -05:00
Jesper Dangaard Brouer	7ba91ecb16	net: for rate-limited ICMP replies save one atomic operation It is possible to avoid the atomic operation in icmp{v6,}_xmit_lock, by checking the sysctl_icmp_msgs_per_sec ratelimit before these calls, as pointed out by Eric Dumazet, but the BH disabled state must be correct. The icmp_global_allow() call states it must be called with BH disabled. This protection was given by the calls icmp_xmit_lock and icmpv6_xmit_lock. Thus, split out local_bh_disable/enable from these functions and maintain it explicitly at callers. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 15:49:12 -05:00
Jesper Dangaard Brouer	c0303efeab	net: reduce cycles spend on ICMP replies that gets rate limited This patch splits the global and per (inet)peer ICMP-reply limiter code, and moves the global limit check to earlier in the packet processing path. Thus, avoid spending cycles on ICMP replies that get limited/suppressed anyhow. The global ICMP rate limiter icmp_global_allow() is a good solution, it just happens too late in the process. The kernel goes through the full route lookup (return path) for the ICMP message, before taking the rate limit decision of not sending the ICMP reply. Details: The kernel's global rate limiter for ICMP messages got added in commit `4cdf507d54` ("icmp: add a global rate limitation"). It is a token bucket limiter with a global lock. It brilliantly avoids locking congestion by only updating when 20ms (HZ/50) have elapsed. It can then avoid taking the lock when credit is exhausted (when under pressure) and the time constraint for refill is not yet met. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 15:49:12 -05:00
Jesper Dangaard Brouer	8d9ba388f3	Revert "icmp: avoid allocating large struct on stack" This reverts commit `9a99d4a50c` ("icmp: avoid allocating large struct on stack"), because struct icmp_bxm is not really a large struct, and allocating and freeing this small 112-byte struct hurts performance. Fixes: `9a99d4a50c` ("icmp: avoid allocating large struct on stack") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 15:49:12 -05:00
David S. Miller	aaa9c1071d	RxRPC rewrite -----BEGIN PGP SIGNATURE----- iQIVAwUAWHNwyPSw1s6N8H32AQKoqw//Wi8fpY/7SlQ8UT0RcF4KlBtfKux4dhMh c4P2ARqEi3hVHz0MAJSYwhJDiXmPT8FboXq7yQmXj7DpkwDUgEHJlOZyoZFrStWC hE72lbwD/m57jYgTG694wJZnGvTtqBEEkoMMIiUTSpEkSxB8aGsL+8dP9E6Q5hBS ixLUHINdjaubsu+uzlI3MZdDk7TWBwp5fNekf4Jbjlb9anoICEkJsjZJHTR9n3nM d9QpEbh42+YHAn2EFL8gXN+Cb7o75QppT3K+b68Pz43yvPgMLd78Q4tSN0aCo190 9ynR1szpniiw3T/xW0dGanpRjKLs7HZubTujc1oQ+TD1Q1Uh+2/nZWb9PxWAAe3S CW+ssn6slv9IS+KXyoIMbDtyPaJOu1pMxYcFVXlZOAPXnYGl8P0A610f8u9833jT OEqVKQ/bHAPiiTl2X/ATzCePhATtoYUq7jIc71pP01WK+o054bzm0r9Wyjxgs7g6 iPi4cfueZFOJMilkE9ZWuIws43YDv5wIEOWtpTkRCIHKCmkeVXkDfdRnnXhJCUeF 6y3iW0staR/pnTqI6g8LEnGku2gbteBQNCueYoJA5jsxLyl6oJw1Bur7yGTzzPnJ SP+9+RBlyGI5EzIcqQWsReOhGY4U/hOWDtltYR/gmlhlQ2o/iO4U1aiN0qa1AiaH 3ixixVygYOA= =H/FD -----END PGP SIGNATURE----- Merge tag 'rxrpc-rewrite-20170109' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== afs: Refcount afs_call struct These patches provide some tracepoints for AFS and fix a potential leak by adding refcounting to the afs_call struct. The patches are: (1) Add some tracepoints for logging incoming calls and monitoring notifications from AF_RXRPC and data reception. (2) Get rid of afs_wait_mode as it didn't turn out to be as useful as initially expected. It can be brought back later if needed. This clears some stuff out that I don't then need to fix up in (4). (3) Allow listen(..., 0) to be used to disable listening. This makes shutting down the AFS cache manager server in the kernel much easier and the accounting simpler as we can then be sure that (a) all preallocated afs_call structs are relesed and (b) no new incoming calls are going to be started. For the moment, listening cannot be reenabled. (4) Add refcounting to the afs_call struct to fix a potential multiple release detected by static checking and add a tracepoint to follow the lifecycle of afs_call objects. 
==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 15:47:52 -05:00
David S. Miller	73517885fc	Merge branch 'dsa_swqitch_ops-const' Florian Fainelli says: ==================== net: dsa: Make dsa_switch_ops const This patch series allows us to annotate dsa_switch_ops with a const qualifier. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 15:44:57 -05:00
Florian Fainelli	a82f67afe8	net: dsa: Make dsa_switch_ops const Now that we have properly encapsulated and made drivers utilize exported functions, we can switch dsa_switch_ops to be annotated with const. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 15:44:50 -05:00
Florian Fainelli	ab3d408d3f	net: dsa: Encapsulate legacy switch drivers into dsa_switch_driver In preparation for making struct dsa_switch_ops const, encapsulate it within a dsa_switch_driver which has a list pointer and a pointer to dsa_switch_ops. This allows us to take the list_head pointer out of dsa_switch_ops, which is written to by {un,}register_switch_driver. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 15:44:50 -05:00
Florian Fainelli	73095cb188	net: dsa: bcm_sf2: Declare our own dsa_switch_ops Utilize the b53 exported functions to fill our bcm_sf2_ops structure, also making it clear what we utilize and what we specifically override. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 15:44:49 -05:00
Florian Fainelli	3117455dd6	net: dsa: b53: Export most operations to other drivers In preparation for making dsa_switch_ops const, export b53 operations utilized by other drivers such as bcm_sf2. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-09 15:44:04 -05:00

649233 Commits All Branches Search
