OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Stanislav Fomichev	f6d08d9d85	bpftool: support cgroup sockopt Support sockopt prog type and cgroup hooks in the bpftool. Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-06-27 15:25:17 -07:00
Stanislav Fomichev	0c51b3697a	bpf: add sockopt documentation Provide user documentation about sockopt prog type and cgroup hooks. v9: * add details about setsockopt context and inheritance v7: * add description for retval=0 and optlen=-1 v6: * describe cgroup chaining, add example v2: * use return code 2 for kernel bypass Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-06-27 15:25:17 -07:00
Stanislav Fomichev	65b4414a05	selftests/bpf: add sockopt test that exercises BPF_F_ALLOW_MULTI sockopt test that verifies chaining behavior. v9: * setsockopt chaining example v7: * rework the test to verify cgroup getsockopt chaining Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-06-27 15:25:17 -07:00
Stanislav Fomichev	8a027dc0d8	selftests/bpf: add sockopt test that exercises sk helpers socktop test that introduces new SOL_CUSTOM sockopt level and stores whatever users sets in sk storage. Whenever getsockopt is called, the original value is retrieved. v9: * SO_SNDBUF example to override user-supplied buffer v7: * use retval=0 and optlen-1 v6: * test 'ret=1' use-case as well (Alexei Starovoitov) v4: * don't call bpf_sk_fullsock helper v3: * drop (__u8 )(long) casts for optval{,_end} v2: new test Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-06-27 15:25:17 -07:00
Stanislav Fomichev	9ec8a4c948	selftests/bpf: add sockopt test Add sockopt selftests: * require proper expected_attach_type * enforce context field read/write access * test bpf_sockopt_handled handler * test EPERM * test limiting optlen from getsockopt * test out-of-bounds access v9: * add tests for setsockopt argument mangling v7: * remove return 2; test retval=0 and optlen=-1 v3: * use DW for optval{,_end} loads v2: * use return code 2 for kernel bypass Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-06-27 15:25:17 -07:00
Stanislav Fomichev	47ac90bbce	selftests/bpf: test sockopt section name Add tests that make sure libbpf section detection works. Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-06-27 15:25:17 -07:00
Stanislav Fomichev	4cdbfb59c4	libbpf: support sockopt hooks Make libbpf aware of new sockopt hooks so it can derive prog type and hook point from the section names. Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-06-27 15:25:17 -07:00
Stanislav Fomichev	aa6ab6471e	bpf: sync bpf.h to tools/ Export new prog type and hook points to the libbpf. Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-06-27 15:25:16 -07:00
Stanislav Fomichev	0d01da6afc	bpf: implement getsockopt and setsockopt hooks Implement new BPF_PROG_TYPE_CGROUP_SOCKOPT program type and BPF_CGROUP_{G,S}ETSOCKOPT cgroup hooks. BPF_CGROUP_SETSOCKOPT can modify user setsockopt arguments before passing them down to the kernel or bypass kernel completely. BPF_CGROUP_GETSOCKOPT can can inspect/modify getsockopt arguments that kernel returns. Both hooks reuse existing PTR_TO_PACKET{,_END} infrastructure. The buffer memory is pre-allocated (because I don't think there is a precedent for working with __user memory from bpf). This might be slow to do for each {s,g}etsockopt call, that's why I've added __cgroup_bpf_prog_array_is_empty that exits early if there is nothing attached to a cgroup. Note, however, that there is a race between __cgroup_bpf_prog_array_is_empty and BPF_PROG_RUN_ARRAY where cgroup program layout might have changed; this should not be a problem because in general there is a race between multiple calls to {s,g}etsocktop and user adding/removing bpf progs from a cgroup. The return code of the BPF program is handled as follows: * 0: EPERM * 1: success, continue with next BPF program in the cgroup chain v9: * allow overwriting setsockopt arguments (Alexei Starovoitov): * use set_fs (same as kernel_setsockopt) * buffer is always kzalloc'd (no small on-stack buffer) v8: * use s32 for optlen (Andrii Nakryiko) v7: * return only 0 or 1 (Alexei Starovoitov) * always run all progs (Alexei Starovoitov) * use optval=0 as kernel bypass in setsockopt (Alexei Starovoitov) (decided to use optval=-1 instead, optval=0 might be a valid input) * call getsockopt hook after kernel handlers (Alexei Starovoitov) v6: * rework cgroup chaining; stop as soon as bpf program returns 0 or 2; see patch with the documentation for the details * drop Andrii's and Martin's Acked-by (not sure they are comfortable with the new state of things) v5: * skip copy_to_user() and put_user() when ret == 0 (Martin Lau) v4: * don't export bpf_sk_fullsock helper (Martin Lau) * size != sizeof(__u64) for uapi pointers (Martin Lau) * offsetof instead of bpf_ctx_range when checking ctx access (Martin Lau) v3: * typos in BPF_PROG_CGROUP_SOCKOPT_RUN_ARRAY comments (Andrii Nakryiko) * reverse christmas tree in BPF_PROG_CGROUP_SOCKOPT_RUN_ARRAY (Andrii Nakryiko) * use __bpf_md_ptr instead of __u32 for optval{,_end} (Martin Lau) * use BPF_FIELD_SIZEOF() for consistency (Martin Lau) * new CG_SOCKOPT_ACCESS macro to wrap repeated parts v2: * moved bpf_sockopt_kern fields around to remove a hole (Martin Lau) * aligned bpf_sockopt_kern->buf to 8 bytes (Martin Lau) * bpf_prog_array_is_empty instead of bpf_prog_array_length (Martin Lau) * added [0,2] return code check to verifier (Martin Lau) * dropped unused buf[64] from the stack (Martin Lau) * use PTR_TO_SOCKET for bpf_sockopt->sk (Martin Lau) * dropped bpf_target_off from ctx rewrites (Martin Lau) * use return code for kernel bypass (Martin Lau & Andrii Nakryiko) Cc: Andrii Nakryiko <andriin@fb.com> Cc: Martin Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-06-27 15:25:16 -07:00
Daniel Borkmann	3b1c667e47	Merge branch 'bpf-af-xdp-mlx5e' Tariq Toukan says: ==================== This series contains improvements to the AF_XDP kernel infrastructure and AF_XDP support in mlx5e. The infrastructure improvements are required for mlx5e, but also some of them benefit to all drivers, and some can be useful for other drivers that want to implement AF_XDP. The performance testing was performed on a machine with the following configuration: - 24 cores of Intel Xeon E5-2620 v3 @ 2.40 GHz - Mellanox ConnectX-5 Ex with 100 Gbit/s link The results with retpoline disabled, single stream: txonly: 33.3 Mpps (21.5 Mpps with queue and app pinned to the same CPU) rxdrop: 12.2 Mpps l2fwd: 9.4 Mpps The results with retpoline enabled, single stream: txonly: 21.3 Mpps (14.1 Mpps with queue and app pinned to the same CPU) rxdrop: 9.9 Mpps l2fwd: 6.8 Mpps v2 changes: Added patches for mlx5e and addressed the comments for v1. Rebased for bpf-next. v3 changes: Rebased for the newer bpf-next, resolved conflicts in libbpf. Addressed Björn's comments for coding style. Fixed a bug in error handling flow in mlx5e_open_xsk. v4 changes: UAPI is not changed, XSK RX queues are exposed to the kernel. The lower half of the available amount of RX queues are regular queues, and the upper half are XSK RX queues. The patch "xsk: Extend channels to support combined XSK/non-XSK traffic" was dropped. The final patch was reworked accordingly. Added "net/mlx5e: Attach/detach XDP program safely", as the changes introduced in the XSK patch base on the stuff from this one. Added "libbpf: Support drivers with non-combined channels", which aligns the condition in libbpf with the condition in the kernel. Rebased over the newer bpf-next. v5 changes: In v4, ethtool reports the number of channels as 'combined' and the number of XSK RX queues as 'rx' for mlx5e. It was changed, so that 'rx' is 0, and 'combined' reports the double amount of channels if there is an active UMEM - to make libbpf happy. The patch for libbpf was dropped. Although it's still useful and fixes things, it raises some disagreement, so I'm dropping it - it's no longer useful for mlx5e anymore after the change above. v6 changes: As Maxim is out of office, I rebased the series on behalf of him, solved some conflicts, and re-spinned. ==================== Acked-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-27 22:53:29 +02:00
Maxim Mikityanskiy	db05815b36	net/mlx5e: Add XSK zero-copy support This commit adds support for AF_XDP zero-copy RX and TX. We create a dedicated XSK RQ inside the channel, it means that two RQs are running simultaneously: one for non-XSK traffic and the other for XSK traffic. The regular and XSK RQs use a single ID namespace split into two halves: the lower half is regular RQs, and the upper half is XSK RQs. When any zero-copy AF_XDP socket is active, changing the number of channels is not allowed, because it would break to mapping between XSK RQ IDs and channels. XSK requires different page allocation and release routines. Such functions as mlx5e_{alloc,free}_rx_mpwqe and mlx5e_{get,put}_rx_frag are generic enough to be used for both regular and XSK RQs, and they use the mlx5e_page_{alloc,release} wrappers around the real allocation functions. Function pointers are not used to avoid losing the performance with retpolines. Wherever it's certain that the regular (non-XSK) page release function should be used, it's called directly. Only the stats that could be meaningful for XSK are exposed to the userspace. Those that don't take part in the XSK flow are not considered. Note that we don't wait for WQEs on the XSK RQ (unlike the regular RQ), because the newer xdpsock sample doesn't provide any Fill Ring entries at the setup stage. We create a dedicated XSK SQ in the channel. This separation has its advantages: 1. When the UMEM is closed, the XSK SQ can also be closed and stop receiving completions. If an existing SQ was used for XSK, it would continue receiving completions for the packets of the closed socket. If a new UMEM was opened at that point, it would start getting completions that don't belong to it. 2. Calculating statistics separately. When the userspace kicks the TX, the driver triggers a hardware interrupt by posting a NOP to a dedicated XSK ICO (internal control operations) SQ, in order to trigger NAPI on the right CPU core. This XSK ICO SQ is protected by a spinlock, as the userspace application may kick the TX from any core. Store the pointers to the UMEMs in the net device private context, independently from the kernel. This way the driver can distinguish between the zero-copy and non-zero-copy UMEMs. The kernel function xdp_get_umem_from_qid does not care about this difference, but the driver is only interested in zero-copy UMEMs, particularly, on the cleanup it determines whether to close the XSK RQ and SQ or not by looking at the presence of the UMEM. Use state_lock to protect the access to this area of UMEM pointers. LRO isn't compatible with XDP, but there may be active UMEMs while XDP is off. If this is the case, don't allow LRO to ensure XDP can be reenabled at any time. The validation of XSK parameters typically happens when XSK queues open. However, when the interface is down or the XDP program isn't set, it's still possible to have active AF_XDP sockets and even to open new, but the XSK queues will be closed. To cover these cases, perform the validation also in these flows: 1. A new UMEM is registered, but the XSK queues aren't going to be created due to missing XDP program or interface being down. 2. MTU changes while there are UMEMs registered. Having this early check prevents mlx5e_open_channels from failing at a later stage, where recovery is impossible and the application has no chance to handle the error, because it got the successful return value for an MTU change or XSK open operation. The performance testing was performed on a machine with the following configuration: - 24 cores of Intel Xeon E5-2620 v3 @ 2.40 GHz - Mellanox ConnectX-5 Ex with 100 Gbit/s link The results with retpoline disabled, single stream: txonly: 33.3 Mpps (21.5 Mpps with queue and app pinned to the same CPU) rxdrop: 12.2 Mpps l2fwd: 9.4 Mpps The results with retpoline enabled, single stream: txonly: 21.3 Mpps (14.1 Mpps with queue and app pinned to the same CPU) rxdrop: 9.9 Mpps l2fwd: 6.8 Mpps Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-27 22:53:28 +02:00
Maxim Mikityanskiy	32a2365397	net/mlx5e: Move queue param structs to en/params.h structs mlx5e_{rq,sq,cq,channel}_param are going to be used in the upcoming XSK RX and TX patches. Move them to a header file to make them accessible from other C files. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-27 22:53:28 +02:00
Maxim Mikityanskiy	0a06382fa4	net/mlx5e: Encapsulate open/close queues into a function Create new functions mlx5e_{open,close}_queues to encapsulate opening and closing RQs and SQs, and call the new functions from mlx5e_{open,close}_channel. It simplifies the existing functions a bit and prepares them for the upcoming AF_XDP changes. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-27 22:53:28 +02:00
Maxim Mikityanskiy	a011b49f4e	net/mlx5e: Consider XSK in XDP MTU limit calculation Use the existing mlx5e_get_linear_rq_headroom function to calculate the headroom for mlx5e_xdp_max_mtu. This function takes the XSK headroom into consideration, which will be used in the following patches. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-27 22:53:28 +02:00
Maxim Mikityanskiy	84a0a2310d	net/mlx5e: XDP_TX from UMEM support When an XDP program returns XDP_TX, and the RQ is XSK-enabled, it requires careful handling, because convert_to_xdp_frame creates a new page and copies the data there, while our driver expects the xdp_frame to point to the same memory as the xdp_buff. Handle this case separately: map the page, and in the end unmap it and call xdp_return_frame. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-27 22:53:27 +02:00
Maxim Mikityanskiy	b9673cf555	net/mlx5e: Share the XDP SQ for XDP_TX between RQs Put the XDP SQ that is used for XDP_TX into the channel. It used to be a part of the RQ, but with introduction of AF_XDP there will be one more RQ that could share the same XDP SQ. This patch is a preparation for that change. Separate XDP_TX statistics per RQ were implemented in one of the previous patches. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-27 22:53:27 +02:00
Maxim Mikityanskiy	d963fa1511	net/mlx5e: Refactor struct mlx5e_xdp_info Currently, struct mlx5e_xdp_info has some issues that have to be cleaned up before the upcoming AF_XDP support makes things too complicated and messy. This structure is used both when sending the packet and on completion. Moreover, the cleanup procedure on completion depends on the origin of the packet (XDP_REDIRECT, XDP_TX). Adding AF_XDP support will add new flows that use this structure even differently. To avoid overcomplicating the code, this commit refactors the usage of this structure in the following ways: 1. struct mlx5e_xdp_info is split into two different structures. One is struct mlx5e_xdp_xmit_data, a transient structure that doesn't need to be stored and is only used while sending the packet. The other is still struct mlx5e_xdp_info that is stored in a FIFO and contains the fields needed on completion. 2. The fields of struct mlx5e_xdp_info that are used in different flows are put into a union. A special enum indicates the cleanup mode and helps choose the right union member. This approach is clear and explicit. Although it could be possible to "guess" the mode by looking at the values of the fields and at the XDP SQ type, it wouldn't be that clear and extendable and would require looking through the whole chain to understand what's going on. For the reference, there are the fields of struct mlx5e_xdp_info that are used in different flows (including AF_XDP ones): Packet origin \| Fields used on completion \| Cleanup steps -----------------------+---------------------------+------------------ XDP_REDIRECT, \| xdpf, dma_addr \| DMA unmap and XDP_TX from XSK RQ \| \| xdp_return_frame. -----------------------+---------------------------+------------------ XDP_TX from regular RQ \| di \| Recycle page. -----------------------+---------------------------+------------------ AF_XDP TX \| (none) \| Increment the \| \| producer index in \| \| Completion Ring. On send, the same set of mlx5e_xdp_xmit_data fields is used in all flows: DMA and virtual addresses and length. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-27 22:53:27 +02:00
Maxim Mikityanskiy	ed084fb604	net/mlx5e: Allow ICO SQ to be used by multiple RQs Prepare to creation of the XSK RQ, which will require posting UMRs, too. The same ICO SQ will be used for both RQs and also to trigger interrupts by posting NOPs. UMR WQEs can't be reused any more. Optimization introduced in commit `ab966d7e4f` ("net/mlx5e: RX, Recycle buffer of UMR WQEs") is reverted. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-27 22:53:27 +02:00
Maxim Mikityanskiy	a069e977d6	net/mlx5e: Calculate linear RX frag size considering XSK Additional conditions introduced: - XSK implies XDP. - Headroom includes the XSK headroom if it exists. - No space is reserved for struct shared_skb_info in XSK mode. - Fragment size smaller than the XSK chunk size is not allowed. A new auxiliary function mlx5e_get_linear_rq_headroom with the support for XSK is introduced. Use this function in the implementation of mlx5e_get_rq_headroom. Change headroom to u32 to match the headroom field in struct xdp_umem. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-27 22:53:27 +02:00
Maxim Mikityanskiy	6ed9350fe0	net/mlx5e: Replace deprecated PCI_DMA_TODEVICE The PCI API for DMA is deprecated, and PCI_DMA_TODEVICE is just defined to DMA_TO_DEVICE for backward compatibility. Just use DMA_TO_DEVICE. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-27 22:53:27 +02:00
Maxim Mikityanskiy	4bce4e5cb6	xsk: Return the whole xdp_desc from xsk_umem_consume_tx Some drivers want to access the data transmitted in order to implement acceleration features of the NICs. It is also useful in AF_XDP TX flow. Change the xsk_umem_consume_tx API to return the whole xdp_desc, that contains the data pointer, length and DMA address, instead of only the latter two. Adapt the implementation of i40e and ixgbe to this change. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Cc: Björn Töpel <bjorn.topel@intel.com> Cc: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-27 22:53:27 +02:00
Maxim Mikityanskiy	123e8da1d3	xsk: Change the default frame size to 4096 and allow controlling it The typical XDP memory scheme is one packet per page. Change the AF_XDP frame size in libbpf to 4096, which is the page size on x86, to allow libbpf to be used with the drivers with the packet-per-page scheme. Add a command line option -f to xdpsock to allow to specify a custom frame size. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-27 22:53:26 +02:00
Maxim Mikityanskiy	2761ed4b6e	libbpf: Support getsockopt XDP_OPTIONS Query XDP_OPTIONS in libbpf to determine if the zero-copy mode is active or not. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-27 22:53:26 +02:00
Maxim Mikityanskiy	2640d3c812	xsk: Add getsockopt XDP_OPTIONS Make it possible for the application to determine whether the AF_XDP socket is running in zero-copy mode. To achieve this, add a new getsockopt option XDP_OPTIONS that returns flags. The only flag supported for now is the zero-copy mode indicator. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-27 22:53:26 +02:00
Maxim Mikityanskiy	d57d76428a	xsk: Add API to check for available entries in FQ Add a function that checks whether the Fill Ring has the specified amount of descriptors available. It will be useful for mlx5e that wants to check in advance, whether it can allocate a bulk of RX descriptors, to get the best performance. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-27 22:53:26 +02:00
Maxim Mikityanskiy	e18953240d	net/mlx5e: Attach/detach XDP program safely When an XDP program is set, a full reopen of all channels happens in two cases: 1. When there was no program set, and a new one is being set. 2. When there was a program set, but it's being unset. The full reopen is necessary, because the channel parameters may change if XDP is enabled or disabled. However, it's performed in an unsafe way: if the new channels fail to open, the old ones are already closed, and the interface goes down. Use the safe way to switch channels instead. The same way is already used for other configuration changes. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-27 22:53:26 +02:00
Roman Gushchin	e5c891a349	bpf: fix cgroup bpf release synchronization Since commit `4bfc0bb2c6` ("bpf: decouple the lifetime of cgroup_bpf from cgroup itself"), cgroup_bpf release occurs asynchronously (from a worker context), and before the release of the cgroup itself. This introduced a previously non-existing race between the release and update paths. E.g. if a leaf's cgroup_bpf is released and a new bpf program is attached to the one of ancestor cgroups at the same time. The race may result in double-free and other memory corruptions. To fix the problem, let's protect the body of cgroup_bpf_release() with cgroup_mutex, as it was effectively previously, when all this code was called from the cgroup release path with cgroup mutex held. Also let's skip cgroups, which have no chances to invoke a bpf program, on the update path. If the cgroup bpf refcnt reached 0, it means that the cgroup is offline (no attached processes), and there are no associated sockets left. It means there is no point in updating effective progs array! And it can lead to a leak, if it happens after the release. So, let's skip such cgroups. Big thanks for Tejun Heo for discovering and debugging of this problem! Fixes: `4bfc0bb2c6` ("bpf: decouple the lifetime of cgroup_bpf from cgroup itself") Reported-by: Tejun Heo <tj@kernel.org> Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-06-27 22:51:58 +02:00
David S. Miller	d7ee287827	Generic DIM From: Tal Gilboa and Yamin Fridman Implement net DIM over a generic DIM library, add RDMA DIM dim.h lib exposes an implementation of the DIM algorithm for dynamically-tuned interrupt moderation for networking interfaces. We want a similar functionality for other protocols, which might need to optimize interrupts differently. Main motivation here is DIM for NVMf storage protocol. Current DIM implementation prioritizes reducing interrupt overhead over latency. Also, in order to reduce DIM's own overhead, the algorithm might take some time to identify it needs to change profiles. While this is acceptable for networking, it might not work well on other scenarios. Here we propose a new structure to DIM. The idea is to allow a slightly modified functionality without the risk of breaking Net DIM behavior for netdev. We verified there are no degradations in current DIM behavior with the modified solution. Suggested solution: - Common logic is implemented in lib/dim/dim.c - Net DIM (existing) logic is implemented in lib/dim/net_dim.c, which uses the common logic in dim.c - Any new DIM logic will be implemented in "lib/dim/new_dim.c". This new implementation will expose modified versions of profiles, dim_step() and dim_decision(). - DIM API is declared in include/linux/dim.h for all implementations. Pros for this solution are: - Zero impact on existing net_dim implementation and usage - Relatively more code reuse (compared to two separate solutions) - Increased extensibility -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl0SiDAACgkQSD+KveBX +j4mlQgAoabbtxwIapMg62tKhnlr/3hEEyuEjiQRhzMUb5y5zOIJNRL82Nerv4K3 oD7v76emw/Xb1ErPe4yJJRdkpbpabEAFKbqkM0R7xFPNK33ZbphlU1d9YGgUKp1W Uieydn5kfubzi6p7R1TC7c2qW1Wr6rE1wnNJyC/sy8zFEP/OEH0NmgPDyOs7ckWI xaoJ1lhNWMoGN+Qk+fxazd1VtsKdcbc9JvLD2e8ON5qWdnuNiLhawcJVoWzfsCAa V/Tdff+Rj9bVFsi468u20og4EdV3ceAOTzMnhCYpAtYZHVbVk6MFgh87uQYkTrfY sW8BDmWKkRdHtzHd3a3yQiVDnz/iuw== =bAZn -----END PGP SIGNATURE----- Merge tag 'blk-dim-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mamameed says: ==================== Generic DIM From: Tal Gilboa and Yamin Fridman Implement net DIM over a generic DIM library, add RDMA DIM dim.h lib exposes an implementation of the DIM algorithm for dynamically-tuned interrupt moderation for networking interfaces. We want a similar functionality for other protocols, which might need to optimize interrupts differently. Main motivation here is DIM for NVMf storage protocol. Current DIM implementation prioritizes reducing interrupt overhead over latency. Also, in order to reduce DIM's own overhead, the algorithm might take some time to identify it needs to change profiles. While this is acceptable for networking, it might not work well on other scenarios. Here we propose a new structure to DIM. The idea is to allow a slightly modified functionality without the risk of breaking Net DIM behavior for netdev. We verified there are no degradations in current DIM behavior with the modified solution. Suggested solution: - Common logic is implemented in lib/dim/dim.c - Net DIM (existing) logic is implemented in lib/dim/net_dim.c, which uses the common logic in dim.c - Any new DIM logic will be implemented in "lib/dim/new_dim.c". This new implementation will expose modified versions of profiles, dim_step() and dim_decision(). - DIM API is declared in include/linux/dim.h for all implementations. Pros for this solution are: - Zero impact on existing net_dim implementation and usage - Relatively more code reuse (compared to two separate solutions) - Increased extensibility ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-27 12:42:51 -07:00
Christian Lamparter	a653f2f538	net: dsa: qca8k: introduce reset via gpio feature The QCA8337(N) has a RESETn signal on Pin B42 that triggers a chip reset if the line is pulled low. The datasheet says that: "The active low duration must be greater than 10 ms". This can hopefully fix some of the issues related to pin strapping in OpenWrt for the EA8500 which suffers from detection issues after a SoC reset. Please note that the qca8k_probe() function does currently require to read the chip's revision register for identification purposes. Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-27 11:17:30 -07:00
Christian Lamparter	e7dd8a8948	dt-bindings: net: dsa: qca8k: document reset-gpios property This patch documents the qca8k's reset-gpios property that can be used if the QCA8337N ends up in a bad state during reset. Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-27 11:17:30 -07:00
David Ahern	b2c709cce6	ipv6: Convert gateway validation to use fib6_info Gateway validation does not need a dst_entry, it only needs the fib entry to validate the gateway resolution and egress device. So, convert ip6_nh_lookup_table from ip6_pol_route to fib6_table_lookup and ip6_route_check_nh to use fib6_lookup over rt6_lookup. ip6_pol_route is a call to fib6_table_lookup and if successful a call to fib6_select_path. From there the exception cache is searched for an entry or a dst_entry is created to return to the caller. The exception entry is not relevant for gateway validation, so what matters are the calls to fib6_table_lookup and then fib6_select_path. Similarly, rt6_lookup can be replaced with a call to fib6_lookup with RT6_LOOKUP_F_IFACE set in flags. Again, the exception cache search is not relevant, only the lookup with path selection. The primary difference in the lookup paths is the use of rt6_select with fib6_lookup versus rt6_device_match with rt6_lookup. When you remove complexities in the rt6_select path, e.g., 1. saddr is not set for gateway validation, so RT6_LOOKUP_F_HAS_SADDR is not relevant 2. rt6_check_neigh is not called so that removes the RT6_NUD_FAIL_DO_RR return and round-robin logic. the code paths are believed to be equivalent for the given use case - validate the gateway and optionally given the device. Furthermore, it aligns the validation with onlink code path and the lookup path actually used for rx and tx. Adjust the users, ip6_route_check_nh_onlink and ip6_route_check_nh to handle a fib6_info vs a rt6_info when performing validation checks. Existing selftests fib-onlink-tests.sh and fib_tests.sh are used to verify the changes. Signed-off-by: David Ahern <dsahern@gmail.com> Reviewed-by: Wei Wang <weiwan@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-27 11:10:23 -07:00
David S. Miller	5b1bf3f644	Merge branch 'FDB-VLAN-and-PTP-fixes-for-SJA1105-DSA' Vladimir Oltean says: ==================== FDB, VLAN and PTP fixes for SJA1105 DSA This patchset is an assortment of fixes for the net-next version of the sja1105 DSA driver: - Avoid a kernel panic when the driver fails to probe or unregisters - Finish Arnd Bermann's idea of compiling PTP support as part of the main DSA driver and not separately - Better handling of initial port-based VLAN as well as VLANs for dsa_8021q FDB entries - Fix address learning for the SJA1105 P/Q/R/S family - Make static FDB entries persistent across switch resets - Fix reporting of statically-added FDB entries in 'bridge fdb show' ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-27 11:03:22 -07:00
Vladimir Oltean	d763778224	net: dsa: sja1105: Implement is_static for FDB entries on E/T The first generation switches don't tell us through the dynamic config interface whether the dumped FDB entries are static or not (the LOCKEDS bit from P/Q/R/S). However, now that we're keeping a mirror of all 'bridge fdb' commands in the static config, this is an opportunity to compare a dumped FDB entry to the driver's private database. After all, what makes an entry static is that we added it. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-27 11:03:22 -07:00
Vladimir Oltean	b3ee526a88	net: dsa: sja1105: Use correct dsa_8021q VIDs for FDB commands A FDB entry means that "frames that match this VID and DMAC must be forwarded to this port". In the case of dsa_8021q however, the VID is not a single one (and neither two, as my previous patch assumed). The VID can be set either by the CPU port (1 tx_vid), or by any of the other front-panel port (n-1 rx_vid's). Fixes: `93647594d8` ("net: dsa: sja1105: Hide the dsa_8021q VLANs from the bridge fdb command") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-27 11:03:21 -07:00
Vladimir Oltean	17ae655540	net: dsa: sja1105: Populate is_static for FDB entries on P/Q/R/S The reason why this wasn't tackled earlier is that I had hoped I understood the user manual wrong. But unfortunately hacks are required in order to retrieve the static/dynamic nature of FDB entries on SJA1105 P/Q/R/S, since this info is stored in the writeback buffer of the dynamic config command. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-27 11:03:21 -07:00
Vladimir Oltean	4a95078636	net: dsa: sja1105: Add a high-level overview of the dynamic config interface When trying to add support for LOCKEDS (static FDB entries) on SJA1105 P/Q/R/S, at first I didn't remember how the abstraction I created worked, and actually thought it works by mistake. To avoid other people staring at the code and not making much sense out of it, add some comments at the top of the file. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-27 11:03:21 -07:00
Vladimir Oltean	60f6053ff1	net: dsa: sja1105: Back up static FDB entries in kernel memory After commit `8456721dd4` ("net: dsa: sja1105: Add support for configuring address ageing time"), we started to reset the switch rather often (each time the bridge core changes the ageing time on a switch port). The unfortunate reality is that SJA1105 doesn't have any {cold, warm, whatever} reset mode in which it accepts a new configuration stream without flushing the FDB. Instead, in its world, the FDB is an optional part of the static configuration. So we play its game, and do what we also do for VLANs: for each 'bridge fdb' command, we add the FDB entry through the dynamic interface, and we append the in-kernel static config memory with info that we're going to use later, when the next reset command is going to be issued. The result is that 'bridge fdb' commands are now persistent (dynamically learned entries are lost, but that's ok). Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-27 11:03:21 -07:00
Vladimir Oltean	6c56e167cc	net: dsa: sja1105: Make P/Q/R/S learn MAC addresses At the end of the commit `1da7382134` ("net: dsa: sja1105: Add FDB operations for P/Q/R/S series") message, I said that: At the moment only FDB entries installed statically through 'bridge fdb' are visible in the dump callback - the dynamically learned ones are still under investigation. It looks like the reason why they were not visible in 'bridge fdb' was that they were never learned - always flooded. SJA1105 P/Q/R/S manual says about the MAXADDRP[port] field: Specify the maximum number of MAC address dynamically learned from the respective port. It is used to limit the number of learned MAC addresses per port. It looks like not providing a value in the static config (aka providing zeroes) is enough for it to not store the learned addresses in the FDB. For now we divide the 1024 entry FDB "equally" amongst the 5 ports. This may be revisited if the situation calls for that - for now I'm happy that learning works. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-27 11:03:21 -07:00
Vladimir Oltean	0803948e23	net: dsa: sja1105: Actually implement the P/Q/R/S FDB bits In commit `1da7382134` ("net: dsa: sja1105: Add FDB operations for P/Q/R/S series"), these bits were set in the static config, but apparently they did not do anything. The reason is that the packing accessors for them were part of a patch I forgot to send. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-27 11:03:21 -07:00
Vladimir Oltean	e3502b8297	net: dsa: sja1105: Make vid 1 the default pvid In SJA1105 there is no concept of 'default values' per se, everything needs to be driver-supplied through the static configuration tables. The issue is that the hardware manual says that 'at least the default untagging VLAN' is mandatory to be provided through the static config. But VLAN 0 isn't a very good initial pvid - its use is reserved for priority-tagged frames, and the layers of the stack that care about those already make sure that this VLAN is installed, as can be seen in the message below: 8021q: adding VLAN 0 to HW filter on device swp2 So change the pvid provided through the static configuration to 1, which matches the bridge core's defaults. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-27 11:03:21 -07:00
Vladimir Oltean	29dd908d35	net: dsa: sja1105: Cancel PTP delayed work on unregister Currently when the driver unloads and PTP is enabled, the delayed work that prevents the timecounter from expiring becomes a ticking time bomb. The kernel will schedule the work thread within 60 seconds of driver removal, but the work handler is no longer there, leading to this strange and inconclusive stack trace: [ 64.473112] Unable to handle kernel paging request at virtual address 79746970 [ 64.480340] pgd = 008c4af9 [ 64.483042] [79746970] *pgd=00000000 [ 64.486620] Internal error: Oops: 80000005 [#1] SMP ARM [ 64.491820] Modules linked in: [ 64.494871] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.2.0-rc5-01634-ge3a2773ba9e5 #1246 [ 64.503007] Hardware name: Freescale LS1021A [ 64.507259] PC is at 0x79746970 [ 64.510393] LR is at call_timer_fn+0x3c/0x18c [ 64.514729] pc : [<79746970>] lr : [<c03bd734>] psr: 60010113 [ 64.520965] sp : c1901de0 ip : 00000000 fp : c1903080 [ 64.526163] r10: c1901e38 r9 : ffffe000 r8 : c19064ac [ 64.531363] r7 : 79746972 r6 : e98dd260 r5 : 00000100 r4 : c1a9e4a0 [ 64.537859] r3 : c1900000 r2 : ffffa400 r1 : 79746972 r0 : e98dd260 [ 64.544359] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none [ 64.551460] Control: 10c5387d Table: a8a2806a DAC: 00000051 [ 64.557176] Process swapper/0 (pid: 0, stack limit = 0x1ddb27f0) [ 64.563147] Stack: (0xc1901de0 to 0xc1902000) [ 64.567481] 1de0: eb6a4918 3d60d7c3 c1a9e554 e98dd260 eb6a34c0 c1a9e4a0 ffffa400 c19064ac [ 64.575616] 1e00: ffffe000 c03bd95c c1901e34 c1901e34 eb6a34c0 c1901e30 c1903d00 c186f4c0 [ 64.583751] 1e20: c1906488 29e34000 c1903080 c03bdca4 00000000 eaa6f218 00000000 eb6a45c0 [ 64.591886] 1e40: eb6a45c0 20010193 00000003 c03c0a68 20010193 3f7231be c1903084 00000002 [ 64.600022] 1e60: 00000082 00000001 ffffe000 c1a9e0a4 00000100 c0302298 02b64722 0000000f [ 64.608157] 1e80: c186b3c8 c1877540 c19064ac 0000000a c186b350 ffffa401 c1903d00 c1107348 [ 64.616292] 1ea0: 00200102 c0d87a14 ea823c00 ffffe000 00000012 00000000 00000000 ea810800 [ 64.624427] 1ec0: f0803000 c1876ba8 00000000 c034c784 c18774b8 c039fb50 c1906c90 c1978aac [ 64.632562] 1ee0: f080200c f0802000 c1901f10 c0709ca8 c03091a0 60010013 ffffffff c1901f44 [ 64.640697] 1f00: 00000000 c1900000 c1876ba8 c0301a8c 00000000 000070a0 eb6ac1a0 c031da60 [ 64.648832] 1f20: ffffe000 c19064ac c19064f0 00000001 00000000 c1906488 c1876ba8 00000000 [ 64.656967] 1f40: ffffffff c1901f60 c030919c c03091a0 60010013 ffffffff 00000051 00000000 [ 64.665102] 1f60: ffffe000 c0376aa4 c1a9da37 ffffffff 00000037 3f7231be c1ab20c0 000000cc [ 64.673238] 1f80: c1906488 c1906480 ffffffff 00000037 c1ab20c0 c1ab20c0 00000001 c0376e1c [ 64.681373] 1fa0: c1ab2118 c1700ea8 ffffffff ffffffff 00000000 c1700754 c17dfa40 ebfffd80 [ 64.689509] 1fc0: 00000000 c17dfa40 3f7733be 00000000 00000000 c1700330 00000051 10c0387d [ 64.697644] 1fe0: 00000000 8f000000 410fc075 10c5387d 00000000 00000000 00000000 00000000 [ 64.705788] [<c03bd734>] (call_timer_fn) from [<c03bd95c>] (expire_timers+0xd8/0x144) [ 64.713579] [<c03bd95c>] (expire_timers) from [<c03bdca4>] (run_timer_softirq+0xe4/0x1dc) [ 64.721716] [<c03bdca4>] (run_timer_softirq) from [<c0302298>] (__do_softirq+0x130/0x3c8) [ 64.729854] [<c0302298>] (__do_softirq) from [<c034c784>] (irq_exit+0xbc/0xd8) [ 64.737040] [<c034c784>] (irq_exit) from [<c039fb50>] (__handle_domain_irq+0x60/0xb4) [ 64.744833] [<c039fb50>] (__handle_domain_irq) from [<c0709ca8>] (gic_handle_irq+0x58/0x9c) [ 64.753143] [<c0709ca8>] (gic_handle_irq) from [<c0301a8c>] (__irq_svc+0x6c/0x90) [ 64.760583] Exception stack(0xc1901f10 to 0xc1901f58) [ 64.765605] 1f00: 00000000 000070a0 eb6ac1a0 c031da60 [ 64.773740] 1f20: ffffe000 c19064ac c19064f0 00000001 00000000 c1906488 c1876ba8 00000000 [ 64.781873] 1f40: ffffffff c1901f60 c030919c c03091a0 60010013 ffffffff [ 64.788456] [<c0301a8c>] (__irq_svc) from [<c03091a0>] (arch_cpu_idle+0x38/0x3c) [ 64.795816] [<c03091a0>] (arch_cpu_idle) from [<c0376aa4>] (do_idle+0x1bc/0x298) [ 64.803175] [<c0376aa4>] (do_idle) from [<c0376e1c>] (cpu_startup_entry+0x18/0x1c) [ 64.810707] [<c0376e1c>] (cpu_startup_entry) from [<c1700ea8>] (start_kernel+0x480/0x4ac) [ 64.818839] Code: bad PC value [ 64.821890] ---[ end trace e226ed97b1c584cd ]--- [ 64.826482] Kernel panic - not syncing: Fatal exception in interrupt [ 64.832807] CPU1: stopping [ 64.835501] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G D 5.2.0-rc5-01634-ge3a2773ba9e5 #1246 [ 64.845013] Hardware name: Freescale LS1021A [ 64.849266] [<c0312394>] (unwind_backtrace) from [<c030cc74>] (show_stack+0x10/0x14) [ 64.856972] [<c030cc74>] (show_stack) from [<c0ff4138>] (dump_stack+0xb4/0xc8) [ 64.864159] [<c0ff4138>] (dump_stack) from [<c0310854>] (handle_IPI+0x3bc/0x3dc) [ 64.871519] [<c0310854>] (handle_IPI) from [<c0709ce8>] (gic_handle_irq+0x98/0x9c) [ 64.879050] [<c0709ce8>] (gic_handle_irq) from [<c0301a8c>] (__irq_svc+0x6c/0x90) [ 64.886489] Exception stack(0xea8cbf60 to 0xea8cbfa8) [ 64.891514] bf60: 00000000 0000307c eb6c11a0 c031da60 ffffe000 c19064ac c19064f0 00000002 [ 64.899649] bf80: 00000000 c1906488 c1876ba8 00000000 00000000 ea8cbfb0 c030919c c03091a0 [ 64.907780] bfa0: 600d0013 ffffffff [ 64.911250] [<c0301a8c>] (__irq_svc) from [<c03091a0>] (arch_cpu_idle+0x38/0x3c) [ 64.918609] [<c03091a0>] (arch_cpu_idle) from [<c0376aa4>] (do_idle+0x1bc/0x298) [ 64.925967] [<c0376aa4>] (do_idle) from [<c0376e1c>] (cpu_startup_entry+0x18/0x1c) [ 64.933496] [<c0376e1c>] (cpu_startup_entry) from [<803025cc>] (0x803025cc) [ 64.940422] Rebooting in 3 seconds.. In this case, what happened is that the DSA driver failed to probe at boot time due to a PHY issue during phylink_connect_phy: [ 2.245607] fsl-gianfar soc:ethernet@2d90000 eth2: error -19 setting up slave phy [ 2.258051] sja1105 spi0.1: failed to create slave for port 0.0 Fixes: `bb77f36ac2` ("net: dsa: sja1105: Add support for the PTP clock") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-27 11:03:21 -07:00
Vladimir Oltean	3d64ea387c	net: dsa: sja1105: Build PTP support in main DSA driver As Arnd Bergmann pointed out in commit `78fe8a28fb` ("net: dsa: sja1105: fix ptp link error"), there is no point in having PTP support as a separate loadable kernel module. So remove the exported symbols and make sja1105.ko contain PTP support or not based on CONFIG_NET_DSA_SJA1105_PTP. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-27 11:03:21 -07:00
David S. Miller	c881e10e3f	Merge branch 'net-dsa-microchip-Convert-to-regmap' Marek Vasut says: ==================== net: dsa: microchip: Convert to regmap This patchset converts KSZ9477 switch driver to regmap. This was tested with extra patches on KSZ8795. This was also tested on KSZ9477 on Microchip KSZ9477EVB board, which I now have. ==================== Signed-off-by: Marek Vasut <marex@denx.de>	2019-06-27 11:00:32 -07:00
Marek Vasut	d4bcd99cd9	net: dsa: microchip: Replace ad-hoc bit manipulation with regmap Regmap provides bit manipulation functions to set/clear bits, use those insted of reimplementing them. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Tristram Ha <Tristram.Ha@microchip.com> Cc: Woojung Huh <Woojung.Huh@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-27 11:00:32 -07:00
Marek Vasut	255b59ad0d	net: dsa: microchip: Factor out regmap config generation into common header The regmap config tables are rather similar for various generations of the KSZ8xxx/KSZ9xxx switches. Introduce a macro which allows generating those tables without duplication. Note that $regalign parameter is not used right now, but will be used in KSZ87xx series switches. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Tristram Ha <Tristram.Ha@microchip.com> Cc: Woojung Huh <Woojung.Huh@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-27 11:00:32 -07:00
Marek Vasut	ee394fea6f	net: dsa: microchip: Dispose of ksz_io_ops Since the driver now uses regmap , get rid of ad-hoc ksz_io_ops abstraction, which no longer has any meaning. Moreover, since regmap has it's own locking, get rid of the register access mutex. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Tristram Ha <Tristram.Ha@microchip.com> Cc: Woojung Huh <Woojung.Huh@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-27 11:00:31 -07:00
Marek Vasut	46558d601c	net: dsa: microchip: Initial SPI regmap support Add basic SPI regmap support into the driver. Previous patches unconver that ksz_spi_write() is always ever called with len = 1, 2 or 4. We can thus drop the if (len > SPI_TX_BUF_LEN) check and we can also drop the allocation of the txbuf which is part of the driver data and wastes 256 bytes for no reason. Regmap covers the whole thing now. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Tristram Ha <Tristram.Ha@microchip.com> Cc: Woojung Huh <Woojung.Huh@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-27 11:00:31 -07:00
Marek Vasut	ff509dab43	net: dsa: microchip: Factor out register access opcode generation Factor out the code which sends out the register read/write opcodes to the switch, since the code differs in single bit between read and write. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Tristram Ha <Tristram.Ha@microchip.com> Cc: Woojung Huh <Woojung.Huh@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-27 11:00:31 -07:00
Marek Vasut	5ce9676e8b	net: dsa: microchip: Use PORT_CTRL_ADDR() instead of indirect function call The indirect function call to dev->dev_ops->get_port_addr() is expensive especially if called for every single register access, and only returns the value of PORT_CTRL_ADDR() macro. Use PORT_CTRL_ADDR() macro directly instead. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Tristram Ha <Tristram.Ha@microchip.com> Cc: Woojung Huh <Woojung.Huh@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-27 11:00:31 -07:00
Marek Vasut	bafea01f65	net: dsa: microchip: Move ksz_cfg and ksz_port_cfg to ksz9477.c These functions are only used by the KSZ9477 code, move them from the header into that code. Note that these functions will be soon replaced by regmap equivalents. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Tristram Ha <Tristram.Ha@microchip.com> Cc: Woojung Huh <Woojung.Huh@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-27 11:00:31 -07:00

... 3 4 5 6 7 ...

843416 Commits All Branches Search

843416 Commits

All Branches