OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Edward Cree	f2ed621fad	sfc: return errors from efx_mcdi_set_id_led, and de-indirect W=1 warnings indicated that 'rc' was unused in efx_mcdi_set_id_led(); change the function to return int instead of void and plumb the rc through the caller efx_ethtool_phys_id(). Since (post-Falcon) all sfc NICs use MCDI for this, there's no point in indirecting through a nic_type method, so remove that and just call efx_mcdi_set_id_led() directly. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-31 12:28:50 -07:00
Edward Cree	b1d11fdbe5	sfc: fix kernel-doc on struct efx_loopback_state Missing 'struct' keyword caused "cannot understand function prototype" warnings. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-31 12:28:50 -07:00
Edward Cree	b6d96931ca	sfc: fix unused-but-set-variable warning in efx_farch_filter_remove_safe Thanks to some past refactor, 'spec' is not actually used in this function; the code using it moved to the callee efx_farch_filter_remove. Remove the variable to fix a W=1 warning. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-31 12:28:50 -07:00
Edward Cree	35ff765f8d	sfc: fix W=1 warnings in efx_farch_handle_rx_not_ok Some of these RX-event flags aren't used at all, so remove them. Others are used only #ifdef DEBUG to log a message; suppress the unused-var warnings #ifndef DEBUG with a void cast. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-31 12:28:50 -07:00
David S. Miller	bd10d45905	Merge branch 'Add-ip6_fragment-in-ipv6_stub' wenxu says: ==================== Add ip6_fragment in ipv6_stub Add ip6_fragment in ipv6_stub and use it in openvswitch This version add default function eafnosupport_ipv6_fragment ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-31 12:26:39 -07:00
wenxu	a7c978c6c9	openvswitch: using ip6_fragment in ipv6_stub Using ipv6_stub->ipv6_fragment to avoid the netfilter dependency Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-31 12:26:39 -07:00
wenxu	1d97898b36	ipv6: add ipv6_fragment hook in ipv6_stub Add ipv6_fragment to ipv6_stub to avoid calling netfilter when accessing ip6_fragment. Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-31 12:26:39 -07:00
David S. Miller	353ff8ccad	Merge branch 'gtp-minor-enhancements' Nicolas Dichtel says: ==================== gtp: minor enhancements The first patch removes a useless rcu lock and the second relaxes alloc constraints when a PDP context is added. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-31 12:24:35 -07:00
Nicolas Dichtel	151ea46f3d	gtp: relax alloc constraint when adding a pdp When a PDP context is added, the rtnl lock is held, thus no need to force a GFP_ATOMIC. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-31 12:24:34 -07:00
Nicolas Dichtel	e2d1baca2b	gtp: remove useless rcu_read_lock() The rtnl lock is taken just the line above, no need to take the rcu also. Fixes: `1788b8569f` ("gtp: fix use-after-free in gtp_encap_destroy()") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-31 12:24:34 -07:00
Russell King	e859a60add	net: phylink: avoid oops during initialisation If we intend to use PCS operations, mac_pcs_get_state() will not be implemented, so will be NULL. If we also intend to register the PCS operations in mac_prepare() or mac_config(), then this leads to an attempt to call NULL function pointer during phylink_start(). Avoid this, but we must report the link is down. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-31 12:23:16 -07:00
David S. Miller	3b44c79360	Merge branch 'hinic-add-debugfs-support' Luo bin says: ==================== hinic: add debugfs support add debugfs node for querying sq/rq info and function table ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-31 12:21:27 -07:00
Luo bin	5215e16244	hinic: add support to query function table add debugfs node for querying function table, for example: cat /sys/kernel/debug/hinic/0000:15:00.0/func_table/valid Signed-off-by: Luo bin <luobin9@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-31 12:21:27 -07:00
Luo bin	626f060311	hinic: add support to query rq info add debugfs node for querying rq info, for example: cat /sys/kernel/debug/hinic/0000:15:00.0/RQs/0x0/rq_hw_pi Signed-off-by: Luo bin <luobin9@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-31 12:21:27 -07:00
Luo bin	253ac3a979	hinic: add support to query sq info add debugfs node for querying sq info, for example: cat /sys/kernel/debug/hinic/0000:15:00.0/SQs/0x0/sq_pi Signed-off-by: Luo bin <luobin9@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-31 12:21:26 -07:00
Magnus Karlsson	acabf32805	xsk: Documentation for XDP_SHARED_UMEM between queues and netdevs Add documentation for the XDP_SHARED_UMEM feature when a UMEM is shared between different queues and/or netdevs. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-16-git-send-email-magnus.karlsson@intel.com	2020-08-31 21:18:00 +02:00
Cristian Dumitrescu	35149b2c04	samples/bpf: Add new sample xsk_fwd.c This sample code illustrates the packet forwarding between multiple AF_XDP sockets in multi-threading environment. All the threads and sockets are sharing a common buffer pool, with each socket having its own private buffer cache. The sockets are created with the xsk_socket__create_shared() function, which allows multiple AF_XDP sockets to share the same UMEM object. Example 1: Single thread handling two sockets. Packets received from socket A (on top of interface IFA, queue QA) are forwarded to socket B (on top of interface IFB, queue QB) and vice-versa. The thread is affinitized to CPU core C: ./xsk_fwd -i IFA -q QA -i IFB -q QB -c C Example 2: Two threads, each handling two sockets. Packets from socket A are sent to socket B (by thread X), packets from socket B are sent to socket A (by thread X); packets from socket C are sent to socket D (by thread Y), packets from socket D are sent to socket C (by thread Y). The two threads are bound to CPU cores CX and CY: ./xsk_fwd -i IFA -q QA -i IFB -q QB -i IFC -q QC -i IFD -q QD -c CX -c CY Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-15-git-send-email-magnus.karlsson@intel.com	2020-08-31 21:17:55 +02:00
Magnus Karlsson	2f6324a393	libbpf: Support shared umems between queues and devices Add support for shared umems between hardware queues and devices to the AF_XDP part of libbpf. This is so that zero-copy can be achieved in applications that want to send and receive packets between HW queues on one device or between different devices/netdevs. In order to create sockets that share a umem between hardware queues and devices, a new function has been added called xsk_socket__create_shared(). It takes the same arguments as xsk_socket__create() plus references to a fill ring and a completion ring. So for every socket that shares a umem, you need to have one more set of fill and completion rings. This is in order to maintain the single-producer single-consumer semantics of the rings. You can create all the sockets via the new xsk_socket__create_shared() call, or create the first one with xsk_socket__create() and the rest with xsk_socket__create_shared(). Both methods work. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-14-git-send-email-magnus.karlsson@intel.com	2020-08-31 21:15:05 +02:00
Magnus Karlsson	a1132430c2	xsk: Add shared umem support between devices Add support to share a umem between different devices. This mode can be invoked with the XDP_SHARED_UMEM bind flag. Previously, sharing was only supported within the same device. Note that when sharing a umem between devices, just as in the case of sharing a umem between queue ids, you need to create a fill ring and a completion ring and tie them to the socket (with two setsockopts, one for each ring) before you do the bind with the XDP_SHARED_UMEM flag. This so that the single-producer single-consumer semantics of the rings can be upheld. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-13-git-send-email-magnus.karlsson@intel.com	2020-08-31 21:15:04 +02:00
Magnus Karlsson	b5aea28dca	xsk: Add shared umem support between queue ids Add support to share a umem between queue ids on the same device. This mode can be invoked with the XDP_SHARED_UMEM bind flag. Previously, sharing was only supported within the same queue id and device, and you shared one set of fill and completion rings. However, note that when sharing a umem between queue ids, you need to create a fill ring and a completion ring and tie them to the socket before you do the bind with the XDP_SHARED_UMEM flag. This so that the single-producer single-consumer semantics can be upheld. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-12-git-send-email-magnus.karlsson@intel.com	2020-08-31 21:15:04 +02:00
Magnus Karlsson	9647c57b11	xsk: i40e: ice: ixgbe: mlx5: Test for dma_need_sync earlier for better performance Test for dma_need_sync earlier to increase performance. xsk_buff_dma_sync_for_cpu() takes an xdp_buff as parameter and from that the xsk_buff_pool reference is dug out. Perf shows that this dereference causes a lot of cache misses. But as the buffer pool is now sent down to the driver at zero-copy initialization time, we might as well use this pointer directly, instead of going via the xsk_buff and we can do so already in xsk_buff_dma_sync_for_cpu() instead of in xp_dma_sync_for_cpu. This gets rid of these cache misses. Throughput increases with 3% for the xdpsock l2fwd sample application on my machine. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-11-git-send-email-magnus.karlsson@intel.com	2020-08-31 21:15:04 +02:00
Magnus Karlsson	8ef4e27eb3	xsk: Rearrange internal structs for better performance Rearrange the xdp_sock, xdp_umem and xsk_buff_pool structures so that they get smaller and align better to the cache lines. In the previous commits of this patch set, these structs have been reordered with the focus on functionality and simplicity, not performance. This patch improves throughput performance by around 3%. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-10-git-send-email-magnus.karlsson@intel.com	2020-08-31 21:15:04 +02:00
Magnus Karlsson	921b68692a	xsk: Enable sharing of dma mappings Enable the sharing of dma mappings by moving them out from the buffer pool. Instead we put each dma mapped umem region in a list in the umem structure. If dma has already been mapped for this umem and device, it is not mapped again and the existing dma mappings are reused. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-9-git-send-email-magnus.karlsson@intel.com	2020-08-31 21:15:04 +02:00
Magnus Karlsson	7f7ffa4e9c	xsk: Move addrs from buffer pool to umem Replicate the addrs pointer in the buffer pool to the umem. This mapping will be the same for all buffer pools sharing the same umem. In the buffer pool we leave the addrs pointer for performance reasons. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-8-git-send-email-magnus.karlsson@intel.com	2020-08-31 21:15:04 +02:00
Magnus Karlsson	a5aa8e529e	xsk: Move xsk_tx_list and its lock to buffer pool Move the xsk_tx_list and the xsk_tx_list_lock from the umem to the buffer pool. This so that we in a later commit can share the umem between multiple HW queues. There is one xsk_tx_list per device and queue id, so it should be located in the buffer pool. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-7-git-send-email-magnus.karlsson@intel.com	2020-08-31 21:15:04 +02:00
Magnus Karlsson	c2d3d6a474	xsk: Move queue_id, dev and need_wakeup to buffer pool Move queue_id, dev, and need_wakeup from the umem to the buffer pool. This so that we in a later commit can share the umem between multiple HW queues. There is one buffer pool per dev and queue id, so these variables should belong to the buffer pool, not the umem. Need_wakeup is also something that is set on a per napi level, so there is usually one per device and queue id. So move this to the buffer pool too. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-6-git-send-email-magnus.karlsson@intel.com	2020-08-31 21:15:04 +02:00
Magnus Karlsson	7361f9c3d7	xsk: Move fill and completion rings to buffer pool Move the fill and completion rings from the umem to the buffer pool. This so that we in a later commit can share the umem between multiple HW queue ids. In this case, we need one fill and completion ring per queue id. As the buffer pool is per queue id and napi id this is a natural place for it and one umem structure can be shared between these buffer pools. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-5-git-send-email-magnus.karlsson@intel.com	2020-08-31 21:15:04 +02:00
Magnus Karlsson	1c1efc2af1	xsk: Create and free buffer pool independently from umem Create and free the buffer pool independently from the umem. Move these operations that are performed on the buffer pool from the umem create and destroy functions to new create and destroy functions just for the buffer pool. This so that in later commits we can instantiate multiple buffer pools per umem when sharing a umem between HW queues and/or devices. We also eradicate the back pointer from the umem to the buffer pool as this will not work when we introduce the possibility to have multiple buffer pools per umem. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-4-git-send-email-magnus.karlsson@intel.com	2020-08-31 21:15:04 +02:00
Magnus Karlsson	c4655761d3	xsk: i40e: ice: ixgbe: mlx5: Rename xsk zero-copy driver interfaces Rename the AF_XDP zero-copy driver interface functions to better reflect what they do after the replacement of umems with buffer pools in the previous commit. Mostly it is about replacing the umem name from the function names with xsk_buff and also have them take a buffer pool pointer instead of a umem. The various ring functions have also been renamed in the process so that they have the same naming convention as the internal functions in xsk_queue.h. This so that it will be clearer what they do and also for consistency. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-3-git-send-email-magnus.karlsson@intel.com	2020-08-31 21:15:04 +02:00
Magnus Karlsson	1742b3d528	xsk: i40e: ice: ixgbe: mlx5: Pass buffer pool to driver instead of umem Replace the explicit umem reference passed to the driver in AF_XDP zero-copy mode with the buffer pool instead. This in preparation for extending the functionality of the zero-copy mode so that umems can be shared between queues on the same netdev and also between netdevs. In this commit, only an umem reference has been added to the buffer pool struct. But later commits will add other entities to it. These are going to be entities that are different between different queue ids and netdevs even though the umem is shared between them. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-2-git-send-email-magnus.karlsson@intel.com	2020-08-31 21:15:03 +02:00
Johannes Berg	c30a3c957c	netlink: policy: correct validation type check In the policy export for binary attributes I erroneously used a != NLA_VALIDATE_NONE comparison instead of checking for the two possible values, which meant that if a validation function pointer ended up aliasing the min/max as negatives, we'd hit a warning in nla_get_range_unsigned(). Fix this to correctly check for only the two types that should be handled here, i.e. range with or without warn-too-long. Reported-by: syzbot+353df1490da781637624@syzkaller.appspotmail.com Fixes: `8aa26c575f` ("netlink: make NLA_BINARY validation more flexible") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-31 12:01:15 -07:00
Alexei Starovoitov	29523c5e67	bpf: Fix build without BPF_LSM. resolve_btfids doesn't like empty set. Add unused ID when BPF_LSM is off. Fixes: `1e6c62a882` ("bpf: Introduce sleepable BPF programs") Reported-by: Björn Töpel <bjorn.topel@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Song Liu <songliubraving@fb.com> Acked-by: KP Singh <kpsingh@google.com> Link: https://lore.kernel.org/bpf/20200831163132.66521-1-alexei.starovoitov@gmail.com	2020-08-31 20:56:10 +02:00
Alexei Starovoitov	9667305c63	bpf: Fix build without BPF_SYSCALL, but with BPF_JIT. When CONFIG_BPF_SYSCALL is not set, but CONFIG_BPF_JIT=y the kernel build fails: In file included from ../kernel/bpf/trampoline.c:11: ../kernel/bpf/trampoline.c: In function ‘bpf_trampoline_update’: ../kernel/bpf/trampoline.c:220:39: error: ‘call_rcu_tasks_trace’ undeclared ../kernel/bpf/trampoline.c: In function ‘__bpf_prog_enter_sleepable’: ../kernel/bpf/trampoline.c:411:2: error: implicit declaration of function ‘rcu_read_lock_trace’ ../kernel/bpf/trampoline.c: In function ‘__bpf_prog_exit_sleepable’: ../kernel/bpf/trampoline.c:416:2: error: implicit declaration of function ‘rcu_read_unlock_trace’ This is due to: obj-$(CONFIG_BPF_JIT) += trampoline.o obj-$(CONFIG_BPF_JIT) += dispatcher.o There is a number of functions that arch/x86/net/bpf_jit_comp.c is using from these two files, but none of them will be used when only cBPF is on (which is the case for BPF_SYSCALL=n BPF_JIT=y). Add rcu_trace functions to rcupdate_trace.h. The JITed code won't execute them and BPF trampoline logic won't be used without BPF_SYSCALL. Fixes: `1e6c62a882` ("bpf: Introduce sleepable BPF programs") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/bpf/20200831155155.62754-1-alexei.starovoitov@gmail.com	2020-08-31 20:54:57 +02:00
Daniel Borkmann	10496f261e	Merge branch 'bpf-sleepable' Alexei Starovoitov says: ==================== v2->v3: - switched to minimal allowlist approach. Essentially that means that syscall entry, few btrfs allow_error_inject functions, should_fail_bio(), and two LSM hooks: file_mprotect and bprm_committed_creds are the only hooks that allow attaching of sleepable BPF programs. When comprehensive analysis of LSM hooks will be done this allowlist will be extended. - added patch 1 that fixes prototypes of two mm functions to reliably work with error injection. It's also necessary for resolve_btfids tool to recognize these two funcs, but that's secondary. v1->v2: - split fmod_ret fix into separate patch - added denylist v1: This patch set introduces the minimal viable support for sleepable bpf programs. In this patch only fentry/fexit/fmod_ret and lsm progs can be sleepable. Only array and pre-allocated hash and lru maps allowed. Here is 'perf report' difference of sleepable vs non-sleepable: 3.86% bench [k] __srcu_read_unlock 3.22% bench [k] __srcu_read_lock 0.92% bench [k] bpf_prog_740d4210cdcd99a3_bench_trigger_fentry_sleep 0.50% bench [k] bpf_trampoline_10297 0.26% bench [k] __bpf_prog_exit_sleepable 0.21% bench [k] __bpf_prog_enter_sleepable vs 0.88% bench [k] bpf_prog_740d4210cdcd99a3_bench_trigger_fentry 0.84% bench [k] bpf_trampoline_10297 0.13% bench [k] __bpf_prog_enter 0.12% bench [k] __bpf_prog_exit vs 0.79% bench [k] bpf_prog_740d4210cdcd99a3_bench_trigger_fentry_sleep 0.72% bench [k] bpf_trampoline_10381 0.31% bench [k] __bpf_prog_exit_sleepable 0.29% bench [k] __bpf_prog_enter_sleepable Sleepable vs non-sleepable program invocation overhead is only marginally higher due to rcu_trace. srcu approach is much slower. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2020-08-28 21:20:38 +02:00
Alexei Starovoitov	e68a144547	selftests/bpf: Add sleepable tests Modify few tests to sanity test sleepable bpf functionality. Running 'bench trig-fentry-sleep' vs 'bench trig-fentry' and 'perf report': sleepable with SRCU: 3.86% bench [k] __srcu_read_unlock 3.22% bench [k] __srcu_read_lock 0.92% bench [k] bpf_prog_740d4210cdcd99a3_bench_trigger_fentry_sleep 0.50% bench [k] bpf_trampoline_10297 0.26% bench [k] __bpf_prog_exit_sleepable 0.21% bench [k] __bpf_prog_enter_sleepable sleepable with RCU_TRACE: 0.79% bench [k] bpf_prog_740d4210cdcd99a3_bench_trigger_fentry_sleep 0.72% bench [k] bpf_trampoline_10381 0.31% bench [k] __bpf_prog_exit_sleepable 0.29% bench [k] __bpf_prog_enter_sleepable non-sleepable with RCU: 0.88% bench [k] bpf_prog_740d4210cdcd99a3_bench_trigger_fentry 0.84% bench [k] bpf_trampoline_10297 0.13% bench [k] __bpf_prog_enter 0.12% bench [k] __bpf_prog_exit Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: KP Singh <kpsingh@google.com> Link: https://lore.kernel.org/bpf/20200827220114.69225-6-alexei.starovoitov@gmail.com	2020-08-28 21:20:33 +02:00
Alexei Starovoitov	2b288740a1	libbpf: Support sleepable progs Pass request to load program as sleepable via ".s" suffix in the section name. If it happens in the future that all map types and helpers are allowed with BPF_F_SLEEPABLE flag "fmod_ret/" and "lsm/" can be aliased to "fmod_ret.s/" and "lsm.s/" to make all lsm and fmod_ret programs sleepable by default. The fentry and fexit programs would always need to have sleepable vs non-sleepable distinction, since not all fentry/fexit progs will be attached to sleepable kernel functions. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: KP Singh <kpsingh@google.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200827220114.69225-5-alexei.starovoitov@gmail.com	2020-08-28 21:20:33 +02:00
Alexei Starovoitov	07be4c4a3e	bpf: Add bpf_copy_from_user() helper. Sleepable BPF programs can now use copy_from_user() to access user memory. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: KP Singh <kpsingh@google.com> Link: https://lore.kernel.org/bpf/20200827220114.69225-4-alexei.starovoitov@gmail.com	2020-08-28 21:20:33 +02:00
Alexei Starovoitov	1e6c62a882	bpf: Introduce sleepable BPF programs Introduce sleepable BPF programs that can request such property for themselves via BPF_F_SLEEPABLE flag at program load time. In such case they will be able to use helpers like bpf_copy_from_user() that might sleep. At present only fentry/fexit/fmod_ret and lsm programs can request to be sleepable and only when they are attached to kernel functions that are known to allow sleeping. The non-sleepable programs are relying on implicit rcu_read_lock() and migrate_disable() to protect life time of programs, maps that they use and per-cpu kernel structures used to pass info between bpf programs and the kernel. The sleepable programs cannot be enclosed into rcu_read_lock(). migrate_disable() maps to preempt_disable() in non-RT kernels, so the progs should not be enclosed in migrate_disable() as well. Therefore rcu_read_lock_trace is used to protect the life time of sleepable progs. There are many networking and tracing program types. In many cases the 'struct bpf_prog *' pointer itself is rcu protected within some other kernel data structure and the kernel code is using rcu_dereference() to load that program pointer and call BPF_PROG_RUN() on it. All these cases are not touched. Instead sleepable bpf programs are allowed with bpf trampoline only. The program pointers are hard-coded into generated assembly of bpf trampoline and synchronize_rcu_tasks_trace() is used to protect the life time of the program. The same trampoline can hold both sleepable and non-sleepable progs. When rcu_read_lock_trace is held it means that some sleepable bpf program is running from bpf trampoline. Those programs can use bpf arrays and preallocated hash/lru maps. These map types are waiting on programs to complete via synchronize_rcu_tasks_trace(); Updates to trampoline now has to do synchronize_rcu_tasks_trace() and synchronize_rcu_tasks() to wait for sleepable progs to finish and for trampoline assembly to finish. This is the first step of introducing sleepable progs. Eventually dynamically allocated hash maps can be allowed and networking program types can become sleepable too. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: KP Singh <kpsingh@google.com> Link: https://lore.kernel.org/bpf/20200827220114.69225-3-alexei.starovoitov@gmail.com	2020-08-28 21:20:33 +02:00
Alexei Starovoitov	76cd61739f	mm/error_inject: Fix allow_error_inject function signatures. 'static' and 'static noinline' function attributes make no guarantees that gcc/clang won't optimize them. The compiler may decide to inline 'static' function and in such case ALLOW_ERROR_INJECT becomes meaningless. The compiler could have inlined __add_to_page_cache_locked() in one callsite and didn't inline in another. In such case injecting errors into it would cause unpredictable behavior. It's worse with 'static noinline' which won't be inlined, but it still can be optimized. Like the compiler may decide to remove one argument or constant propagate the value depending on the callsite. To avoid such issues make sure that these functions are global noinline. Fixes: `af3b854492` ("mm/page_alloc.c: allow error injection") Fixes: `cfcbfb1382` ("mm/filemap.c: enable error injection at add_to_page_cache()") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/bpf/20200827220114.69225-2-alexei.starovoitov@gmail.com	2020-08-28 21:20:32 +02:00
Alex Dewar	0f091e4331	netlabel: remove unused param from audit_log_format() Commit `d3b990b7f3` ("netlabel: fix problems with mapping removal") added a check to return an error if ret_val != 0, before ret_val is later used in a log message. Now it will unconditionally print "... res=1". So just drop the check. Addresses-Coverity: ("Dead code") Fixes: `d3b990b7f3` ("netlabel: fix problems with mapping removal") Signed-off-by: Alex Dewar <alex.dewar90@gmail.com> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-28 09:08:51 -07:00
David S. Miller	f3fb15b93a	Merge branch 'ionic-memory-usage-rework' Shannon Nelson says: ==================== ionic memory usage rework Previous review comments have suggested [1],[2] that this driver needs to rework how queue resources are managed and reconfigured so that we don't do a full driver reset and to better handle potential allocation failures. This patchset is intended to address those comments. The first few patches clean some general issues and simplify some of the memory structures. The last 4 patches specifically address queue parameter changes without a full ionic_stop()/ionic_open(). [1] https://lore.kernel.org/netdev/20200706103305.182bd727@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com/ [2] https://lore.kernel.org/netdev/20200724.194417.2151242753657227232.davem@davemloft.net/ v3: use PTR_ALIGN without typecast fix up Neel's attribution v2: use PTR_ALIGN recovery if netif_set_real_num_tx/rx_queues fails less racy queue bring up after reconfig common-ize the reconfig queue stop and start ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-28 08:01:30 -07:00
Shannon Nelson	6f7d6f0fd7	ionic: pull reset_queues into tx_timeout handler Convert tx_timeout handler to not do the full reset. As this was the last user of ionic_reset_queues(), we can drop it. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-28 08:01:30 -07:00
Shannon Nelson	101b40a017	ionic: change queue count with no reset Add to our new ionic_reconfigure_queues() to also be able to change the number of queues in use, and to change the queue interrupt layout between split and combined. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-28 08:01:30 -07:00
Shannon Nelson	a34e25ab97	ionic: change the descriptor ring length without full reset The original way of changing ring length was to completely tear down the lif's queue structure and then rebuild it, while running the risk of allocations that might fail in the middle and leave us with a broken driver. Instead, we can set up all the new queue and descriptor allocations first, then swap them out and delete the old allocations. If the new allocations fail, we report the error, stay with the old setup and continue running. This gives us a safer path, and a smaller window of time where we're not processing traffic. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-28 08:01:30 -07:00
Shannon Nelson	f053e1f870	ionic: change mtu without full queue rebuild We really don't need to tear down and rebuild the whole queue structure when changing the MTU; we can simply stop the queues, clean and refill, then restart the queues. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-28 08:01:30 -07:00
Shannon Nelson	f1d2e894f1	ionic: use index not pointer for queue tracking Use index counters rather than pointers for tracking head and tail in the queues to save a little memory and perhaps to make queue processing slightly faster. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-28 08:01:30 -07:00
Shannon Nelson	ea5a8b09dc	ionic: reduce contiguous memory allocation requirement Split out the queue descriptor blocks into separate dma allocations to make for smaller blocks. Co-developed-by: Neel Patel <neel@pensando.io> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-28 08:01:30 -07:00
Shannon Nelson	d4881430f5	ionic: clean up unnecessary non-static functions ionic_open() and ionic_stop() are not referenced outside of their defining file, so make them static. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-28 08:01:30 -07:00
Shannon Nelson	34dec947b9	ionic: rework and simplify handling of the queue stats block Use a block of stats structs attached to the lif instead of little ones attached to each qcq. This simplifies our memory management and gets rid of a lot of unnecessary indirection. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-28 08:01:30 -07:00
Shannon Nelson	30b87ab4c0	ionic: remove lif list concept As we aren't yet supporting multiple lifs, we can remove complexity by removing the list concept and related code, to be re-engineered later when actually needed. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-28 08:01:30 -07:00

949719 Commits All Branches Search