OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
David S. Miller	4cbd7a7d3c	Merge branch 'dsa-Plug-in-PHYLINK-support' Florian Fainelli says: ==================== net: dsa: Plug in PHYLINK support This patch series adds PHYLINK support to DSA which is necessary to support more complex PHY and pluggable modules setups. Patch series can be found here: https://github.com/ffainelli/linux/commits/dsa-phylink-v2 This was tested on: - dsa-loop - bcm_sf2 - mv88e6xxx - b53 With a variety of test cases: - internal & external MDIO PHYs - MoCA with link notification through interrupt/MMIO register - built-in PHYs - ifconfig up/down for several cycles works - bind/unbind of the drivers Changes in v2: - fixed link configuration for mv88e6xxx (Andrew) after introducing polling This is technically v2 of what was posted back in March 2018, changes from last time: - fixed probe/remove of drivers - fixed missing gpiod_put() for link GPIOs - fixed polling of link GPIOs (Russell I would need your SoB on the patch you provided offline initially, added some modifications to it) - tested across a wider set of platforms And everything should still work as expected. Please be aware of the following: - switch drivers (like bcm_sf2) which may have user-facing network ports using fixed links would need to implement phylink_mac_ops to remain functional. PHYLINK does not create a phy_device for fixed links, therefore our call to adjust_link() from phylink_mac_link_{up,down} would not be calling into the driver. This should not affect CPU/DSA ports which are configured through adjust_link() but have no network devices - support for SFP/SFF is now possible, but switch drivers will still need some modifications to properly support those, including, but not limited to using the correct binding information. This will be submitted on top of this series Please do test on your respective platforms/switches and let me know if you find any issues, hopefully everything still works like before. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-11 12:03:06 -04:00
Florian Fainelli	58d56fcc39	net: dsa: bcm_sf2: Get rid of PHYLIB functions Now that we have converted the bcm_sf2 driver to implement PHYLINK MAC operations, we can remove the PHYLIB callbacks: adjust_link() and fixed_link_update() which are no longer called by DSA. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-11 12:03:06 -04:00
Florian Fainelli	aab9c4067d	net: dsa: Plug in PHYLINK support Add support for PHYLINK within the DSA subsystem in order to support more complex devices such as pluggable (SFP) and non-pluggable (SFF) modules, 10G PHYs, and traditional PHYs. Using PHYLINK allows us to drop some amount of complexity we had while probing fixed and non-fixed PHYs using Device Tree. Because PHYLINK separates the Ethernet MAC/port configuration into different stages, we let switch drivers implement those, and for now, we maintain functionality by calling dsa_slave_adjust_link() during phylink_mac_link_{up,down} which provides semantically equivalent steps. Drivers willing to take advantage of PHYLINK should implement the phylink_mac_* operations that DSA wraps. We cannot quite remove the adjust_link() callback just yet, because a number of drivers rely on that for configuring their "CPU" and "DSA" ports; this is still done in dsa_port_setup_phy_of() and dsa_port_fixed_link_register_of(). Drivers that utilize fixed links for user-facing ports (e.g., bcm_sf2) will need to implement phylink_mac_ops from now on to preserve functionality, since PHYLINK does not create a phy_device instance for fixed links. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-11 12:03:06 -04:00
Russell King	c9a2356f35	net: dsa: mv88e6xxx: add PHYLINK support Add rudimentary phylink support to mv88e6xxx. This allows the driver using user ports with fixed links to keep operating normally. User ports with normal PHYs are not affected since the switch automatically manages their link parameters. User facing ports which use a SFP/SFF with a non-fixed link mode might require a call to phylink_mac_change() to operate properly. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> [Andrew: fixed link setting after adding link polling] Signed-off-by: Andrew Lunn <andrew@lunn.ch> [florian: expand commit message] Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-11 12:03:06 -04:00
Florian Fainelli	c4aef9fc0d	net: dsa: Eliminate dsa_slave_get_link() Since we use PHYLIB to manage the per-port link indication, this will also be reflected correctly in the network device's carrier state, so we can use ethtool_op_get_link() instead. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-11 12:03:06 -04:00
Florian Fainelli	bc0cb653a3	net: dsa: bcm_sf2: Implement phylink_mac_ops Make the bcm_sf2 driver implement phylink_mac_ops since it needs to support a wide variety of network interfaces: internal & external MDIO PHYs, fixed PHYs, MoCA with MMIO link status. A large amount of what needs to be done already exists under bcm_sf2_sw_adjust_link() so we are essentially breaking this down into the necessary operation for PHYLINK to work: mac_config, mac_link_up, mac_link_down and validate. We can now entirely get rid of most of what fixed_link_update() provided because only the link information is actually necessary. We still have to force DUPLEX_FULL for legacy Device Tree bindings that did not specify that before. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-11 12:03:05 -04:00
Florian Fainelli	11d8f3ddab	net: dsa: Add PHYLINK switch operations In preparation for adding support for PHYLINK within DSA, define a number of operations that we will need and that switch drivers can start implementing. Proper integration with PHYLINK will follow in subsequent patches. We start selecting PHYLINK (which implies PHYLIB) in net/dsa/Kconfig such that drivers can be guaranteed that this dependency is properly taken care of and can start referencing PHYLINK helper functions without requiring stubs or anything. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-11 12:03:05 -04:00
Russell King	9cd00a8aa4	net: phy: phylink: Poll link GPIOs When using a fixed link with a link GPIO, we need to poll that GPIO to determine link state changes. This is consistent with what fixed_phy.c does. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-11 12:03:05 -04:00
Florian Fainelli	daab3349ad	net: phy: phylink: Release link GPIO We are not releasing the link GPIO descriptor with gpiod_put() which results in subsequent probing to get -EBUSY when calling fwnode_get_named_gpiod(). Fix this by doing the release in phylink_destroy(). Fixes: `9525ae8395` ("phylink: add phylink infrastructure") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-11 12:03:05 -04:00
Florian Fainelli	bb322a9038	net: phy: phylink: Use gpiod_get_value_cansleep() The GPIO provider for the link GPIO line might require the use of the _cansleep() API, utilize that. This is safe to do since we run in workqueue context. Fixes: `9525ae8395` ("phylink: add phylink infrastructure") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-11 12:03:05 -04:00
Andrey Ignatov	1b97013bfb	ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg Fix more memory leaks in ip_cmsg_send() callers. Part of them were fixed earlier in `919483096b`. * udp_sendmsg one was there since the beginning when linux sources were first added to git; * ping_v4_sendmsg one was copy/pasted in `c319b4d76b`. Whenever return happens in udp_sendmsg() or ping_v4_sendmsg() IP options have to be freed if they were allocated previously. Add label so that future callers (if any) can use it instead of kfree() before return that is easy to forget. Fixes: `c319b4d76b` (net: ipv4: add IPPROTO_ICMP socket kind) Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-11 12:00:58 -04:00
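The single-exit cleanup pattern described in the commit above can be illustrated with a small user-space sketch. This is not the kernel code: `opts_sketch`, `sendmsg_sketch()` and the `live_allocs` counter are hypothetical names invented for illustration, and `malloc()`/`free()` stand in for the kernel allocation of IP options.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the IP options allocated by ip_cmsg_send(). */
struct opts_sketch {
    int placeholder;
};

/* Counts outstanding allocations so a leak is easy to observe. */
static int live_allocs;

/*
 * Sketch of a sendmsg-like function. Every return after the allocation
 * goes through the out_free label, so the options cannot be leaked by
 * an early error return.
 */
static int sendmsg_sketch(int want_opts, int fail_early)
{
    struct opts_sketch *opt = NULL;
    int err = 0;

    if (want_opts) {
        opt = malloc(sizeof(*opt));
        if (!opt)
            return -ENOMEM; /* nothing allocated yet, plain return is fine */
        live_allocs++;
    }

    if (fail_early) {
        err = -EINVAL;
        goto out_free; /* instead of a bare return that would leak opt */
    }

    /* ... the normal transmit path would run here ... */

out_free:
    if (opt) {
        free(opt);
        live_allocs--;
    }
    return err;
}
```

Calling `sendmsg_sketch(1, 1)` takes the error path and still leaves `live_allocs` at zero, which is the property the commit restores for the real callers.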
Christophe JAILLET	8ccc113172	mlxsw: core: Fix an error handling path in 'mlxsw_core_bus_device_register()' Resources are not freed in the reverse order of the allocation. Labels are also mixed-up. Fix it and reorder code and labels in the error handling path of 'mlxsw_core_bus_device_register()' Fixes: `ef3116e540` ("mlxsw: spectrum: Register KVD resources with devlink") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-11 11:56:05 -04:00
David S. Miller	89dd2e752c	Merge branch 'bonding-bug-fixes-and-regressions' Debabrata Banerjee says: ==================== bonding: bug fixes and regressions Fixes to bonding driver for balance-alb mode, suitable for stable. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-11 11:50:41 -04:00
Debabrata Banerjee	21706ee8a4	bonding: send learning packets for vlans on slave There was a regression at some point from the intended functionality of commit `f60c3704e8` ("bonding: Fix alb mode to only use first level vlans.") Given the return value of vlan_get_encap_level() we need to store the nest level of the bond device, and then compare the vlan's encap level to this. Without this, this check always fails and learning packets are never sent. In addition, this same commit caused a regression in the behavior of balance_alb, which requires learning packets be sent for all interfaces using the slave's mac in order to load balance properly. For vlans that have not set a user mac, we can send after checking one bit. Otherwise we need to send the set mac, albeit defeating rx load balancing for that vlan. Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-11 11:50:41 -04:00
Debabrata Banerjee	4fa8667ca3	bonding: do not allow rlb updates to invalid mac Make sure multicast, broadcast, and zero mac's cannot be the output of rlb updates, which should all be directed arps. Receive load balancing will be collapsed if any of these happen, as the switch will broadcast. Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-11 11:50:41 -04:00
Steven Rostedt (VMware)	dc432c3d7f	tracing: Fix regex_match_front() to not over compare the test string The regex match function regex_match_front() in the tracing filter logic was fixed to test just the pattern length instead of the entire test string. That is, it went from strncmp(str, r->pattern, len) to strncmp(str, r->pattern, r->len). The issue is that str is not guaranteed to be nul terminated, and if r->len is greater than the length of str, it can access more memory than is allocated. The solution is to add a simple test if (len < r->len) return 0. Cc: stable@vger.kernel.org Fixes: `285caad415` ("tracing/filters: Fix MATCH_FRONT_ONLY filter matching") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2018-05-11 10:56:42 -04:00
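The length guard described in the commit above can be sketched in user-space C as follows. The struct and function names (`regex_sketch`, `match_front_sketch()`) are hypothetical stand-ins, not the kernel definitions; only the `if (len < r->len) return 0` check and the bounded `strncmp()` come from the commit message.

```c
#include <string.h>

/* Hypothetical stand-in for the tracing filter's regex descriptor. */
struct regex_sketch {
    const char *pattern; /* pattern to match at the front of the string */
    int len;             /* length of pattern in bytes */
};

/*
 * Returns 1 if the first r->len bytes of str equal the pattern, else 0.
 * str is not assumed to be NUL-terminated; len is the number of valid
 * bytes in str. The guard ensures the comparison never reads past them.
 */
static int match_front_sketch(const char *str, const struct regex_sketch *r,
                              int len)
{
    if (len < r->len)
        return 0;

    return strncmp(str, r->pattern, (size_t)r->len) == 0;
}
```

With the guard in place, a three-byte buffer without a terminating NUL is rejected before `strncmp()` is reached, so no byte beyond the buffer is read.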
Rafael J. Wysocki	ef050374e1	Merge branches 'pm-pci' and 'pm-docs' * pm-pci: PCI / PM: Check device_may_wakeup() in pci_enable_wake() PCI / PM: Always check PME wakeup capability for runtime wakeup support * pm-docs: PM: docs: intel_pstate: fix Active Mode w/o HWP paragraph PM: docs: sleep-states: Fix a typo ("includig")	2018-05-11 15:17:18 +02:00
Jann Horn	0a0b987344	compat: fix 4-byte infoleak via uninitialized struct field Commit `3a4d44b616` ("ntp: Move adjtimex related compat syscalls to native counterparts") removed the memset() in compat_get_timex(). Since then, the compat adjtimex syscall can invoke do_adjtimex() with an uninitialized ->tai. If do_adjtimex() doesn't write to ->tai (e.g. because the arguments are invalid), compat_put_timex() then copies the uninitialized ->tai field to userspace. Fix it by adding the memset() back. Fixes: `3a4d44b616` ("ntp: Move adjtimex related compat syscalls to native counterparts") Signed-off-by: Jann Horn <jannh@google.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-05-10 17:51:58 -07:00
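Why the memset() matters can be shown with a simplified sketch. The structures below are hypothetical and far smaller than the real timex layouts; the point is only that a conversion routine which copies some fields must zero the destination first, otherwise the fields it skips (here `tai`) keep whatever was in memory and may later be copied out.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily reduced native structure. */
struct timex_sketch {
    int64_t modes;
    int32_t status;
    int32_t tai; /* not supplied by the caller in this sketch */
};

/* Hypothetical, heavily reduced compat input structure. */
struct compat_in_sketch {
    int32_t modes;
    int32_t status;
};

/*
 * Converts the compat input into the native structure. Zeroing dst first
 * guarantees that fields not copied below hold 0 rather than stale data.
 */
static void get_timex_sketch(struct timex_sketch *dst,
                             const struct compat_in_sketch *src)
{
    memset(dst, 0, sizeof(*dst));
    dst->modes = src->modes;
    dst->status = src->status;
}
```

If the destination is pre-filled with a marker pattern and then converted, `tai` reads back as 0; without the memset() it would still hold the marker bytes.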
Dave Airlie	72777fe797	Merge branch 'drm-fixes-4.17' of git://people.freedesktop.org/~agd5f/linux into drm-fixes Single amdgpu regression fix * 'drm-fixes-4.17' of git://people.freedesktop.org/~agd5f/linux: drm/amd/pp: Fix performance drop on Fiji	2018-05-11 10:37:17 +10:00
Daniel Borkmann	a84880ef43	Merge branch 'bpf-perf-rb-libbpf' Jakub Kicinski says: ==================== This series started out as a follow up to the bpftool perf event dumping patches. As suggested by Daniel patch 1 makes use of PERF_SAMPLE_TIME to simplify code and improve accuracy of timestamps. Remaining patches are trying to move perf event loop into libbpf as suggested by Alexei. One user for this new function is bpftool which links with libbpf nicely, the other, unfortunately, is in samples/bpf. Remaining patches make samples/bpf link against full libbpf.a (not just a handful of objects). Once we have full power of libbpf at our disposal we can convert some of XDP samples to use libbpf loader instead of bpf_load.c. My understanding is that this is the desired direction, at least for networking code. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 01:48:32 +02:00
Jakub Kicinski	be5bca44aa	samples: bpf: convert some XDP samples from bpf_load to libbpf Now that we can use full powers of libbpf in BPF samples, we should perhaps make the simplest XDP programs not depend on bpf_load helpers. This way newcomers will be exposed to the recommended library from the start. Use of bpf_prog_load_xattr() will also make it trivial to later on request offload of the programs by simply adding ifindex to the xattr. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 01:44:17 +02:00
Jakub Kicinski	17387dd5ac	tools: bpf: don't complain about no kernel version for networking code BPF programs only have to specify the target kernel version for tracing related hooks, in networking world that requirement does not really apply. Loosen the checks in libbpf to reflect that. bpf_object__open() users will continue to see the error for backward compatibility (and because prog_type is not available there). Error code for NULL file name is changed from ENOENT to EINVAL, as it seems more appropriate, hopefully, that's an OK change. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 01:40:52 +02:00
Jakub Kicinski	2eb57bb8f6	tools: bpf: improve comments in libbpf.h Fix spelling mistakes, improve and clarify the language of comments in libbpf.h. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 01:40:52 +02:00
Jakub Kicinski	d0cabbb021	tools: bpf: move the event reading loop to libbpf There are two copies of event reading loop - in bpftool and trace_helpers "library". Consolidate them and move the code to libbpf. Return codes from trace_helpers are kept, but renamed to include LIBBPF prefix. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 01:40:52 +02:00
Jakub Kicinski	5f9380572b	samples: bpf: compile and link against full libbpf samples/bpf currently cherry-picks object files from tools/lib/bpf to link against. Just compile the full library and link statically against it. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 01:40:52 +02:00
Jakub Kicinski	74662ea5d4	samples: bpf: rename struct bpf_map_def to avoid conflict with libbpf Both tools/lib/bpf/libbpf.h and samples/bpf/bpf_load.h define their own version of struct bpf_map_def. The version in bpf_load.h has more fields. libbpf does not support inner maps and its definition of struct bpf_map_def lacks the related fields. Rename the definition in bpf_load.h (samples/bpf) to avoid conflicts. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 01:40:51 +02:00
Jakub Kicinski	e3687510fc	tools: bpftool: use PERF_SAMPLE_TIME instead of reading the clock Ask the kernel to include sample time in each event instead of reading the clock. This is also more accurate because our clock reading was done when user space would dump the buffer, not when sample was produced. Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 01:40:51 +02:00
Prashant Bhole	cb9c28ef57	bpf: sync tools bpf.h uapi header Sync the header from include/uapi/linux/bpf.h which was updated to add fib lookup helper function. This fixes selftests/bpf build failure. Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 01:35:25 +02:00
Joe Stringer	91bc07c9e8	selftests/bpf: Fix bash reference in Makefile '\|& ...' is a bash 4.0+ construct which is not guaranteed to be available when using '$(shell ...)' in a Makefile. Fall back to the more portable '2>&1 \| ...'. Fixes the following warning during compilation: /bin/sh: 1: Syntax error: "&" unexpected Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 01:32:07 +02:00
Roi Dayan	f85900c3e1	net/mlx5e: Err if asked to offload TC match on frag being first The HW doesn't support matching on frag first/later, return error if we are asked to offload that. Fixes: `3f7d0eb42d` ("net/mlx5e: Offload TC matching on packets being IP fragments") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-05-10 16:10:13 -07:00
Adi Nissim	88d725bbb4	net/mlx5: E-Switch, Include VF RDMA stats in vport statistics The host side reporting of VF vport statistics didn't include the VF RDMA traffic. Fixes: `3b751a2a41` ("net/mlx5: E-Switch, Introduce get vf statistics") Signed-off-by: Adi Nissim <adin@mellanox.com> Reported-by: Ariel Almog <ariela@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-05-10 16:10:13 -07:00
Daniel Jurgens	1ef903bf79	net/mlx5: Free IRQs in shutdown path Some platforms require IRQs to be free'd in the shutdown path. Otherwise they will fail to be reallocated after a kexec. Fixes: `8812c24d28` ("net/mlx5: Add fast unload support in shutdown flow") Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-05-10 16:10:03 -07:00
David Howells	6b47fe1d1c	rxrpc: Trace UDP transmission failure Add a tracepoint to log transmission failure from the UDP transport socket being used by AF_RXRPC. Signed-off-by: David Howells <dhowells@redhat.com>	2018-05-10 23:26:01 +01:00
David Howells	494337c918	rxrpc: Add a tracepoint to log ICMP/ICMP6 and error messages Add a tracepoint to log received ICMP/ICMP6 events and other error messages. Signed-off-by: David Howells <dhowells@redhat.com>	2018-05-10 23:26:01 +01:00
David Howells	93864fc3ff	rxrpc: Fix the min security level for kernel calls Fix the kernel call initiation to set the minimum security level for kernel initiated calls (such as from kAFS) from the sockopt value. Fixes: `19ffa01c9c` ("rxrpc: Use structs to hold connection params and protocol info") Signed-off-by: David Howells <dhowells@redhat.com>	2018-05-10 23:26:01 +01:00
David Howells	f2aeed3a59	rxrpc: Fix error reception on AF_INET6 sockets AF_RXRPC tries to turn on IP_RECVERR and IP_MTU_DISCOVER on the UDP socket it just opened for communications with the outside world, regardless of the type of socket. Unfortunately, this doesn't work with an AF_INET6 socket. Fix this by turning on IPV6_RECVERR and IPV6_MTU_DISCOVER instead if the socket is of the AF_INET6 family. Without this, kAFS server and address rotation doesn't work correctly because the algorithm doesn't detect received network errors. Fixes: `75b54cb57c` ("rxrpc: Add IPv6 support") Signed-off-by: David Howells <dhowells@redhat.com>	2018-05-10 23:26:00 +01:00
David Howells	c54e43d752	rxrpc: Fix missing start of call timeout The expect_rx_by call timeout is supposed to be set when a call is started to indicate that we need to receive a packet by that point. This is currently put back every time we receive a packet, but it isn't started when we first send a packet. Without this, the call may wait forever if the server doesn't deign to reply. Fix this by setting the timeout upon a successful UDP sendmsg call for the first DATA packet. The timeout is initiated only for initial transmission and not for subsequent retries as we don't want the retry mechanism to extend the timeout indefinitely. Fixes: `a158bdd324` ("rxrpc: Fix call timeouts") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com>	2018-05-10 23:26:00 +01:00
Daniel Borkmann	ff1f56d987	Merge branch 'bpf-fib-lookup-helper' David Ahern says: ==================== Provide a helper for doing a FIB and neighbor lookup in the kernel tables from an XDP program. The helper provides a fastpath for forwarding packets. If the packet is a local delivery or for any reason is not a simple lookup and forward, the packet is expected to continue up the stack for full processing. The response from a FIB and neighbor lookup is either the egress index with the bpf_fib_lookup struct filled in with dmac and gateway or 0 meaning the packet should continue up the stack. In time we can revisit this to return the FIB lookup result errno if it is one of the special RTN_'s such as RTN_BLACKHOLE (-EINVAL) so that the XDP programs can do an early drop if desired. Patches 1-6 do some more refactoring to IPv6 with the end goal of extracting a FIB lookup function that aligns with fib_lookup for IPv4, basically returning a fib6_info without creating a dst based entry. Patch 7 adds lookup functions to the ipv6 stub. These are needed since bpf is built into the kernel and ipv6 may not be built or loaded. Patch 8 adds the bpf helper and 9 adds a sample program. v3 - remove ETH_ALEN and in6_addr from uapi header v2 - removed pkt_access from bpf_func_proto as noticed by Daniel - added check in that IPv6 forwarding is enabled - added DaveM's ack on patches 1-7 and 9 based on v1 response and fact that no changes were made to them in v2 v1 - updated commit messages and cover letter - added comment to sample program noting lack of verification on egress device supporting XDP RFC v2 - fixed use of foward helper from cls_act as noted by Daniel - in patch 1 rename fib6_lookup_1 as well for consistency ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 00:10:59 +02:00
David Ahern	fe616055f7	samples/bpf: Add example of ipv4 and ipv6 forwarding in XDP Simple example of fast-path forwarding. It has a serious flaw in not verifying the egress device index supports XDP forwarding. If the egress device does not, packets are dropped. Take this only as a simple example of fast-path forwarding. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 00:10:57 +02:00
David Ahern	87f5fc7e48	bpf: Provide helper to do forwarding lookups in kernel FIB table Provide a helper for doing a FIB and neighbor lookup in the kernel tables from an XDP program. The helper provides a fastpath for forwarding packets. If the packet is a local delivery or for any reason is not a simple lookup and forward, the packet continues up the stack. If it is to be forwarded, the forwarding can be done directly if the neighbor is already known. If the neighbor does not exist, the first few packets go up the stack for neighbor resolution. Once resolved, the xdp program provides the fast path. On successful lookup the nexthop dmac, current device smac and egress device index are returned. The API supports IPv4, IPv6 and MPLS protocols, but only IPv4 and IPv6 are implemented in this patch. The API includes layer 4 parameters if the XDP program chooses to do deep packet inspection to allow compare against ACLs implemented as FIB rules. Header rewrite is left to the XDP program. The lookup takes 2 flags: - BPF_FIB_LOOKUP_DIRECT to do a lookup that bypasses FIB rules and goes straight to the table associated with the device (expert setting for those looking to maximize throughput) - BPF_FIB_LOOKUP_OUTPUT to do a lookup from the egress perspective. Default is an ingress lookup. Initial performance numbers collected by Jesper, forwarded packets/sec: IPv4: full stack 1,947,969, XDP FIB lookup 7,074,156, XDP direct lookup 7,415,333; IPv6: full stack 1,728,000, XDP FIB lookup 6,165,504, XDP direct lookup 7,262,720. These numbers are single CPU core forwarding on a Broadwell E5-1650 v4 @ 3.60GHz. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 00:10:57 +02:00
David Ahern	65a2022e89	net/ipv6: Add fib lookup stubs for use in bpf helper Add stubs to retrieve a handle to an IPv6 FIB table, fib6_get_table, a stub to do a lookup in a specific table, fib6_table_lookup, and a stub for a full route lookup. The stubs are needed for core bpf code to handle the case when the IPv6 module is not builtin. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 00:10:57 +02:00
David Ahern	d4bea421f7	net/ipv6: Update fib6 tracepoint to take fib6_info Similar to IPv4, IPv6 should use the FIB lookup result in the tracepoint. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 00:10:57 +02:00
David Ahern	138118ec96	net/ipv6: Add fib6_lookup Add IPv6 equivalent to fib_lookup. Does a fib lookup, including rules, but returns a FIB entry, fib6_info, rather than a dst based rt6_info. fib6_lookup is anywhere from 140% (MULTIPLE_TABLES config disabled) to 60% faster than any of the dst based lookup methods (without custom rules) and 25% faster with custom rules (e.g., l3mdev rule). Since the lookup function has a completely different signature, fib6_rule_action is split into 2 paths: the existing one is renamed __fib6_rule_action and a new one for the fib6_info path is added. fib6_rule_action decides which to call based on the lookup_ptr. If it is fib6_table_lookup then the new path is taken. Caller must hold rcu lock as no reference is taken on the returned fib entry. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 00:10:56 +02:00
David Ahern	cc065a9eb9	net/ipv6: Refactor fib6_rule_action Move source address lookup from fib6_rule_action to a helper. It will be used in a later patch by a second variant for fib6_rule_action. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 00:10:56 +02:00
David Ahern	1d053da910	net/ipv6: Extract table lookup from ip6_pol_route ip6_pol_route is used for ingress and egress FIB lookups. Refactor it moving the table lookup into a separate fib6_table_lookup that can be invoked separately and export the new function. ip6_pol_route now calls fib6_table_lookup and uses the result to generate a dst based rt6_info. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 00:10:56 +02:00
David Ahern	3b290a31bb	net/ipv6: Rename rt6_multipath_select Rename rt6_multipath_select to fib6_multipath_select and export it. A later patch wants access to it similar to IPv4's fib_select_path. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 00:10:56 +02:00
David Ahern	6454743bc1	net/ipv6: Rename fib6_lookup to fib6_node_lookup Rename fib6_lookup to fib6_node_lookup to better reflect what it returns. The fib6_lookup name will be used in a later patch for an IPv6 equivalent to IPv4's fib_lookup. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 00:10:56 +02:00
Wang YanQing	68625b7631	bpf, doc: clarification for the meaning of 'id' For me, as a reader whose mother language isn't English, the old wording makes the meaning a little difficult to catch; this patch rewords the subsection to make it clearer. This patch also adds blank lines as separators at two places to improve readability. Signed-off-by: Wang YanQing <udknight@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-11 00:07:14 +02:00
David S. Miller	ca3943c4aa	Merge tag 'linux-can-fixes-for-4.17-20180510' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== this is a pull request for net/master consisting of 2 patches. Both patches are from Lukas Wunner and fix two problems found in the hi311x CAN driver under high load situations. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-10 17:57:11 -04:00
Colin Ian King	2fdae0349f	qed: fix spelling mistake: "taskelt" -> "tasklet" Trivial fix to spelling mistake in DP_VERBOSE message text Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-10 17:55:55 -04:00

754311 Commits