OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Ido Schimmel	fc922bb0dd	mlxsw: spectrum_router: Use one LPM tree for all virtual routers The number of LPM trees available for lookup is much smaller than the number of virtual routers, which are used to implement VRFs. In addition, an LPM tree can only be used by one protocol - either IPv4 or IPv6. Therefore, in order to increase the number of supported virtual routers to the maximum we need to be able to share LPM trees across virtual routers instead of trying to find an optimized tree for each. Do that by allocating one LPM tree for each protocol, but make sure it will only include prefixes that are actually used, so as to not perform unnecessary lookups. Since changing the structure of a bound tree isn't recommended, whenever a new tree is required, it's first created and then bound to each virtual router, replacing the old one. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-14 11:14:03 -07:00
Ido Schimmel	0adb214ba2	mlxsw: spectrum_router: Pass argument explicitly Instead of relying on the LPM tree to be assigned to the virtual router before binding the two, let's pass it explicitly. This will later allow us to return upon binding error instead of having to perform a rollback of the assignment. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-14 11:14:03 -07:00
Ido Schimmel	cc70267008	mlxsw: spectrum_router: Return void from deletion functions There is no point in returning a value from function whose return value is never checked. Even if the return value was checked, there wouldn't be anything to do about it, as these functions are either called from error or deletion paths. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-14 11:14:03 -07:00
Ido Schimmel	65e65ec137	mlxsw: spectrum_router: Don't ignore IPv6 notifications We now have all the necessary IPv6 infrastructure in place, so stop ignoring these notifications. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 15:36:01 -07:00
Ido Schimmel	f36f5ac677	mlxsw: spectrum_router: Abort on source-specific routes Without resorting to ACLs, the device performs route lookup solely based on the destination IP address. In case source-specific routing is needed, an error is returned and the abort mechanism is activated, thus allowing the kernel to take over forwarding decisions. Instead of aborting, we can trap specific destination prefixes where source-specific routes are present, but this will result in a lot more code that is unlikely to ever be used. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 15:36:01 -07:00
Ido Schimmel	0a7fd1ac2a	mlxsw: spectrum_router: Add support for route replace In case we got a replace event, then the replaced route must exist. If the route isn't capable of multipath, then replace first matching non-multipath capable route. If the route is capable of multipath and matching multipath capable route is found, then replace it. Otherwise, replace first matching non-multipath capable route. The new route is inserted before the replaced one. In case the replaced route is currently offloaded, then it's overwritten in the device's table by the new route and later deleted, thus not impacting routed traffic. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 15:36:01 -07:00
Ido Schimmel	428b851f56	mlxsw: spectrum_router: Add support for IPv6 routes addition / deletion Allow directly connected and remote unicast IPv6 routes to be programmed to the device's tables. As with IPv4, identical routes - sharing the same destination prefix - are ordered in a FIB node according to their table ID and then the metric. While the kernel doesn't share the same trie for the local and main table, this does happen in the device, so ordering according to table ID is needed. Since individual nexthops can be added and deleted in IPv6, each FIB entry stores a linked list of the rt6_info structs it represents. Upon the addition or deletion of a nexthop, a new nexthop group is allocated according to the new configuration and the old one is destroyed. Identical groups aren't currently consolidated, but will be in a follow-up patchset. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 15:36:00 -07:00
Ido Schimmel	583419fdf2	mlxsw: spectrum_router: Sanitize IPv6 FIB rules We only allow FIB offload in the presence of default rules or an l3mdev rule. In a similar fashion to IPv4 FIB rules, sanitize IPv6 rules. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 15:36:00 -07:00
Ido Schimmel	66a5763ac1	mlxsw: spectrum_router: Demultiplex FIB event based on family The FIB notification block currently only handles IPv4 events, but we want to start handling IPv6 events soon, so lay the groundwork now. Do that by preparing the work item and process it according to the notified address family. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 15:36:00 -07:00
Ido Schimmel	64e5e8252d	mlxsw: spectrum_router: Ignore address families other than IPv4 We're about to add IPv6 notifications in the FIB notification chain, but the driver currently doesn't support these, so ignore them. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 15:35:59 -07:00
Ido Schimmel	04b1d4e50e	net: core: Make the FIB notification chain generic The FIB notification chain is currently solely used by IPv4 code. However, we're going to introduce IPv6 FIB offload support, which requires these notifications as well. As explained in commit `c3852ef7f2` ("ipv4: fib: Replay events when registering FIB notifier"), upon registration to the chain, the callee receives a full dump of the FIB tables and rules by traversing all the net namespaces. The integrity of the dump is ensured by a per-namespace sequence counter that is incremented whenever a change to the tables or rules occurs. In order to allow more address families to use the chain, each family is expected to register its fib_notifier_ops in its pernet init. These operations allow the common code to read the family's sequence counter as well as dump its tables and rules in the given net namespace. Additionally, a 'family' parameter is added to sent notifications, so that listeners could distinguish between the different families. Implement the common code that allows listeners to register to the chain and for address families to register their fib_notifier_ops. Subsequent patches will implement these operations in IPv6. In the future, ipmr and ip6mr will be extended to provide these notifications as well. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-03 15:35:59 -07:00
Ido Schimmel	77d964e66c	mlxsw: spectrum_router: Refresh offload indication upon group refresh Now that we provide offload indication using the nexthop's flags we must refresh the offload indication whenever the offload state within the group changes. This didn't matter until now, as offload indication was provided using the FIB info flags and multipath routes were marked as offloaded as long as one of the nexthops was offloaded. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-02 17:00:24 -07:00
Ido Schimmel	1353ee7073	mlxsw: spectrum_router: Don't check state when refreshing offload indication Previous patch removed the reliance on the counter in the FIB info to set the offload indication, so we no longer need to keep an offload state on each FIB entry and can just set or unset the RTNH_F_OFFLOAD flag in each nexthop. This is also necessary because we're going to need to refresh the offload indication whenever the nexthop group associated with the FIB entry is refreshed. Current check would prevent us from marking a newly resolved nexthop as offloaded if the FIB entry is already marked as offloaded. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-02 17:00:23 -07:00
Ido Schimmel	3984d1a89f	mlxsw: spectrum_router: Provide offload indication using nexthop flags In a similar fashion to previous patch, use the nexthop flags to provide offload indication instead of the FIB info's flags. In case a nexthop in a multipath route can't be offloaded (gateway's MAC can't be resolved, for example), then its offload flag isn't set. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-02 17:00:23 -07:00
David S. Miller	29fda25a2d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Two minor conflicts in virtio_net driver (bug fix overlapping addition of a helper) and MAINTAINERS (new driver edit overlapping revamp of PHY entry). Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-01 10:07:50 -07:00
Petr Machata	213666a356	mlxsw: spectrum_router: Simplify a piece of code Express the same logic more succinctly. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 14:44:33 -07:00
Petr Machata	56b8a9ed27	mlxsw: spectrum_router: Clarify a piece of code Prefer logical operator that expresses the intent to bitwise one that happens to give the same result. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 14:44:33 -07:00
Petr Machata	f1b1f273ae	mlxsw: spectrum_router: Simplify a piece of code Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 14:44:33 -07:00
Petr Machata	8de3c17819	mlxsw: spectrum_router: Fix a typo Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-31 14:44:33 -07:00
Ido Schimmel	b5f3e0d430	mlxsw: spectrum_router: Fix build when IPv6 isn't enabled When IPv6 isn't enabled the following error is generated: ERROR: "nd_tbl" [drivers/net/ethernet/mellanox/mlxsw/mlxsw_spectrum.ko] undefined! Fix it by replacing 'arp_tbl' and 'nd_tbl' with 'tbl->family' wherever possible and reference 'nd_tbl' only when IPV6 is enabled. Fixes: `d5eb89cf68` ("mlxsw: spectrum_router: Reflect IPv6 neighbours to the device") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-24 17:15:17 -07:00
Ido Schimmel	4a3c67a6e7	mlxsw: spectrum_router: Don't batch neighbour deletion Current firmware supported by the driver doesn't support batch deletion of IPv6 neighbours on a given router interface (RIF). Until a new version that supports this functionality is made available, delete neighbours one by one. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-24 16:16:20 -07:00
Ido Schimmel	1819ae3dfe	mlxsw: spectrum_router: Don't offload routes next in list Each FIB node holds a linked list of routes sharing the same prefix and length. In the case of IPv4 it's ordered according to table ID, metric and TOS and only the first route in the list is actually programmed to the device. In case a gatewayed route is added somewhere in the list, then after its nexthop group will be refreshed and become valid (due to the resolution of its gateway), it'll mistakenly overwrite the existing entry. Example: 192.168.200.0/24 dev enp3s0np3 scope link metric 1000 offload 192.168.200.0/24 via 192.168.100.1 dev enp3s0np3 metric 1000 offload Both routes are marked as offloaded despite the fact only the first one should actually be present in the device's table. When refreshing the nexthop group, don't write the route to the device's table unless it's the first in its node. Fixes: `9aecce1c7d` ("mlxsw: spectrum_router: Correctly handle identical routes") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-24 14:14:48 -07:00
Ido Schimmel	7dcc18adad	mlxsw: spectrum_router: Update prefix count for IPv6 The number of possible prefix lengths for IPv6 is 129 and not 128. Fixes following warning from UBSAN when /128 routes are offloaded: UBSAN: Undefined behaviour in drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:2510:27 index 128 is out of range for type 'long unsigned int [128]' Fixes: `5e9c16cc83` ("mlxsw: spectrum_router: Implement private fib") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:34 -07:00
Ido Schimmel	80c238f91b	mlxsw: spectrum_router: Rename functions to add / delete a FIB entry These functions aren't specific to IPv4 and can be re-used for IPv6. Drop the '4' designation from their name. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:34 -07:00
Ido Schimmel	9efbee6fea	mlxsw: spectrum_router: Drop unnecessary parameter Functions that take as argument a FIB entry don't need to take FIB node as well, as it can be extracted from the entry. Remove unnecessary FIB node parameter. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:34 -07:00
Ido Schimmel	0e6ea2a4ea	mlxsw: spectrum_router: Mark IPv4 specific function accordingly The functions to create and destroy a nexthop group are IPv4 specific and should be renamed accordingly, so that they won't be confused with the IPv6 specific functions in follow-up patches. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:33 -07:00
Ido Schimmel	4f1c7f1f2e	mlxsw: spectrum_router: Create IPv4 specific entry struct Some of the parameters stored in the FIB entry structure are specific to IPv4 and therefore better placed in an IPv4 specific structure. Create an IPv4 specific structure that encapsulates the common FIB entry structure and contains IPv4 specific parameters. In a follow-up patchset an IPv6 specific structure will be introduced. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:33 -07:00
Ido Schimmel	bc65a8a4f4	mlxsw: spectrum_router: Set abort trap for IPv6 When we fail to insert a route we invoke the abort mechanism which flushes all the tables and inserts a default route in each, so that all packets incoming to the router will be trapped to the CPU. Upon abort, add an IPv6 default route to the IPv6 tables. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:33 -07:00
Ido Schimmel	9dbf4d76d0	mlxsw: spectrum_router: Allow IPv6 routes to be programmed Take advantage of previous patch and allow the RALUE register to be called with IPv6 routes. In order to re-use as much code as possible between IPv4 and IPv6, only the lowest-level function that actually does the register packing is demuxed based on the passed protocol. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:33 -07:00
Ido Schimmel	a3d9bc506d	mlxsw: spectrum_router: Extend virtual routers with IPv6 support A Virtual Router (VR) is an entity which corresponds to a VRF and performs FIB lookup in an LPM tree according to the {VR, IP Proto} -> Tree binding. Extend the virtual router data structure towards IPv6 FIB offload. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:33 -07:00
Ido Schimmel	731ea1ca42	mlxsw: spectrum_router: Make FIB node retrieval family agnostic A FIB node is an entity which stores routes sharing the same prefix and length. The data structure itself is already family agnostic, but we make some of its operations agnostic as well and thus re-use them for IPv6 offload. Instead of passing an IPv4-specific structure to fib4_node_get(), pass general routing parameters and rename the function accordingly. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:33 -07:00
Ido Schimmel	160e22aa26	mlxsw: spectrum_router: Don't create FIB node during lookup When looking up a FIB entry we shouldn't create the FIB node where it's supposed to be linked in case the node doesn't already exist. Instead, lookup the node and fail if it doesn't exist. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:33 -07:00
Ido Schimmel	58adf2c480	mlxsw: spectrum_router: Don't assume neighbour type Thankfully, the neighbour subsystem is agnostic to the upper protocol and used by both IPv4 and IPv6. By removing assumptions regarding the neighbour type we can thus re-use much of the neighbour-related code for both IPv4 and IPv6. For each nexthop, store its gateway IP and for nexthop group store the neighbour table used by its nexthops. Use this information throughout the code and remove assumption about the neighbour type. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:33 -07:00
Arkadi Sharshevsky	a6c9b5d199	mlxsw: spectrum_router: Set activity interval according to both neighbour tables The neighbours' activity is currently dumped according to the ARP table's DELAY_PROBE time, but with the introduction of IPv6 offload we should set the interval according to the minimum between the ARP and ndisc tables. Signed-off-by: Arkadi Sharshvesky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:33 -07:00
Arkadi Sharshevsky	60f040ca11	mlxsw: spectrum_router: Periodically dump active IPv6 neighbours In addition to IPv4, periodically dump IPv6 neighbours and update the kernel about them. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:33 -07:00
Arkadi Sharshevsky	d5eb89cf68	mlxsw: spectrum_router: Reflect IPv6 neighbours to the device As with IPv4, listen to NEIGH_UPDATE events from the ndisc table and program relevant neighbours to the device's neighbour table. Note that neighbours with a link-local IP address aren't programmed, as packets with a link-local destination IP are trapped after LPM lookup and never reach the neighbour table. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:32 -07:00
Arkadi Sharshevsky	5ea1237f94	mlxsw: spectrum_router: Configure RIFs based on IPv6 addresses When a netdev is configured with an IP address a router interface (RIF) should be configured for it in the device. Allow configuration of RIFs based on IPv6 address notifications as well as IPv4. Note that the RIF exists as long as an IP address is configured on the netdev, regardless of the address family. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:32 -07:00
Ido Schimmel	0d284818af	mlxsw: spectrum_router: Flood unregistered multicast packets to router Up until now we only flooded broadcast packets to the router when an L3 interface was configured on top of a bridge. However, IPv6 Neighbour Discovery packets are trapped to the CPU inside the router and these can be sent with a multicast address. Flood unregistered multicast packets to the router port, so that relevant packets could be trapped there. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:32 -07:00
Arkadi Sharshevsky	e29237e7bb	mlxsw: spectrum_router: Enable IPv6 router Before we add IPv6 constructs like traps and router interfaces, we first need to enable IPv6 routing in the device. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-18 11:57:32 -07:00
Ido Schimmel	7387dbbcdb	mlxsw: spectrum_router: Fix use-after-free in route replace While working on IPv6 route replace I realized we can have a use-after-free in IPv4 in case the replaced route is offloaded and the only one using its FIB info. The problem is that fib_table_insert() drops the reference on the FIB info of the replaced routes which is eventually freed via call_rcu(). Since the driver doesn't hold a reference on this FIB info it can cause a use-after-free when it tries to clear the RTNH_F_OFFLOAD flag stored in fi->fib_flags. After running the following commands in a loop for enough time with a KASAN enabled kernel I finally got the below trace. $ ip route add 192.168.50.0/24 via 192.168.200.1 dev enp3s0np3 $ ip route replace 192.168.50.0/24 dev enp3s0np5 $ ip route del 192.168.50.0/24 dev enp3s0np5 BUG: KASAN: use-after-free in mlxsw_sp_fib_entry_offload_unset+0xa7/0x120 [mlxsw_spectrum] Read of size 4 at addr ffff8803717d9820 by task kworker/u4:2/55 [...] ? mlxsw_sp_fib_entry_offload_unset+0xa7/0x120 [mlxsw_spectrum] ? mlxsw_sp_fib_entry_offload_unset+0xa7/0x120 [mlxsw_spectrum] ? mlxsw_sp_router_neighs_update_work+0x1cd0/0x1ce0 [mlxsw_spectrum] ? mlxsw_sp_fib_entry_offload_unset+0xa7/0x120 [mlxsw_spectrum] __asan_load4+0x61/0x80 mlxsw_sp_fib_entry_offload_unset+0xa7/0x120 [mlxsw_spectrum] mlxsw_sp_fib_entry_offload_refresh+0xb6/0x370 [mlxsw_spectrum] mlxsw_sp_router_fib_event_work+0xd1c/0x2780 [mlxsw_spectrum] [...] Freed by task 5131: save_stack_trace+0x16/0x20 save_stack+0x46/0xd0 kasan_slab_free+0x70/0xc0 kfree+0x144/0x570 free_fib_info_rcu+0x2e7/0x410 rcu_process_callbacks+0x4f8/0xe30 __do_softirq+0x1d3/0x9e2 Fix this by taking a reference on the FIB info when creating the nexthop group it represents and drop it when the group is destroyed. Fixes: `599cf8f95f` ("mlxsw: spectrum_router: Add support for route replace") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-12 08:15:52 -07:00
Ido Schimmel	a4e75b76b2	mlxsw: spectrum_router: Add missing rollback With this patch the error path of mlxsw_sp_nexthop_init() is symmetric with mlxsw_sp_nexthop_fini(). Noticed during code review. Fixes: `a8c9701427` ("mlxsw: spectrum_router: Refactor nexthop init routine") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-12 08:15:51 -07:00
David S. Miller	b079115937	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net A set of overlapping changes in macvlan and the rocker driver, nothing serious. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-30 12:43:08 -04:00
Ido Schimmel	6b27c8adf2	mlxsw: spectrum_router: Fix NULL pointer dereference In case a VLAN device is enslaved to a bridge we shouldn't create a router interface (RIF) for it when it's configured with an IP address. This is already handled by the driver for other types of netdevs, such as physical ports and LAG devices. If this IP address is then removed and the interface is subsequently unlinked from the bridge, a NULL pointer dereference can happen, as the original 802.1d FID was replaced with an rFID which was then deleted. To reproduce: $ ip link set dev enp3s0np9 up $ ip link add name enp3s0np9.111 link enp3s0np9 type vlan id 111 $ ip link set dev enp3s0np9.111 up $ ip link add name br0 type bridge $ ip link set dev br0 up $ ip link set enp3s0np9.111 master br0 $ ip address add dev enp3s0np9.111 192.168.0.1/24 $ ip address del dev enp3s0np9.111 192.168.0.1/24 $ ip link set dev enp3s0np9.111 nomaster Fixes: `99724c18fc` ("mlxsw: spectrum: Introduce support for router interfaces") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Petr Machata <petrm@mellanox.com> Tested-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-29 12:59:48 -04:00
Ido Schimmel	d7a60306c6	mlxsw: spectrum_router: Mark only first LPM tree as reserved In new firmware versions (that we can now enforce via request_firmware()), only the first LPM tree is reserved and not the first two as in older versions. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-08 14:33:40 -04:00
Ido Schimmel	de5ed99e97	mlxsw: spectrum_router: Align RIF index allocation with existing code The way we usually allocate an index is by letting the allocation function return an error instead of an invalid index. Do the same for RIF index. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-04 23:49:49 -04:00
Ido Schimmel	e4f3c1c17b	mlxsw: spectrum_router: Implement common RIF core The mlxsw driver currently implements three types of RIFs. VLAN and FID RIFs for L3 interfaces on top of VLAN-aware and VLAN-unaware bridges (respectively) and Subport RIFs for all other L3 interfaces. All the RIF types follow a common configuration procedure, which only differs in the type-specific bits. The patch exploits this fact and consolidates the common code paths, thereby simplifying the code and making it more extensible. This work also prepares the driver for use with future ASICs, where the range of the Subport RIFs will be extended and their configuration modified accordingly. By merely implementing a new RIF operations and selecting it during initialization, the same driver could be re-used. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 15:18:49 -04:00
Ido Schimmel	a110748725	mlxsw: spectrum: Implement common FID core The device supports three types of FIDs. 802.1Q and 802.1D FIDs for VLAN-aware and VLAN-unaware bridges (respectively) and rFIDs to transport packets to the router block. The different users (e.g., bridge, router, ACLs) of the FIDs infrastructure need not know about the internal FIDs implementation and can therefore interact with it using a restricted set of exported functions. By encapsulating the entire FID logic and hiding it from the rest of the driver we get a code base that is much simpler and easier to work with and extend. For example, in the current Spectrum ASIC only 802.1D FIDs can be assigned a VNI, but future ASICs will also support 802.1Q FIDs. With this patch in place, support for future ASICs can be easily added by implementing a new FID operations according to their capabilities. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 15:18:49 -04:00
Ido Schimmel	c9ec53f034	mlxsw: spectrum_router: Determine VR first when creating RIF All RIF types are associated with a virtual router (VR), so determine VR first when creating a RIF. That way, we can more easily integrate the common RIF core in the following patches. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 15:18:49 -04:00
Ido Schimmel	8e3482d6ad	mlxsw: spectrum_router: Flood packets to router after RIF creation If a packet ingress the router but can't be assigned an ingress RIF, it's dropped. Therefore, in the case of RIF configured on top of a bridge, it makes sense to start flooding broadcast packets to the router only after the RIF was created. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 15:18:48 -04:00
Ido Schimmel	1b8f09a05f	mlxsw: spectrum_router: Destroy RIF only based on its struct Now that all the information to create a RIF is contained within the RIF struct itself, we can also simplify the destruction logic. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 15:18:48 -04:00
Ido Schimmel	ab01ae9169	mlxsw: spectrum_router: Configure RIFs based on RIF struct All the information necessary for the configuration of RIFs can now be found in the RIF struct itself, so reduce the arguments list. This gets us one step closer to the common RIF core. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 15:18:48 -04:00
Ido Schimmel	4d93ceebf0	mlxsw: spectrum_router: Extend the RIF struct Currently, when a Subport RIF is configured, the LAG status and VLAN of the underlying port are read from the port itself. This is problematic, as we would like to have common code to configure all types of RIFs, which aren't necessarily bound to a port. Instead, embed the RIF in a struct specific to the Subport type, which contains all the necessary information. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 15:18:48 -04:00
Ido Schimmel	a13a594da0	mlxsw: spectrum_router: Allocate RIF prior to its configuration In the following patches the RIF's configuration function is going to expect a RIF struct with all the necessary information. Therefore, allocate the RIF just before it's configured to the device. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 15:18:47 -04:00
Ido Schimmel	caa3ddf8e3	mlxsw: spectrum_router: Allocate FID prior to RIF configuration The following patches are going to re-arrange the FID and RIF code, so that when the RIF is configured to the device based on the information present in the RIF struct (which points to a FID). For this reason, move the FID allocation to just before the RIF configuration. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 15:18:47 -04:00
Ido Schimmel	c57529e1d5	mlxsw: spectrum: Replace vPorts with Port-VLAN As explained in the cover letter, since the introduction of the bridge offload in the mlxsw driver, information related to the offloaded bridge and bridge ports was stored in the individual port struct, mlxsw_sp_port. This lead to a bloated struct storing both physical properties of the port (e.g., autoneg status) as well as logical properties of an upper bridge port (e.g., learning, mrouter indication). While this might work well for simple devices, it proved to be hard to extend when stacked devices were taken into account and more advanced use-cases (e.g., IGMP snooping) considered. This patch removes the excess information from the above struct and instead stores it in more appropriate structs that represent the bridge port, the bridge itself and a VLAN configured on the bridge port. The membership of a port in a bridge is denoted using the Port-VLAN struct, which points to the bridge port and also member in the bridge VLAN group of the VLAN it represents. This allows us to completely remove the vPort abstraction and consolidate many of the code paths relating to VLAN-aware and unaware bridges. Note that the FID / vFID code is currently duplicated, but this will soon go away when the common FID core will be introduced. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 15:18:47 -04:00
Ido Schimmel	ed9ddd3aad	mlxsw: spectrum: Don't create FIDs upon creation of VLAN uppers Up until now we used to create FIDs upon the creation of VLAN uppers on top of the VLAN-aware bridge. This was done so that in case a router interface (RIF) was configured on top of the bridge, the FID would already be there. Instead, simplify the code and only create the FID upon RIF creation. This is an intermediary step towards the introduction of the common FID core, in which this code would be completely removed. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 15:18:46 -04:00
Ido Schimmel	7cbecf245a	mlxsw: spectrum_router: Replace vPorts with Port-VLAN We're going to get rid of vPorts completely later in the patchset, but the router code is self-contained, so it's a good candidate to start the transition with. Convert all the functions that expects to operate on a vPort to operate on a Port-VLAN instead. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 15:18:45 -04:00
Ido Schimmel	ce95e15456	mlxsw: spectrum: Change signature of FID leave function When a vPort is destroyed, it leaves the FID it's currently mapped to (if any) and drops the reference. The FID's leave function expects to get the vPort as its argument, but this will have to change when the vPort model is retired. Change the function signature to expect a Port-VLAN struct instead and patch the call sites accordingly. The code introduced in this patch will be removed later in the patchset, but this intermediary step is required in order to ease the code review. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 15:18:45 -04:00
Ido Schimmel	4aafc368da	mlxsw: spectrum: Set port's mode according to FID mappings We currently transition the port to "Virtual mode" upon the creation of its first VLAN upper, as we need to classify incoming packets to a FID using {Port, VID} and not only the VID. However, it's more appropriate to transition the port to this mode when the {Port, VID} are actually mapped to a FID. Either during the enslavement of the VLAN upper to a VLAN-unaware bridge or the configuration of a router port. Do this change now in preparation for the introduction of the FID core, where this operation will be encapsulated. To prevent regressions, this patch also explicitly configures an OVS slave to "Virtual mode". Otherwise, a packet that didn't hit an ACL rule could be classified to an existing FID based on a global VID-to-FID mapping, thus not incurring a FID mis-classification, which would otherwise trap the packet to the CPU to be processed by the OVS daemon. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-26 15:18:45 -04:00
Ido Schimmel	03ea01e9db	mlxsw: spectrum_router: Adjust RIF configuration for new firmware versions In new firmware versions, when configuring a {Port, VID} as a router interface, the driver is responsible for enabling the STP filter and disabling learning. Otherwise, packets are discarded. This change doesn't break existing firmware versions, but is required for newer firmware versions. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-25 17:46:17 -04:00
David S. Miller	c6cd850d65	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2017-05-18 16:11:32 -04:00
Arkadi Sharshevsky	6b1206bbbc	mlxsw: spectrum_router: Fix rif counter freeing routine During rif counter freeing the counter index can be invalid. Add check of validity before freeing the counter. Fixes: `e0c0afd8aa` ("mlxsw: spectrum: Support for counters on router interfaces") Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-18 11:04:00 -04:00
Ido Schimmel	348b8fc3cf	mlxsw: spectrum_router: Initialize RIFs in a separate function The router interfaces (RIFs) array is currently initialized together with the general router configuration. However, in a follow-up patchset we're going to introduce a common RIF core that will require us to initialize more RIF constructs, so move the RIF initialization to its own function. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 14:06:54 -04:00
Ido Schimmel	7e39d1153d	mlxsw: spectrum_router: Move FIB notification block to router struct The FIB notification block logically belongs inside the router specific struct, so move it there. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 14:06:54 -04:00
Ido Schimmel	5f9efffbdb	mlxsw: spectrum_router: Move RIFs array to its rightful place The router interfaces (RIFs) array is of no interest to code outside the routing realm, so declare it inside the router specific struct instead of the chip-wide one. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 14:06:54 -04:00
Ido Schimmel	5f6935c6a4	mlxsw: spectrum_switchdev: Reduce scope of bridge struct Some attributes in the global chip struct are only relevant for bridge operation, so encapsulate them in their own struct that isn't exposed to non-bridge code. This will also help us later, when we add more bridge-specific attributes. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 14:06:54 -04:00
Ido Schimmel	9011b677e7	mlxsw: spectrum_router: Reduce scope of router struct In a similar fashion to previous patch, the router structure ('mlxsw_sp_router') doesn't need to be accessible to anyone, but the router code located at spectrum_router.c Make this apparent and reduce its scope by defining it there. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-17 14:06:54 -04:00
Ido Schimmel	b1e455260c	mlxsw: spectrum_router: Simplify VRF enslavement When a netdev is enslaved to a VRF master, its router interface (RIF) needs to be destroyed (if exists) and a new one created using the corresponding virtual router (VR). From the driver's perspective, the above is equivalent to an inetaddr event sent for this netdev. Therefore, when a port netdev (or its uppers) are enslaved to a VRF master, call the same function that would've been called had a NETDEV_UP was sent for this netdev in the inetaddr notification chain. This patch also fixes a bug when a LAG netdev with an existing RIF is enslaved to a VRF. Before this patch, each LAG port would drop the reference on the RIF, but would re-join the same one (in the wrong VR) soon after. With this patch, the corresponding RIF is first destroyed and a new one is created using the correct VR. Fixes: `7179eb5acd` ("mlxsw: spectrum_router: Add support for VRFs") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-01 11:47:58 -04:00
Jiri Pirko	2b94e58df5	mlxsw: spectrum: Allow ports to work under OVS master From now on, a port can become a slave of OVS master. All vlans are enabled, STP state is set to "forwarding". It is up to the OVS userspace daemon to setup the flows either in kernel or in HW using TC flower offload. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-20 15:32:31 -04:00
Arkadi Sharshevsky	fd1b9d4192	mlxsw: spectrum_router: Add rif helper functions Add rif helper function to access the rif index and rif devices ifindex. This functions will be used by dpipe in order to dump the rif table. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-28 17:11:55 -07:00
Arkadi Sharshevsky	e0c0afd8aa	mlxsw: spectrum: Support for counters on router interfaces Add support for counter allocation on router interfaces. The allocation depends on the counter state of relevant table. In case the counting is disabled or no counters left the counter index will be set as invalid. Also a counter pool for router allocation is added. Signed-off-by: Arakdi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-28 17:11:55 -07:00
Arkadi Sharshevsky	1312444374	mlxsw: spectrum_kvdl: Cosmetic kvdl allocator API change Currently the return allocated index and err value are multiplexed. This patch changes the API to decouple the ret value from the allocated index. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-25 19:56:15 -07:00
Ido Schimmel	5ec2ee7dd2	mlxsw: Query maximum number of ports from firmware We currently hard code the maximum number of ports in the driver, but this may change in future devices, so query it from the firmware instead. Fallback to a maximum of 64 ports in case this number can't be queried. This should only happen in SwitchX-2 for which this number is correct. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 13:53:28 -07:00
Ido Schimmel	8494ab06e0	mlxsw: spectrum_router: Query number of LPM trees from firmware Instead of hard coding the number of LPM trees in the driver, query it from the firmware, as it may change in future devices. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-24 13:53:28 -07:00
Arkadi Sharshevsky	bf95233e20	mlxsw: spectrum: Cosmetic naming change Currently the struct representing router interface "mlxsw_sp_rif" is reffered as "r" in various places in the driver. Furthermore it contains a member which specify the index which is called "rif". This patch change "r" to "rif" and "rif" to "rif_index". Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-21 17:16:10 -07:00
Ido Schimmel	c7f6e6658b	mlxsw: spectrum_router: Don't abort on l3mdev rules Now that port netdevs can be enslaved to a VRF master we need to make sure the device's routing tables won't be flushed upon the insertion of a l3mdev rule. Note that we assume the notified l3mdev rule is a simple rule as used by the VRF master. We don't check for the presence of other selectors such as 'iif' and 'oif'. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-16 10:18:35 -07:00
Ido Schimmel	3d70e458be	mlxsw: spectrum_router: Add support for VRFs on top of bridges In a similar fashion to the previous patch, allow bridges and VLAN devices on top of bridges to be enslaved to a VRF master device. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-16 10:18:35 -07:00
Ido Schimmel	7179eb5acd	mlxsw: spectrum_router: Add support for VRFs Allow port netdevs, LAG and VLAN devices stacked on top of these to be enslaved to a VRF master device. Upon enslavement, create a router interface (RIF) for the enslaved netdev and associate it with a virtual router (VR) based on the VRF's table ID. If a RIF already exists for the netdev (f.e., due to the existence of an IP address), then it's deleted and a new one is created with the appropriate VR binding. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-16 10:18:35 -07:00
Ido Schimmel	9db032bb1e	mlxsw: spectrum_router: Don't destroy RIF if L3 slave We usually destroy the netdev's router interface (RIF) when the last IP address is removed from it. However, we shouldn't do that if it's enslaved to an L3 master device. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-16 10:18:34 -07:00
Ido Schimmel	57837885e3	mlxsw: spectrum_router: Associate RIFs with correct VR When a router interface (RIF) is created due to a netdev being enslaved to a VRF master, then it should be associated with the appropriate virtual router (VR) and not the default one. If netdev is a VRF slave, lookup the VR based on the VRF's table ID. Otherwise default to the MAIN table. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-16 10:18:34 -07:00
Ido Schimmel	5d7bfd1419	ipv4: fib_rules: Dump FIB rules when registering FIB notifier In commit `c3852ef7f2` ("ipv4: fib: Replay events when registering FIB notifier") we dumped the FIB tables and replayed the events to the passed notification block. However, we merely sent a RULE_ADD notification in case custom rules were in use. As explained in previous patches, this approach won't work anymore. Instead, we should notify the caller about all the FIB rules and let it act accordingly. Upon registration to the FIB notification chain, replay a RULE_ADD notification for each programmed FIB rule, custom or not. The integrity of the dump is ensured by the mechanism introduced in the above mentioned commit. Prevent regressions by making sure current listeners correctly sanitize the notified rules. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-16 10:18:34 -07:00
Ido Schimmel	b5d90e6d6b	mlxsw: spectrum_router: Make abort mechanism VR-aware When the abort mechanism is invoked it binds the first virtual router (VR) to an LPM tree and inserts a default route to direct packets to the CPU. With VRFs, we can have router interfaces (RIFs) bound to multiple VRs, so we need to make sure packets are trapped from all VRs and not just the first one. Upon abort invocation, bind all active VRs to the same LPM tree and insert a default route in each. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 09:36:06 -08:00
Ido Schimmel	6913229eea	mlxsw: spectrum_router: Explicitly Associate RIFs with VRs Up until now we implicitly associated all the router interfaces (RIFs) with the first virtual router (VR). This must be changed in order to enable VRF offload. Otherwise, a packet received via a VRF slave would do a FIB lookup in the same table used by other VRFs. Instead, bind the RIF to a VR according to the table where FIB lookup should be performed for packets received via the RIF. Currently, we only care about the MAIN and LOCAL tables (which we squash together). Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 09:36:06 -08:00
Ido Schimmel	76610ebbde	mlxsw: spectrum_router: Refactor virtual router handling A virtual router (VR) is an entity within the device to which routing tables and interfaces can be bound to. It can be used to implement VRFs. In the initial implementation we associated the VR with a specific protocol (e.g., IPv4) and an LPM tree. However, this isn't really accurate, as the same VR can be used for both IPv4 and IPv6 traffic, by binding a different LPM tree to a {VR, Proto} pair. This patch aims to restructure the VR code according to the above logic, so that VRs are more accurately represented by the driver's data structures. The main motivation behind this change is to prepare the driver for VRF offload. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 09:36:06 -08:00
Ido Schimmel	382dbb4014	mlxsw: spectrum_router: Simplify LPM tree allocation When looking for a new LPM tree we should always consider all the unused trees. It doesn't matter if the new tree is required due to changes in currently used prefixes inside an existing routing table or because a route was inserted into an empty table. Both cases are functionally identical and therefore should be treated the same. When looking for a new LPM tree, consider all unused trees and don't reserve trees for specific cases. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 09:36:06 -08:00
Ido Schimmel	4724ba561a	mlxsw: spectrum_router: Place RIF related code with router code The inetaddr notification block is currently implemented in the main driver file, but this isn't really appropriate, as it mainly creates and destroys router interfaces (RIFs) which belong with the rest of the router code. This will become even more apparent later on when we'll need to bind these RIFs to virtual routers according to the VRF's table. Structure the driver better and prevent unnecessary function exports by moving the RIF related code with the rest of the router code. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 09:36:06 -08:00
Ido Schimmel	97989ee0f5	mlxsw: spectrum_router: Allow more route types to be programmed Allow 'unreachable', 'blackhole' and 'prohibit' route types to be programmed into the device by sending any packet hitting them to the CPU. This is needed so that users will be able to program a default route into the VRF's table, thereby preventing lookup from leaking to other tables. Audit the code paths to make sure we don't rely on the presence of a nexthop netdev, as it doesn't exist for above mentioned route types. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 09:36:06 -08:00
Ido Schimmel	f7df4923fa	mlxsw: spectrum_router: Avoid potential packets loss When the structure of the LPM tree changes (f.e., due to the addition of a new prefix), we unbind the old tree and then bind the new one. This may result in temporary packet loss. Instead, overwrite the old binding with the new one. Fixes: `6b75c4807d` ("mlxsw: spectrum_router: Add virtual router management") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-01 09:50:58 -08:00
Ido Schimmel	599cf8f95f	mlxsw: spectrum_router: Add support for route replace Upon the reception of an ENTRY_REPLACE notification, resolve the FIB node corresponding to the prefix and length and insert the new route before the first matching entry. Since the notification also signals the deletion of the replaced route, delete it from the driver's cache. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-10 11:32:14 -05:00
Ido Schimmel	4283bce5f8	mlxsw: spectrum_router: Add support for route append When a new route is appended, it's placed after existing routes sharing the same parameters (prefix, length, table ID, TOS and priority). While the device supports only one route with the same prefix and length in a single table, it's important to correctly place the appended route in the driver's cache, as when a route is deleted the next one is programmed into the device. Following the reception of an ENTRY_APPEND notification, resolve the FIB node corresponding to the prefix and length and correctly place the new entry in its entry list. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-10 11:32:13 -05:00
Ido Schimmel	9aecce1c7d	mlxsw: spectrum_router: Correctly handle identical routes In the device, routes are indexed in a routing table based on the prefix and its length. This is in contrast to the kernel's FIB where several FIB aliases can exist with these parameters being identical. In such cases, the routes will be sorted by table ID (LOCAL first, then MAIN), TOS and finally priority (metric). During lookup, these routes will be evaluated in order. In case the packet's TOS field is non-zero and a FIB alias with a matching TOS is found, then it's selected. Otherwise, the lookup defaults to the route with TOS 0 (if it exists). However, if the requested scope is narrower than the one found, then the lookup continues. To best reflect the kernel's datapath we should take the above into account. Given a prefix and its length, the reflected route will always be the first one in the FIB alias list. However, if the route has a non-zero TOS then its action will be converted to trap instead of forward, since we currently don't support TOS-based routing. If this turns out to be a real issue, we can add support for that using policy-based switching. The route's scope can be effectively ignored as any packet being routed by the device would've been looked-up using the widest scope (UNIVERSE). To achieve that we need to do two changes. Firstly, we need to create another struct (FIB node) that will hold the list of FIB entries sharing the same prefix and length. This struct will be hashed using these two parameters. Secondly, we need to change the route reflection to match the above logic, so that the first FIB entry in the list will be programmed into the device while the rest will remain in the driver's cache in case of subsequent changes. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-10 11:32:13 -05:00
Ido Schimmel	df6dd79be8	mlxsw: spectrum_router: Don't reflect LINKDOWN nexthops The kernel resolves the nexthops for a given route using FIB_LOOKUP_IGNORE_LINKSTATE which means a notification can be sent for a route with one of its nexthops being LINKDOWN. In case IGNORE_ROUTES_WITH_LINKDOWN is set for the nexthop netdev, then we shouldn't reflect the nexthop to the device's table. Once the nexthop netdev's carrier goes up we'll be notified using NH_ADD and reflect it to the device. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:43:59 -05:00
Ido Schimmel	9665b74562	mlxsw: spectrum_router: Flush resources when RIF is deleted When the last IP address is removed from a netdev, its RIF is deleted. However, if user didn't first remove neighbours and nexthops using this interface, then they would still be present in the device's tables. Therefore, whenever a RIF is deleted, make sure all the neighbours and nexthops (adjacency entries) using it are removed from the relevant tables as well. The action associated with any route using this RIF would be refreshed, most likely to trap. If the kernel decides to remove the route (f.e., because all the nexthops are now DEAD), then an event would be sent, causing the route to be removed from the device. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:19 -05:00
Ido Schimmel	ad178c8eef	mlxsw: spectrum_router: Reflect nexthop status changes When a packet hits a multipath route in the device's routing table, a hash is computed over its headers, which is then used to select the appropriate nexthop from the device's adjacency table. There are situations in which the kernel removes a nexthop from a multipath route (e.g., no carrier) and the device should do the same. Upon the reception of NH_{ADD,DEL} events, add or remove a nexthop from the device's adjacency table and refresh all the routes using the nexthop group. If all the nexthops of a multipath route are invalid, then any packet hitting the route would be trapped to the CPU for forwarding. If all the nexthops are DEAD, then the kernel would remove the route entirely. On the other hand, if all the nexthops are merely LINKDOWN, then the kernel would keep the route and forward any incoming packet using a different route. While the last case might sound like a problem, it's expected that a routing daemon running in user space would remove such a route from the FIB as it's dumped with the DEAD flag set. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:18 -05:00
Ido Schimmel	70ad35067c	mlxsw: spectrum_router: Use trap action only for some route types The device can have one of three actions associated with a route: 1) Remote - packets continue to the adjacency table 2) Local - packets continue to the neighbour table 3) Trap - packets continue to the CPU The first two actions can also trap packets to the CPU, but they do so using a different trap ID, which has a lower traffic class and less allotted bandwidth. We currently use the third action for both RTN_{LOCAL,BROADCAST} routes and RTN_UNICAST routes not pointing to the switch ports. However, packets that merely need to be forwarded by the switch are likely not control packets and can be therefore scheduled towards the CPU using a lower traffic class. Achieve the above by assigning the third action only to local and broadcast routes and have any other route use either of the first two actions, based on whether the route is gatewayed or not. This will also allow us to refresh routes using the local action and have them trap packets when their RIF is no longer valid following a NH_DEL event. One side effect of this patch is that we no longer give special treatment to multipath routes using both switch and non-switch ports towards their nexthops. If at least one of the nexthops can be resolved, then the device will forward the packets instead of trapping them. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:18 -05:00
Ido Schimmel	4b41147751	mlxsw: spectrum_router: Determine offload status using generic function The previous patch introduced a generic function to determine whether a route should be offloaded or not. Make use of it here. In the future we're going to add more conditions to this test (e.g., whether TOS is non-zero), so it makes sense to centralize it instead of open coding it in a few places. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:18 -05:00
Ido Schimmel	013b20f953	mlxsw: spectrum_router: More accurately set offload flag We currently set the RTNH_F_OFFLOAD flag for all routes using remote action, but this isn't always correct. If none of the nexthops associated with a gatewayed route can be offloaded into the device, then any packet hitting it would be trapped to the CPU and forwarded by the kernel. Solve this by pushing the setting of the offload flag to after the route was programmed into the device, thereby allowing us to take all the parameters into account. This change will also help us further in the patchset, when we refresh routes following the reception of NH_{ADD,DEL} events. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:17 -05:00
Ido Schimmel	a8c9701427	mlxsw: spectrum_router: Refactor nexthop init routine The nexthop init and de-init functions both have symmetric parts concerned with the reflection of the neighbour entry into the device's adjacency table, in case it's used by a gatewayed route. These sections of code also need to be called when a nexthop is marked as valid / invalid following NH_{ADD,DEL} events. Break these out into appropriate functions, so that they could be invoked following the reception of above events. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:17 -05:00
Ido Schimmel	c8b030774f	mlxsw: spectrum_router: Remove FIB info from FIB entry struct After the previous changes, the FIB info is embedded in every nexthop group struct, which in turn is embedded in every FIB entry struct. We can therefore safely remove the FIB info from the entry struct. This has the added advantage of making the router-related structs more generic and suitable for use with IPv6 offloads. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:17 -05:00
Ido Schimmel	b8399a1e5a	mlxsw: spectrum_router: Store routes in a more generic way Up until now, the only FIB entries that were associated with a nexthop group were routes to remote networks where all the nexthop devices had a valid router interface (RIF). This is in contrast to the FIB code, where all the routes are associated with a FIB info. The same design choice needs to be applied to the driver's cache. Based on the NH_{ADD,DEL} events which will be added later in the patchset, we need to be able to change the action (forward / trap) associated with all the routes using the nexthop group. However, if we can't link between the nexthop and the routes using it, then the above is impossible. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-08 15:25:17 -05:00

204 Commits