linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Jarod Wilson	bf3a058de5	mlx5: become aware of when running as a bonding slave I've been unable to get my hands on suitable supported hardware to date, but I believe this ought to be all that is needed to enable the mlx5 driver to also work with bonding active-backup crypto offload passthru. CC: Boris Pismenny <borisp@mellanox.com> CC: Saeed Mahameed <saeedm@mellanox.com> CC: Leon Romanovsky <leon@kernel.org> CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: "David S. Miller" <davem@davemloft.net> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: Jakub Kicinski <kuba@kernel.org> CC: Steffen Klassert <steffen.klassert@secunet.com> CC: Herbert Xu <herbert@gondor.apana.org.au> CC: netdev@vger.kernel.org Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-22 15:38:57 -07:00
Jarod Wilson	0dea9ea97e	ixgbe_ipsec: become aware of when running as a bonding slave Slave devices in a bond doing hardware encryption also need to be aware that they're slaves, so we operate on the slave instead of the bonding master to do the actual hardware encryption offload bits. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: "David S. Miller" <davem@davemloft.net> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: Jakub Kicinski <kuba@kernel.org> CC: Steffen Klassert <steffen.klassert@secunet.com> CC: Herbert Xu <herbert@gondor.apana.org.au> CC: netdev@vger.kernel.org CC: intel-wired-lan@lists.osuosl.org Acked-by: Jeff Kirsher <Jeffrey.t.kirsher@intel.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-22 15:38:57 -07:00
Jarod Wilson	272c2330ad	xfrm: bail early on slave pass over skb This is prep work for initial support of bonding hardware encryption pass-through support. The bonding driver will fill in the slave_dev pointer, and we use that to know not to skb_push() again on a given skb that was already processed on the bond device. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: "David S. Miller" <davem@davemloft.net> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: Jakub Kicinski <kuba@kernel.org> CC: Steffen Klassert <steffen.klassert@secunet.com> CC: Herbert Xu <herbert@gondor.apana.org.au> CC: netdev@vger.kernel.org CC: intel-wired-lan@lists.osuosl.org Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-22 15:38:56 -07:00
David S. Miller	389cc2f326	Merge branch 'devlink-Support-get-set-mac-address-of-a-port-function' Parav Pandit says: ==================== devlink: Support get,set mac address of a port function Currently, ip link set dev <pfndev> vf <vf_num> <param> <value> has below few limitations. 1. Command is limited to set VF parameters only. It cannot set the default MAC address for the PCI PF. 2. It can be set only on system where PCI SR-IOV capability exists. In smartnic based system, eswitch of a NIC resides on a different embedded cpu which has the VF and PF representors for the SR-IOV functions of a host system in which this smartnic is plugged-in. 3. It cannot setup the function attributes of sub-function described in detail in comprehensive RFC [1] and [2]. This series covers the first small part to let user query and set MAC address (hardware address) of a PCI PF/VF which is represented by devlink port pcipf, pcivf port flavours respectively. Whenever a devlink port manages a function connected to a devlink port, it allows to query and set its hardware address. Driver implements necessary get/set callback functions if it supports port function for a given port type. Patch summary: Patch-1 Prepares devlink port fill routines for extack Patch-2 and 3 extended devlink interface to get/set port function attributes, mainly hardware address to start with. Patch-2 Extended port dump command to query port function hardware address Patch-3 Introduces a command to set the hardware address of a port function Patch-4 to 9 refactors and implement devlink callbacks in mlx5_core driver. Patch-4 Constify the mac address pointer in set routines Patch-5 Introduces eswich check helper to use in devlink facing callbacks Patch-6 Moves port index, port number conversion routine to eswitch header file Patch-7 Implements port function query devlink callback Patch-8 Refactors mac address setting routine to uniformly use state_lock Patch-9 Implements port function set devlink callback [1] https://lore.kernel.org/netdev/20200519092258.GF4655@nanopsycho/ [2] https://marc.info/?l=linux-netdev&m=158555928517777&w=2 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-22 15:30:08 -07:00
Parav Pandit	330077d14d	net/mlx5: E-switch, Supporting setting devlink port function mac address Enable user to set mac address of the PCI PF and VF port function. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-22 15:29:19 -07:00
Parav Pandit	1094795ce4	net/mlx5: Split mac address setting function for using state_lock Refactor mac address setting function to let caller hold the necessary state_lock mutex, so that subsequent patch and use this helper routine. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-22 15:29:19 -07:00
Parav Pandit	f099fde16d	net/mlx5: E-switch, Support querying port function mac address Support querying mac address of the eswitch devlink port function. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-22 15:29:19 -07:00
Parav Pandit	443bf36eb5	net/mlx5: Move helper to eswitch layer To use port number to port index conversion at eswitch level, move it to eswitch header. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-22 15:29:19 -07:00
Parav Pandit	bd93975353	net/mlx5: E-switch, Introduce and use eswitch support check helper Introduce an helper routine to get esw from a devlink device and use it at eswitch callbacks and in subsequent patch. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-22 15:29:19 -07:00
Parav Pandit	fa997825eb	net/mlx5: Constify mac address pointer Since none of the functions need to modify the input mac address, constify them. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-22 15:29:19 -07:00
Parav Pandit	a1e8ae907c	net/devlink: Support setting hardware address of port function PCI PF and VF devlink port can manage the function represented by a devlink port. Allow users to set port function's hardware address. Example of a PCI VF port which supports a port function: $ devlink port show pci/0000:06:00.0/2 pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1 function: hw_addr 00:00:00:00:00:00 $ devlink port function set pci/0000:06:00.0/2 hw_addr 00:11:22:33:44:55 $ devlink port show pci/0000:06:00.0/2 pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1 function: hw_addr 00:11:22:33:44:55 Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-22 15:29:19 -07:00
Parav Pandit	2a916ecc40	net/devlink: Support querying hardware address of port function PCI PF and VF devlink port can manage the function represented by a devlink port. Enable users to query port function's hardware address. Example of a PCI VF port which supports a port function: $ devlink port show pci/0000:06:00.0/2 pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1 function: hw_addr 00:11:22:33:44:66 $ devlink port show pci/0000:06:00.0/2 -jp { "port": { "pci/0000:06:00.0/2": { "type": "eth", "netdev": "enp6s0pf0vf1", "flavour": "pcivf", "pfnum": 0, "vfnum": 1, "function": { "hw_addr": "00:11:22:33:44:66" } } } } Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-22 15:29:19 -07:00
Parav Pandit	a829eb0d5d	net/devlink: Prepare devlink port functions to fill extack Prepare devlink port related functions to optionally fill up the extack information which will be used in subsequent patch by port function attribute(s). Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-22 15:29:18 -07:00
David S. Miller	29a720c104	Merge branch 'Marvell-mvpp2-improvements' Russell King says: ==================== Marvell mvpp2 improvements This series primarily cleans up mvpp2, but also fixes a left-over from `91a208f218` ("net: phylink: propagate resolved link config via mac_link_up()"). Patch 1 introduces some port helpers: mvpp2_port_supports_xlg() - does the port support the XLG MAC mvpp2_port_supports_rgmii() - does the port support RGMII modes Patch 2 introduces mvpp2_phylink_to_port(), rather than having repeated open coding of container_of(). Patch 3 introduces mvpp2_modify(), which reads-modifies-writes a register - I've converted the phylink specific code to use this helper. Patch 4 moves the hardware control of the pause modes from mvpp2_xlg_config() (which is called via the phylink_config method) to mvpp2_mac_link_up() - a change that was missed in the above referenced commit. v2: remove "inline" in patch 2. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 21:38:26 -07:00
Russell King	63d78cc976	net: mvpp2: set xlg flow control in mvpp2_mac_link_up() Set the flow control settings in mvpp2_mac_link_up() for 10G links just as we do for 1G and slower links. This is now the preferred location. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 21:38:26 -07:00
Russell King	bd45f644a8	net: mvpp2: add register modification helper Add a helper to read-modify-write a register, and use it in the phylink helpers. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 21:38:26 -07:00
Russell King	6c2b49eb96	net: mvpp2: add mvpp2_phylink_to_port() helper Add a helper to convert the struct phylink_config pointer passed in from phylink to the drivers internal struct mvpp2_port. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 21:38:26 -07:00
Russell King	a9a3320227	net: mvpp2: add port support helpers The mvpp2 code has tests scattered amongst the code to determine whether the port supports the XLG, and whether the port supports RGMII mode. Rather than having these tests scattered, provide a couple of helper functions, so that future additions can ensure that they get these tests correct. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 21:38:26 -07:00
Gaurav Singh	8bf1539515	Remove redundant skb null check Remove the redundant null check for skb. Signed-off-by: Gaurav Singh <gaurav1086@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 21:29:27 -07:00
David S. Miller	c8f8a9f8e5	Merge branch 'tcp-remove-two-indirect-calls-from-xmit-path' Eric Dumazet says: ==================== tcp: remove two indirect calls from xmit path __tcp_transmit_skb() does two indirect calls per packet, lets get rid of them. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:47:53 -07:00
Eric Dumazet	dd2e0b86fc	tcp: remove indirect calls for icsk->icsk_af_ops->send_check Mitigate RETPOLINE costs in __tcp_transmit_skb() by using INDIRECT_CALL_INET() wrapper. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:47:53 -07:00
Eric Dumazet	05e22e8395	tcp: remove indirect calls for icsk->icsk_af_ops->queue_xmit Mitigate RETPOLINE costs in __tcp_transmit_skb() by using INDIRECT_CALL_INET() wrapper. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:47:53 -07:00
Tao Ren	902053f17d	of: mdio: preserve phy dev_flags in of_phy_connect() Replace assignment "=" with OR "\|=" for "phy->dev_flags" so "dev_flags" configured in phy probe() function can be preserved. The idea is similar to commit `e7312efbd5` ("net: phy: modify assignment to OR for dev_flags in phy_attach_direct"). Signed-off-by: Tao Ren <rentao.bupt@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:33:17 -07:00
Amritha Nambiar	78e57f152c	net: Avoid overwriting valid skb->napi_id This will be useful to allow busy poll for tunneled traffic. In case of busy poll for sessions over tunnels, the underlying physical device's queues need to be polled. Tunnels schedule NAPI either via netif_rx() for backlog queue or schedule the gro_cell_poll(). netif_rx() propagates the valid skb->napi_id to the socket. OTOH, gro_cell_poll() stamps the skb->napi_id again by calling skb_mark_napi_id() with the tunnel NAPI which is not a busy poll candidate. This was preventing tunneled traffic to use busy poll. A valid NAPI ID in the skb indicates it was already marked for busy poll by a NAPI driver and hence needs to be copied into the socket. Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:30:59 -07:00
Gaurav Singh	8eaf8d9940	Remove redundant condition in qdisc_graft parent cannot be NULL here since its in the else part of the if (parent == NULL) condition. Remove the extra check on parent pointer. Signed-off-by: Gaurav Singh <gaurav1086@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:29:52 -07:00
David S. Miller	cd39983857	Merge branch 'Ocelot-Felix-driver-cleanup' Vladimir Oltean says: ==================== Ocelot/Felix driver cleanup Some of the code in the mscc felix and ocelot drivers was added while in a bit of a hurry. Let's take a moment and put things in relative order. First 3 patches are sparse warning fixes. Patches 4-9 perform some further splitting between mscc_felix, mscc_ocelot, and the common hardware library they share. Meaning that some code is being moved from the library into just the mscc_ocelot module. Patches 10-12 refactor the naming conventions in the existing VCAP code (for tc-flower offload), since we're going to be adding some more code for VCAP IS1 (previous tentatives already submitted here: https://patchwork.ozlabs.org/project/netdev/cover/20200602051828.5734-1-xiaoliang.yang_1@nxp.com/), and that code would be confusing to read and maintain using current naming conventions. No functional modification is intended. I checked that the VCAP IS2 code still works by applying a tc ingress filter with an EtherType key and 'drop' action. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:25:23 -07:00
Vladimir Oltean	c73b0ad36e	net: mscc: ocelot: unexpose ocelot_vcap_policer_{add,del} Remove the function prototypes from ocelot_police.h and make these functions static. We need to move them above their callers. Note that moving the implementations to ocelot_police.c is not trivially possible due to dependency on is2_entry_set() which is static to ocelot_vcap.c. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:25:23 -07:00
Vladimir Oltean	aae4e500e1	net: mscc: ocelot: generalize the "ACE/ACL" names Access Control Lists (and their respective Access Control Entries) are specifically entries in the VCAP IS2, the security enforcement block, according to the documentation. Let's rename the structures and functions to something more generic, so that VCAP IS1 structures (which would otherwise have to be called Ingress Classification Entries) can reuse the same code without confusion. Some renaming that was done: struct ocelot_ace_rule -> struct ocelot_vcap_filter struct ocelot_acl_block -> struct ocelot_vcap_block enum ocelot_ace_type -> enum ocelot_vcap_key_type struct ocelot_ace_vlan -> struct ocelot_vcap_key_vlan enum ocelot_ace_action -> enum ocelot_vcap_action struct ocelot_ace_stats -> struct ocelot_vcap_stats enum ocelot_ace_type -> enum ocelot_vcap_key_type struct ocelot_ace_frame_* -> struct ocelot_vcap_key_* No functional change is intended. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:25:23 -07:00
Vladimir Oltean	3c83654f24	net: mscc: ocelot: rename ocelot_ace.{c, h} to ocelot_vcap.{c,h} Access Control Lists (and their respective Access Control Entries) are specifically entries in the VCAP IS2, the security enforcement block, according to the documentation. Let's rename the files that deal with generic operations on the VCAP TCAM, so that VCAP IS1 and ES0 can reuse the same code without confusion. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:25:23 -07:00
Vladimir Oltean	9c90eea310	net: mscc: ocelot: move net_device related functions to ocelot_net.c The ocelot hardware library shouldn't contain too much net_device specific code, since it is shared with DSA which abstracts that structure away. So much as much of this code as possible into the mscc_ocelot driver and outside of the common library. We're making an exception for MDB and LAG code. That is not yet exported to DSA, but when it will, most of the code that's already in ocelot.c will remain there. So, there's no point in moving code to ocelot_net.c just to move it back later. We could have moved all net_device code to ocelot_vsc7514.c directly, but let's operate under the assumption that if a new switchdev ocelot driver gets added, it'll define its SoC-specific stuff in a new ocelot_vsc*.c file and it'll reuse the rest of the code. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:25:23 -07:00
Vladimir Oltean	d9feb90499	net: mscc: ocelot: move ocelot_regs.c into ocelot_vsc7514.c ocelot_regs.c actually shouldn't be part of the common library. It describes the register map of the VSC7514 switch. The way ocelot switches work, they'll have highly optimized register maps, so another SoC will likely have the same registers but laid out completely different in memory (so there's little room for reusing this structure). So move it to ocelot_vsc7514.c instead. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:25:23 -07:00
Vladimir Oltean	14addfb635	net: mscc: ocelot: rename MSCC_OCELOT_SWITCH_OCELOT to MSCC_OCELOT_SWITCH Putting 'ocelot' in the config's name twice just to say that 'it's the ocelot driver running on the ocelot SoC' is a bit confusing. Instead, it's just the ocelot driver. Now that we've renamed the previous symbol that was holding the MSCC_OCELOT_SWITCH_OCELOT into *_LIB (because that's what it is), we're free to use this name for the driver. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:25:23 -07:00
Vladimir Oltean	f4d0323bae	net: mscc: ocelot: convert MSCC_OCELOT_SWITCH into a library Hide the CONFIG_MSCC_OCELOT_SWITCH option from users. It is meant to be only a hardware library which is selected by the drivers that use it (ocelot, felix). Since it is "selected" from Kconfig, all its dependencies are manually transferred to the driver that selects it. This is because "select" in Kconfig language is a bit of a mess, and doesn't handle dependencies of selected options quite right. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:25:23 -07:00
Vladimir Oltean	56583862b8	net: mscc: ocelot: rename module to mscc_ocelot mscc_ocelot is a slightly better name for a module than ocelot_board or ocelot_vsc7514 is, so let's use that. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:25:23 -07:00
Vladimir Oltean	589aa6e7c9	net: mscc: ocelot: rename ocelot_board.c to ocelot_vsc7514.c To follow the model of felix and seville where we have one platform-specific file, rename this file to the actual SoC it serves. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:25:23 -07:00
Vladimir Oltean	ff4b0bc623	net: mscc: ocelot: access EtherType using __be16 Get rid of sparse "cast to restricted __be16" warnings. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:25:23 -07:00
Vladimir Oltean	7eb5c96a7c	net: mscc: ocelot: use plain int when interacting with TCAM tables sparse is rightfully complaining about the fact that: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits] 26 \| __builtin_constant_p((l) > (h)), (l) > (h), 0))) \| ^ note: in expansion of macro ‘GENMASK_INPUT_CHECK’ 39 \| (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l)) \| ^~~~~~~~~~~~~~~~~~~ note: in expansion of macro ‘GENMASK’ 127 \| mask = GENMASK(width, 0); \| ^~~~~~~ So replace the variables that go into GENMASK with plain, signed integer types. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:25:23 -07:00
Vladimir Oltean	3ab4ceb6e9	net: dsa: felix: make vcap is2 keys and actions static Get rid of some sparse warnings. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:25:22 -07:00
David S. Miller	60cb8d3d71	Merge branch 'Strict-mode-for-VRF' Andrea Mayer says: ==================== Strict mode for VRF This patch set adds the new "strict mode" functionality to the Virtual Routing and Forwarding infrastructure (VRF). Hereafter we discuss the requirements and the main features of the "strict mode" for VRF. On VRF creation, it is necessary to specify the associated routing table used during the lookup operations. Currently, there is no mechanism that avoids creating multiple VRFs sharing the same routing table. In other words, it is not possible to force a one-to-one relationship between a specific VRF and the table associated with it. The "strict mode" imposes that each VRF can be associated to a routing table only if such routing table is not already in use by any other VRF. In particular, the strict mode ensures that: 1) given a specific routing table, the VRF (if exists) is uniquely identified; 2) given a specific VRF, the related table is not shared with any other VRF. Constraints (1) and (2) force a one-to-one relationship between each VRF and the corresponding routing table. The strict mode feature is designed to be network-namespace aware and it can be directly enabled/disabled acting on the "strict_mode" parameter. Read and write operations are carried out through the classic sysctl command on net.vrf.strict_mode path, i.e: sysctl -w net.vrf.strict_mode=1. Only two distinct values {0,1} are accepted by the strict_mode parameter: - with strict_mode=0, multiple VRFs can be associated with the same table. This is the (legacy) default kernel behavior, the same that we experience when the strict mode patch set is not applied; - with strict_mode=1, the one-to-one relationship between the VRFs and the associated tables is guaranteed. In this configuration, the creation of a VRF which refers to a routing table already associated with another VRF fails and the error is returned to the user. The kernel keeps track of the associations between a VRF and the routing table during the VRF setup, in the "management" plane. Therefore, the strict mode does not impact the performance or the intrinsic functionality of the data plane in any way. When the strict mode is active it is always possible to disable the strict mode, while the reverse operation is not always allowed. Setting the strict_mode parameter to 0 is equivalent to removing the one-to-one constraint between any single VRF and its associated routing table. Conversely, if the strict mode is disabled and there are multiple VRFs that refer to the same routing table, then it is prohibited to set the strict_mode parameter to 1. In this configuration, any attempt to perform the operation will lead to an error and it will be reported to the user. To enable strict mode once again (by setting the strict_mode parameter to 1), you must first remove all the VRFs that share common tables. There are several use cases which can take advantage from the introduction of the strict mode feature. In particular, the strict mode allows us to: i) guarantee the proper functioning of some applications which deal with routing protocols; ii) perform some tunneling decap operations which require to use specific routing tables for segregating and forwarding the traffic. Considering (i), the creation of different VRFs that point to the same table leads to the situation where two different routing entities believe they have exclusive access to the same table. This leads to the situation where different routing daemons can conflict for gaining routes control due to overlapping tables. By enabling strict mode it is possible to prevent this situation which often occurs due to incorrect configurations done by the users. The ability to enable/disable the strict mode functionality does not depend on the tool used for configuring the networking. In essence, the strict mode patch solves, at the kernel level, what some other patches [1] had tried to solve at the userspace level (using only iproute2) with all the related problems. Considering (ii), the introduction of the strict mode functionality allows us implementing the SRv6 End.DT4 behavior. Such behavior terminates a SR tunnel and it forwards the IPv4 traffic according to the routes present in the routing table supplied during the configuration. The SRv6 End.DT4 can be realized exploiting the routing capabilities made available by the VRF infrastructure. This behavior could leverage a specific VRF for forcing the traffic to be forwarded in accordance with the routes available in the VRF table. Anyway, in order to make the End.DT4 properly work, it must be guaranteed that the table used for the route lookup operations is bound to one and only one VRF. In this way, it is possible to use the table for uniquely retrieving the associated VRF and for routing packets. I would like to thank David Ahern for his constant and valuable support during the design and development phases of this patch set. Comments, suggestions and improvements are very welcome! ==================== Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:22:23 -07:00
Andrea Mayer	8735e6eaa4	selftests: add selftest for the VRF strict mode The new strict mode functionality is tested in different configurations and on different network namespaces. Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:22:23 -07:00
Andrea Mayer	a59a8ffd4a	vrf: add l3mdev registration for table to VRF device lookup During the initialization phase of the VRF module, the callback for table to VRF device lookup is registered in l3mdev. Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:22:23 -07:00
Andrea Mayer	33306f1aaf	vrf: add sysctl parameter for strict mode Add net.vrf.strict_mode sysctl parameter. When net.vrf.strict_mode=0 (default) it is possible to associate multiple VRF devices to the same table. Conversely, when net.vrf.strict_mode=1 a table can be associated to a single VRF device. When switching from net.vrf.strict_mode=0 to net.vrf.strict_mode=1, a check is performed to verify that all tables have at most one VRF associated, otherwise the switch is not allowed. The net.vrf.strict_mode parameter is per network namespace. Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:22:22 -07:00
Andrea Mayer	c8baec3857	vrf: track associations between VRF devices and tables Add the data structures and the processing logic to keep track of the associations between VRF devices and routing tables. When a VRF is instantiated, it needs to refer to a given routing table. For each table, we explicitly keep track of the existing VRFs that refer to the table. Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:22:22 -07:00
Andrea Mayer	49042c220b	l3mdev: add infrastructure for table to VRF mapping Add infrastructure to l3mdev (the core code for Layer 3 master devices) in order to find out the corresponding VRF device for a given table id. Therefore, the l3mdev implementations: - can register a callback that returns the device index of the l3mdev associated with a given table id; - can offer the lookup function (table to VRF device). Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:22:22 -07:00
Gustavo A. R. Silva	c5eb179edd	net/sched: cls_u32: Use struct_size() in kzalloc() Make use of the struct_size() helper instead of an open-coded version in order to avoid any potential type mistakes. This code was detected with the help of Coccinelle and, audited and fixed manually. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-19 20:19:24 -07:00
Gustavo A. R. Silva	11a33de2df	taprio: Use struct_size() in kzalloc() Make use of the struct_size() helper instead of an open-coded version in order to avoid any potential type mistakes. Also, remove unnecessary variable _size_. This code was detected with the help of Coccinelle and, audited and fixed manually. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-19 20:18:43 -07:00
David S. Miller	1075a4744a	Merge branch 'Clause-45-PHY-probing-improvements' Russell King says: ==================== Clause 45 PHY probing improvements Last time this series was posted back in May, Florian reviewed the patches, which was the only feedback I received. I'm now posting them without the RFC tag. This series aims to improve the probing for Clause 45 PHYs. The first four patches clean up get_phy_device() and called functions, updating the kernel doc, adding information about the various error return values. We then provide better kerneldoc for get_phy_device(), describing what is going on, and more importantly what the various return codes mean. Patch 6 adds support for probing MMDs >= 8 to check for their presence. Patch 7 changes get_phy_c45_ids() to only set the returned devices_in_package if we successfully find a PHY. Patch 8 splits the use of "devices in package" from the "mmds present". Patch 9 expands our ID reading to cover the other MMDs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-19 20:17:15 -07:00
Russell King	389a338999	net: phy: read MMD ID from all present MMDs Expand the device_ids[] array to allow all MMD IDs to be read rather than just the first 8 MMDs, but only read the ID if the MDIO_STAT2 register reports that a device really is present here for these new devices to maintain compatibility with our current behaviour. Note that only a limited number of devices have MDIO_STAT2. 88X3310 PHY vendor MMDs do are marked as present in the devices_in_package, but do not contain IEE 802.3 compatible register sets in their lower space. This avoids reading incorrect values as MMD identifiers. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-19 20:17:15 -07:00
Russell King	320ed3bf90	net: phy: split devices_in_package We have two competing requirements for the devices_in_package field. We want to use it as a bit array indicating which MMDs are present, but we also want to know if the Clause 22 registers are present. Since "devices in package" is a term used in the 802.3 specification, keep this as the as-specified values read from the PHY, and introduce a new member "mmds_present" to indicate which MMDs are actually present in the PHY, derived from the "devices in package" value. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-19 20:17:15 -07:00
Russell King	5ba33cf483	net: phy: set devices_in_package only after validation Only set the devices_in_package to a non-zero value if we find a valid value for this field, so we avoid leaving it set to e.g. 0x1fffffff. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-19 20:17:15 -07:00

1 2 3 4 5 ...

932973 Commits All Branches Search

932973 Commits

All Branches