OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Russell King (Oracle)	dd805cf3e8	net: dsa: add support for mac_prepare() and mac_finish() calls Add DSA support for the phylink mac_prepare() and mac_finish() calls. These were introduced as part of the PCS support to allow MACs to perform preparatory steps prior to configuration, and finalisation steps after the MAC and PCS has been configured. Introducing phylink_pcs support to the mv88e6xxx DSA driver needs some code moved out of its mac_config() stage into the mac_prepare() and mac_finish() stages, and this commit facilitates such code in DSA drivers. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-26 10:39:40 +01:00
Vladimir Oltean	ff6ac4d013	net: dsa: make dsa_port_supports_hwtstamp() construct a fake ifreq dsa_master_ioctl() is in the process of getting converted to a different API, where we won't have access to a struct ifreq * anymore, but rather, to a struct kernel_hwtstamp_config. Since ds->ops->port_hwtstamp_get() still uses struct ifreq *, this creates a difficult situation where we have to make up such a dummy pointer. The conversion is a bit messy, because it forces a "good" implementation of ds->ops->port_hwtstamp_get() to return -EFAULT in copy_to_user() because of the NULL ifr->ifr_data pointer. However, it works, and it is only a transient step until ds->ops->port_hwtstamp_get() gets converted to the new API which passes struct kernel_hwtstamp_config and does not call copy_to_user(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-03 10:04:27 +01:00
Vladimir Oltean	eb1ab7650d	net: dsa: fix db type confusion in host fdb/mdb add/del We have the following code paths: Host FDB (unicast RX filtering): dsa_port_standalone_host_fdb_add() dsa_port_bridge_host_fdb_add() \| \| +--------------+ +------------+ \| \| v v dsa_port_host_fdb_add() dsa_port_standalone_host_fdb_del() dsa_port_bridge_host_fdb_del() \| \| +--------------+ +------------+ \| \| v v dsa_port_host_fdb_del() Host MDB (multicast RX filtering): dsa_port_standalone_host_mdb_add() dsa_port_bridge_host_mdb_add() \| \| +--------------+ +------------+ \| \| v v dsa_port_host_mdb_add() dsa_port_standalone_host_mdb_del() dsa_port_bridge_host_mdb_del() \| \| +--------------+ +------------+ \| \| v v dsa_port_host_mdb_del() The logic added by commit `5e8a1e03aa` ("net: dsa: install secondary unicast and multicast addresses as host FDB/MDB") zeroes out db.bridge.num if the switch doesn't support ds->fdb_isolation (the majority doesn't). This is done for a reason explained in commit `c26933639b` ("net: dsa: request drivers to perform FDB isolation"). Taking a single code path as example - dsa_port_host_fdb_add() - the others are similar - the problem is that this function handles: - DSA_DB_PORT databases, when called from dsa_port_standalone_host_fdb_add() - DSA_DB_BRIDGE databases, when called from dsa_port_bridge_host_fdb_add() So, if dsa_port_host_fdb_add() were to make any change on the "bridge.num" attribute of the database, this would only be correct for a DSA_DB_BRIDGE, and a type confusion for a DSA_DB_PORT bridge. However, this bug is without consequences, for 2 reasons: - dsa_port_standalone_host_fdb_add() is only called from code which is (in)directly guarded by dsa_switch_supports_uc_filtering(ds), and that function only returns true if ds->fdb_isolation is set. So, the code only executed for DSA_DB_BRIDGE databases. - Even if the code was not dead for DSA_DB_PORT, we have the following memory layout: struct dsa_bridge { struct net_device dev; unsigned int num; bool tx_fwd_offload; refcount_t refcount; }; struct dsa_db { enum dsa_db_type type; union { const struct dsa_port dp; // DSA_DB_PORT struct dsa_lag lag; struct dsa_bridge bridge; // DSA_DB_BRIDGE }; }; So, the zeroization of dsa_db :: bridge :: num on a dsa_db structure of type DSA_DB_PORT would access memory which is unused, because we only use dsa_db :: dp for DSA_DB_PORT, and this is mapped at the same address with dsa_db :: dev for DSA_DB_BRIDGE, thanks to the union definition. It is correct to fix up dsa_db :: bridge :: num only from code paths that come from the bridge / switchdev, so move these there. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20230329133819.697642-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-30 23:19:55 -07:00
Vladimir Oltean	19d05ea712	net: dsa: move tag_8021q headers to their proper place tag_8021q definitions are all over the place. Some are exported to linux/dsa/8021q.h (visible by DSA core, taggers, switch drivers and everyone else), and some are in dsa_priv.h. Move the structures that don't need external visibility into tag_8021q.c, and the ones which don't need the world or switch drivers to see them into tag_8021q.h. We also have the tag_8021q.h inclusion from switch.c, which is basically the entire reason why tag_8021q.c was built into DSA in commit `8b6e638b4b` ("net: dsa: build tag_8021q.c as part of DSA core"). I still don't know how to better deal with that, so leave it alone. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-22 20:41:53 -08:00
Vladimir Oltean	47d2ce03dc	net: dsa: rename dsa2.c back into dsa.c and create its header The previous change moved the code into the larger file (dsa2.c) to minimize the delta. Rename that now to dsa.c, and create dsa.h, where all related definitions from dsa_priv.h go. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-22 20:41:53 -08:00
Vladimir Oltean	0c603136e1	net: dsa: move headers exported by switch.c to switch.h Reduce code bloat in dsa_priv.h by moving the prototypes exported by switch.h into their own header file. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-22 20:41:51 -08:00
Vladimir Oltean	09f9234168	net: dsa: move headers exported by slave.c to slave.h Minimize the use of the bloated dsa_priv.h by moving the prototypes exported by slave.c to their own header file. This is just approximate to get the code structure right. There are some interdependencies with static inline code left in dsa_priv.h, so leave slave.h included from there for now. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-22 20:41:49 -08:00
Vladimir Oltean	022bba63c3	net: dsa: move headers exported by port.c to port.h Minimize the use of the bloated dsa_priv.h by moving the prototypes exported by port.c to their own header file. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-22 20:41:48 -08:00
Jakub Kicinski	224b744abf	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net include/linux/bpf.h `1f6e04a1c7` ("bpf: Fix offset calculation error in __copy_map_value and zero_map_value") `aa3496accc` ("bpf: Refactor kptr_off_tab into btf_record") `f71b2f6417` ("bpf: Refactor map->off_arr handling") https://lore.kernel.org/all/20221114095000.67a73239@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-17 18:30:39 -08:00
Vladimir Oltean	53d04b9811	net: dsa: remove phylink_validate() method As of now, no DSA driver uses a custom link mode validation procedure anymore. So remove this DSA operation and let phylink determine what is supported based on config->mac_capabilities (if provided by the driver). Leave a comment why we left the code that we did, and that there is more work to do. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-15 20:34:27 -08:00
Vladimir Oltean	ed1fe1bebe	net: dsa: make dsa_master_ioctl() see through port_hwtstamp_get() shims There are multi-generational drivers like mv88e6xxx which have code like this: int mv88e6xxx_port_hwtstamp_get(struct dsa_switch ds, int port, struct ifreq ifr) { if (!chip->info->ptp_support) return -EOPNOTSUPP; ... } DSA wants to deny PTP timestamping on the master if the switch supports timestamping too. However it currently relies on the presence of the port_hwtstamp_get() callback to determine PTP capability, and this clearly does not work in that case (method is present but returns -EOPNOTSUPP). We should not deny PTP on the DSA master for those switches which truly do not support hardware timestamping. Create a dsa_port_supports_hwtstamp() method which actually probes for support by calling port_hwtstamp_get() and seeing whether that returned -EOPNOTSUPP or not. Fixes: `f685e609a3` ("net: dsa: Deny PTP on master if switch supports it") Link: https://patchwork.kernel.org/project/netdevbpf/patch/20221110124345.3901389-1-festevam@gmail.com/ Reported-by: Fabio Estevam <festevam@gmail.com> Reported-by: Steffen Bätz <steffen@innosonix.de> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Fabio Estevam <festevam@denx.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-14 11:30:49 +00:00
Yang Yingliang	557f050166	net: dsa: fix wrong pointer passed to PTR_ERR() in dsa_port_phylink_create() Fix wrong pointer passed to PTR_ERR() in dsa_port_phylink_create() to print error message. Fixes: `cf5ca4ddc3` ("net: dsa: don't leave dangling pointers in dp->pl when failing") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-10-09 20:01:32 +01:00
Vladimir Oltean	cf5ca4ddc3	net: dsa: don't leave dangling pointers in dp->pl when failing There is a desire to simplify the dsa_port registration path with devlink, and this involves reworking a bit how user ports which fail to connect to their PHY (because it's missing) get reinitialized as UNUSED devlink ports. The desire is for the change to look something like this; basically dsa_port_setup() has failed, we just change dp->type and call dsa_port_setup() again. -/* Destroy the current devlink port, and create a new one which has the UNUSED - * flavour. - / -static int dsa_port_reinit_as_unused(struct dsa_port dp) +static int dsa_port_setup_as_unused(struct dsa_port *dp) { - dsa_port_devlink_teardown(dp); dp->type = DSA_PORT_TYPE_UNUSED; - return dsa_port_devlink_setup(dp); + return dsa_port_setup(dp); } For an UNUSED port, dsa_port_setup() mostly only calls dsa_port_devlink_setup() anyway, so we could get away with calling just that. But if we call the full blown dsa_port_setup(dp) (which will be needed to properly set dp->setup = true), the callee will have the tendency to go through this code block too, and call dsa_port_disable(dp): switch (dp->type) { case DSA_PORT_TYPE_UNUSED: dsa_port_disable(dp); break; That is not very good, because dsa_port_disable() has this hidden inside of it: if (dp->pl) phylink_stop(dp->pl); Fact is, we are not prepared to handle a call to dsa_port_disable() with a struct dsa_port that came from a previous (and failed) call to dsa_port_setup(). We do not clean up dp->pl, and this will make the second call to dsa_port_setup() call phylink_stop() on a dangling dp->pl pointer. Solve this by creating an API for phylink destruction which is symmetric to the phylink creation, and never leave dp->pl set to anything except NULL or a valid phylink structure. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-09-30 18:17:16 -07:00
Vladimir Oltean	acc43b7bf5	net: dsa: allow masters to join a LAG There are 2 ways in which a DSA user port may become handled by 2 CPU ports in a LAG: (1) its current DSA master joins a LAG ip link del bond0 && ip link add bond0 type bond mode 802.3ad ip link set eno2 master bond0 When this happens, all user ports with "eno2" as DSA master get automatically migrated to "bond0" as DSA master. (2) it is explicitly configured as such by the user # Before, the DSA master was eno3 ip link set swp0 type dsa master bond0 The design of this configuration is that the LAG device dynamically becomes a DSA master through dsa_master_setup() when the first physical DSA master becomes a LAG slave, and stops being so through dsa_master_teardown() when the last physical DSA master leaves. A LAG interface is considered as a valid DSA master only if it contains existing DSA masters, and no other lower interfaces. Therefore, we mainly rely on method (1) to enter this configuration. Each physical DSA master (LAG slave) retains its dev->dsa_ptr for when it becomes a standalone DSA master again. But the LAG master also has a dev->dsa_ptr, and this is actually duplicated from one of the physical LAG slaves, and therefore needs to be balanced when LAG slaves come and go. To the switch driver, putting DSA masters in a LAG is seen as putting their associated CPU ports in a LAG. We need to prepare cross-chip host FDB notifiers for CPU ports in a LAG, by calling the driver's ->lag_fdb_add method rather than ->port_fdb_add. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-09-20 10:32:36 +02:00
Vladimir Oltean	2e359b00a1	net: dsa: propagate extack to port_lag_join Drivers could refuse to offload a LAG configuration for a variety of reasons, mainly having to do with its TX type. Additionally, since DSA masters may now also be LAG interfaces, and this will translate into a call to port_lag_join on the CPU ports, there may be extra restrictions there. Propagate the netlink extack to this DSA method in order for drivers to give a meaningful error message back to the user. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-09-20 10:32:36 +02:00
Vladimir Oltean	95f510d0b7	net: dsa: allow the DSA master to be seen and changed through rtnetlink Some DSA switches have multiple CPU ports, which can be used to improve CPU termination throughput, but DSA, through dsa_tree_setup_cpu_ports(), sets up only the first one, leading to suboptimal use of hardware. The desire is to not change the default configuration but to permit the user to create a dynamic mapping between individual user ports and the CPU port that they are served by, configurable through rtnetlink. It is also intended to permit load balancing between CPU ports, and in that case, the foreseen model is for the DSA master to be a bonding interface whose lowers are the physical DSA masters. To that end, we create a struct rtnl_link_ops for DSA user ports with the "dsa" kind. We expose the IFLA_DSA_MASTER link attribute that contains the ifindex of the newly desired DSA master. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-09-20 10:32:35 +02:00
Vladimir Oltean	8f6a19c031	net: dsa: introduce dsa_port_get_master() There is a desire to support for DSA masters in a LAG. That configuration is intended to work by simply enslaving the master to a bonding/team device. But the physical DSA master (the LAG slave) still has a dev->dsa_ptr, and that cpu_dp still corresponds to the physical CPU port. However, we would like to be able to retrieve the LAG that's the upper of the physical DSA master. In preparation for that, introduce a helper called dsa_port_get_master() that replaces all occurrences of the dp->cpu_dp->master pattern. The distinction between LAG and non-LAG will be made later within the helper itself. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-09-20 10:32:35 +02:00
Vladimir Oltean	e09e987315	net: dsa: make phylink-related OF properties mandatory on DSA and CPU ports Early DSA drivers were kind of simplistic in that they assumed a fairly narrow hardware layout. User ports would have integrated PHYs at an internal MDIO address that is derivable from the port number, and shared (DSA and CPU) ports would have an MII-style (serial or parallel) connection to another MAC. Phylib and then phylink were used to drive the internal PHYs, and this needed little to no description through the platform data structures. Bringing up the shared ports at the maximum supported link speed was the responsibility of the drivers. As a result of this, when these early drivers were converted from platform data to the new DSA OF bindings, there was no link information translated into the first DT bindings. https://lore.kernel.org/all/YtXFtTsf++AeDm1l@lunn.ch/ Later, phylink was adopted for shared ports as well, and today we have a workaround in place, introduced by commit `a20f997010` ("net: dsa: Don't instantiate phylink for CPU/DSA ports unless needed"). There, DSA checks for the presence of phy-handle/fixed-link/managed OF properties, and if missing, phylink registration would be skipped. This is because phylink is optional for some drivers (the shared ports already work without it), but the process of starting to register a port with phylink is irreversible: if phylink_create() fails to find the fwnode properties it needs, it bails out and it leaves the ports inoperational (because phylink expects ports to be initially down, so DSA necessarily takes them down, and doesn't know how to put them back up again). DSA being a common framework, new drivers opt into this workaround willy-nilly, but the ideal behavior from the DSA core's side would have been to not interfere with phylink's process of failing at all. This isn't possible because of regression concerns with pre-phylink DT blobs, but at least DSA should put a stop to the proliferation of more of such cases that rely on the workaround to skip phylink registration, and sanitize the environment that new drivers work in. To that end, create a list of compatible strings for which the workaround is preserved, and don't apply the workaround for any drivers outside that list (this includes new drivers). In some cases, we make the assumption that even existing drivers don't rely on DSA's workaround, and we do this by looking at the device trees in which they appear. We can't fully know what is the situation with downstream DT blobs, but we can guess the overall trend by studying the DT blobs that were submitted upstream. If there are upstream blobs that have lacking descriptions, we take it as very likely that there are many more downstream blobs that do so too. If all upstream blobs have complete descriptions, we take that as a hint that the driver is a candidate for enforcing strict DT bindings (considering that most bindings are copy-pasted). If there are no upstream DT blobs, we take the conservative route of allowing the workaround, unless the driver maintainer instructs us otherwise. The driver situation is as follows: ar9331 ~~~~~~ compatible strings: - qca,ar9331-switch 1 occurrence in mainline device trees, part of SoC dtsi (arch/mips/boot/dts/qca/ar9331.dtsi), description is not problematic. Verdict: opt into strict DT bindings and out of workarounds. b53 ~~~ compatible strings: - brcm,bcm5325 - brcm,bcm53115 - brcm,bcm53125 - brcm,bcm53128 - brcm,bcm5365 - brcm,bcm5389 - brcm,bcm5395 - brcm,bcm5397 - brcm,bcm5398 - brcm,bcm53010-srab - brcm,bcm53011-srab - brcm,bcm53012-srab - brcm,bcm53018-srab - brcm,bcm53019-srab - brcm,bcm5301x-srab - brcm,bcm11360-srab - brcm,bcm58522-srab - brcm,bcm58525-srab - brcm,bcm58535-srab - brcm,bcm58622-srab - brcm,bcm58623-srab - brcm,bcm58625-srab - brcm,bcm88312-srab - brcm,cygnus-srab - brcm,nsp-srab - brcm,omega-srab - brcm,bcm3384-switch - brcm,bcm6328-switch - brcm,bcm6368-switch - brcm,bcm63xx-switch I've found at least these mainline DT blobs with problems: arch/arm/boot/dts/bcm47094-linksys-panamera.dts - lacks phy-mode arch/arm/boot/dts/bcm47189-tenda-ac9.dts - lacks phy-mode and fixed-link arch/arm/boot/dts/bcm47081-luxul-xap-1410.dts arch/arm/boot/dts/bcm47081-luxul-xwr-1200.dts arch/arm/boot/dts/bcm47081-buffalo-wzr-600dhp2.dts - lacks phy-mode and fixed-link arch/arm/boot/dts/bcm47094-luxul-xbr-4500.dts arch/arm/boot/dts/bcm4708-smartrg-sr400ac.dts arch/arm/boot/dts/bcm4708-luxul-xap-1510.dts arch/arm/boot/dts/bcm953012er.dts arch/arm/boot/dts/bcm4708-netgear-r6250.dts arch/arm/boot/dts/bcm4708-buffalo-wzr-1166dhp-common.dtsi arch/arm/boot/dts/bcm4708-luxul-xwc-1000.dts arch/arm/boot/dts/bcm47094-luxul-abr-4500.dts - lacks phy-mode and fixed-link arch/arm/boot/dts/bcm53016-meraki-mr32.dts - lacks phy-mode Verdict: opt into DSA workarounds. bcm_sf2 ~~~~~~~ compatible strings: - brcm,bcm4908-switch - brcm,bcm7445-switch-v4.0 - brcm,bcm7278-switch-v4.0 - brcm,bcm7278-switch-v4.8 A single occurrence in mainline (arch/arm64/boot/dts/broadcom/bcm4908/bcm4908.dtsi), part of a SoC dtsi, valid description. Florian Fainelli explains that most of the bcm_sf2 device trees lack a full description for the internal IMP ports. Verdict: opt the BCM4908 into strict DT bindings, and opt the rest into the workarounds. Note that even though BCM4908 has strict DT bindings, it still does not register with phylink on the IMP port due to it implementing ->adjust_link(). hellcreek ~~~~~~~~~ compatible strings: - hirschmann,hellcreek-de1soc-r1 No occurrence in mainline device trees. Kurt Kanzenbach explains that the downstream device trees lacked phy-mode and fixed link, and needed work, but were fixed in the meantime. Verdict: opt into strict DT bindings and out of workarounds. lan9303 ~~~~~~~ compatible strings: - smsc,lan9303-mdio - smsc,lan9303-i2c 1 occurrence in mainline device trees: arch/arm/boot/dts/imx53-kp-hsc.dts - no phy-mode, no fixed-link Verdict: opt out of strict DT bindings and into workarounds. lantiq_gswip ~~~~~~~~~~~~ compatible strings: - lantiq,xrx200-gswip - lantiq,xrx300-gswip - lantiq,xrx330-gswip No occurrences in mainline device trees. Martin Blumenstingl confirms that the downstream OpenWrt device trees lack a proper fixed-link and need work, and that the incomplete description can even be seen in the example from Documentation/devicetree/bindings/net/dsa/lantiq-gswip.txt. Verdict: opt out of strict DT bindings and into workarounds. microchip ksz ~~~~~~~~~~~~~ compatible strings: - microchip,ksz8765 - microchip,ksz8794 - microchip,ksz8795 - microchip,ksz8863 - microchip,ksz8873 - microchip,ksz9477 - microchip,ksz9897 - microchip,ksz9893 - microchip,ksz9563 - microchip,ksz8563 - microchip,ksz9567 - microchip,lan9370 - microchip,lan9371 - microchip,lan9372 - microchip,lan9373 - microchip,lan9374 5 occurrences in mainline device trees, all descriptions are valid. But we had a snafu for the ksz8795 and ksz9477 drivers where the phy-mode property would be expected to be located directly under the 'switch' node rather than under a port OF node. It was fixed by commit `edecfa98f6` ("net: dsa: microchip: look for phy-mode in port nodes"). The driver still has compatibility with the old DT blobs. The lan937x support was added later than the above snafu was fixed, and even though it has support for the broken DT blobs by virtue of sharing a common probing function, I'll take it that its DT blobs are correct. Verdict: opt lan937x into strict DT bindings, and the others out. mt7530 ~~~~~~ compatible strings - mediatek,mt7621 - mediatek,mt7530 - mediatek,mt7531 Multiple occurrences in mainline device trees, one is part of an SoC dtsi (arch/mips/boot/dts/ralink/mt7621.dtsi), all descriptions are fine. Verdict: opt into strict DT bindings and out of workarounds. mv88e6060 ~~~~~~~~~ compatible string: - marvell,mv88e6060 no occurrences in mainline, nobody knows anybody who uses it. Verdict: opt out of strict DT bindings and into workarounds. mv88e6xxx ~~~~~~~~~ compatible strings: - marvell,mv88e6085 - marvell,mv88e6190 - marvell,mv88e6250 Device trees that have incomplete descriptions of CPU or DSA ports: arch/arm64/boot/dts/freescale/imx8mq-zii-ultra.dtsi - lacks phy-mode arch/arm64/boot/dts/marvell/cn9130-crb.dtsi - lacks phy-mode and fixed-link arch/arm/boot/dts/vf610-zii-ssmb-spu3.dts - lacks phy-mode arch/arm/boot/dts/kirkwood-mv88f6281gtw-ge.dts - lacks phy-mode arch/arm/boot/dts/vf610-zii-spb4.dts - lacks phy-mode arch/arm/boot/dts/vf610-zii-cfu1.dts - lacks phy-mode arch/arm/boot/dts/vf610-zii-dev-rev-c.dts - lacks phy-mode on CPU port, fixed-link on DSA ports arch/arm/boot/dts/vf610-zii-dev-rev-b.dts - lacks phy-mode on CPU port arch/arm/boot/dts/armada-381-netgear-gs110emx.dts - lacks phy-mode arch/arm/boot/dts/vf610-zii-scu4-aib.dts - lacks fixed-link on xgmii DSA ports and/or in-band-status on 2500base-x DSA ports, and phy-mode on CPU port arch/arm/boot/dts/imx6qdl-gw5904.dtsi - lacks phy-mode and fixed-link arch/arm/boot/dts/armada-385-clearfog-gtr-l8.dts - lacks phy-mode and fixed-link arch/arm/boot/dts/vf610-zii-ssmb-dtu.dts - lacks phy-mode arch/arm/boot/dts/kirkwood-dir665.dts - lacks phy-mode arch/arm/boot/dts/kirkwood-rd88f6281.dtsi - lacks phy-mode arch/arm/boot/dts/orion5x-netgear-wnr854t.dts - lacks phy-mode and fixed-link arch/arm/boot/dts/armada-388-clearfog.dts - lacks phy-mode arch/arm/boot/dts/armada-xp-linksys-mamba.dts - lacks phy-mode arch/arm/boot/dts/armada-385-linksys.dtsi - lacks phy-mode arch/arm/boot/dts/imx6q-b450v3.dts arch/arm/boot/dts/imx6q-b850v3.dts - has a phy-handle but not a phy-mode? arch/arm/boot/dts/armada-370-rd.dts - lacks phy-mode arch/arm/boot/dts/kirkwood-linksys-viper.dts - lacks phy-mode arch/arm/boot/dts/imx51-zii-rdu1.dts - lacks phy-mode arch/arm/boot/dts/imx51-zii-scu2-mezz.dts - lacks phy-mode arch/arm/boot/dts/imx6qdl-zii-rdu2.dtsi - lacks phy-mode arch/arm/boot/dts/armada-385-clearfog-gtr-s4.dts - lacks phy-mode and fixed-link Verdict: opt out of strict DT bindings and into workarounds. ocelot ~~~~~~ compatible strings: - mscc,vsc9953-switch - felix (arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi) is a PCI device, has no compatible string 2 occurrences in mainline, both are part of SoC dtsi and complete. Verdict: opt into strict DT bindings and out of workarounds. qca8k ~~~~~ compatible strings: - qca,qca8327 - qca,qca8328 - qca,qca8334 - qca,qca8337 5 occurrences in mainline device trees, none of the descriptions are problematic. Verdict: opt into strict DT bindings and out of workarounds. realtek ~~~~~~~ compatible strings: - realtek,rtl8366rb - realtek,rtl8365mb 2 occurrences in mainline, both descriptions are fine, additionally rtl8365mb.c has a comment "The device tree firmware should also specify the link partner of the extension port - either via a fixed-link or other phy-handle." Verdict: opt into strict DT bindings and out of workarounds. rzn1_a5psw ~~~~~~~~~~ compatible strings: - renesas,rzn1-a5psw One single occurrence, part of SoC dtsi (arch/arm/boot/dts/r9a06g032.dtsi), description is fine. Verdict: opt into strict DT bindings and out of workarounds. sja1105 ~~~~~~~ Driver already validates its port OF nodes in sja1105_parse_ports_node(). Verdict: opt into strict DT bindings and out of workarounds. vsc73xx ~~~~~~~ compatible strings: - vitesse,vsc7385 - vitesse,vsc7388 - vitesse,vsc7395 - vitesse,vsc7398 2 occurrences in mainline device trees, both descriptions are fine. Verdict: opt into strict DT bindings and out of workarounds. xrs700x ~~~~~~~ compatible strings: - arrow,xrs7003e - arrow,xrs7003f - arrow,xrs7004e - arrow,xrs7004f no occurrences in mainline, we don't know. Verdict: opt out of strict DT bindings and into workarounds. Because there is a pattern where newly added switches reuse existing drivers more often than introducing new ones, I've opted for deciding who gets to opt into the workaround based on an OF compatible match table in the DSA core. The alternative would have been to add another boolean property to struct dsa_switch, like configure_vlan_while_not_filtering. But this avoids situations where sometimes driver maintainers obfuscate what goes on by sharing a common probing function, and therefore making new switches inherit old quirks. Side note, we also warn about missing properties for drivers that rely on the workaround. This isn't an indication that we'll break compatibility with those DT blobs any time soon, but is rather done to raise awareness about the change, for future DT blob authors. Cc: Rob Herring <robh+dt@kernel.org> Cc: Frank Rowand <frowand.list@gmail.com> Acked-by: Alvin Šipraga <alsi@bang-olufsen.dk> # realtek Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-22 17:45:47 -07:00
Vladimir Oltean	770375ff33	net: dsa: rename dsa_port_link_{,un}register_of There is a subset of functions that applies only to shared (DSA and CPU) ports, yet this is difficult to comprehend by looking at their code alone. These are dsa_port_link_register_of(), dsa_port_link_unregister_of(), and the functions that only these 2 call. Rename this class of functions to dsa_shared_port_* to make this fact more evident, even if this goes against the apparent convention that function names in port.c must start with dsa_port_. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-22 17:45:47 -07:00
Vladimir Oltean	211987f3ac	net: dsa: don't warn in dsa_port_set_state_now() when driver doesn't support it ds->ops->port_stp_state_set() is, like most DSA methods, optional, and if absent, the port is supposed to remain in the forwarding state (as standalone). Such is the case with the mv88e6060 driver, which does not offload the bridge layer. DSA warns that the STP state can't be changed to FORWARDING as part of dsa_port_enable_rt(), when in fact it should not. The error message is also not up to modern standards, so take the opportunity to make it more descriptive. Fixes: `fd36454131` ("net: dsa: change scope of STP state setter") Reported-by: Sergei Antonov <saproj@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Sergei Antonov <saproj@gmail.com> Link: https://lore.kernel.org/r/20220816201445.1809483-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-08-17 21:58:22 -07:00
Vladimir Oltean	1699b4d502	net: dsa: fix NULL pointer dereference in dsa_port_reset_vlan_filtering The "ds" iterator variable used in dsa_port_reset_vlan_filtering() -> dsa_switch_for_each_port() overwrites the "dp" received as argument, which is later used to call dsa_port_vlan_filtering() proper. As a result, switches which do enter that code path (the ones with vlan_filtering_is_global=true) will dereference an invalid dp in dsa_port_reset_vlan_filtering() after leaving a VLAN-aware bridge. Use a dedicated "other_dp" iterator variable to avoid this from happening. Fixes: `d0004a020b` ("net: dsa: remove the "dsa_to_port in a loop" antipattern from the core") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-07-18 20:14:23 -07:00
Vladimir Oltean	4db2a5ef4c	net: dsa: fix dsa_port_vlan_filtering when global The blamed refactoring commit changed a "port" iterator with "other_dp", but still looked at the slave_dev of the dp outside the loop, instead of other_dp->slave from the loop. As a result, dsa_port_vlan_filtering() would not call dsa_slave_manage_vlan_filtering() except for the port in cause, and not for all switch ports as expected. Fixes: `d0004a020b` ("net: dsa: remove the "dsa_to_port in a loop" antipattern from the core") Reported-by: Lucian Banu <Lucian.Banu@westermo.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-07-18 20:14:23 -07:00
Vladimir Oltean	72c3b0c735	net: dsa: felix: manage host flooding using a specific driver callback At the time - commit `7569459a52` ("net: dsa: manage flooding on the CPU ports") - not introducing a dedicated switch callback for host flooding made sense, because for the only user, the felix driver, there was nothing different to do for the CPU port than set the flood flags on the CPU port just like on any other bridge port. There are 2 reasons why this approach is not good enough, however. (1) Other drivers, like sja1105, support configuring flooding as a function of {ingress port, egress port}, whereas the DSA ->port_bridge_flags() function only operates on an egress port. So with that driver we'd have useless host flooding from user ports which don't need it. (2) Even with the felix driver, support for multiple CPU ports makes it difficult to piggyback on ->port_bridge_flags(). The way in which the felix driver is going to support host-filtered addresses with multiple CPU ports is that it will direct these addresses towards both CPU ports (in a sort of multicast fashion), then restrict the forwarding to only one of the two using the forwarding masks. Consequently, flooding will also be enabled towards both CPU ports. However, ->port_bridge_flags() gets passed the index of a single CPU port, and that leaves the flood settings out of sync between the 2 CPU ports. This is to say, it's better to have a specific driver method for host flooding, which takes the user port as argument. This solves problem (1) by allowing the driver to do different things for different user ports, and problem (2) by abstracting the operation and letting the driver do whatever, rather than explicitly making the DSA core point to the CPU port it thinks needs to be touched. This new method also creates a problem, which is that cross-chip setups are not handled. However I don't have hardware right now where I can test what is the proper thing to do, and there isn't hardware compatible with multi-switch trees that supports host flooding. So it remains a problem to be tackled in the future. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-12 16:38:55 -07:00
Jakub Kicinski	9b19e57a3c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net No conflicts. Build issue in drivers/net/ethernet/sfc/ptp.c `54fccfdd7c` ("sfc: efx_default_channel_type APIs can be static") `49e6123c65` ("net: sfc: fix memory leak due to ptp channel") https://lore.kernel.org/all/20220510130556.52598fe2@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-12 16:15:30 -07:00
Vladimir Oltean	630fd4822a	net: dsa: flush switchdev workqueue on bridge join error path There is a race between switchdev_bridge_port_offload() and the dsa_port_switchdev_sync_attrs() call right below it. When switchdev_bridge_port_offload() finishes, FDB entries have been replayed by the bridge, but are scheduled for deferred execution later. However dsa_port_switchdev_sync_attrs -> dsa_port_can_apply_vlan_filtering() may impose restrictions on the vlan_filtering attribute and refuse offloading. When this happens, the delayed FDB entries will dereference dp->bridge, which is a NULL pointer because we have stopped the process of offloading this bridge. Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Workqueue: dsa_ordered dsa_slave_switchdev_event_work pc : dsa_port_bridge_host_fdb_del+0x64/0x100 lr : dsa_slave_switchdev_event_work+0x130/0x1bc Call trace: dsa_port_bridge_host_fdb_del+0x64/0x100 dsa_slave_switchdev_event_work+0x130/0x1bc process_one_work+0x294/0x670 worker_thread+0x80/0x460 ---[ end trace 0000000000000000 ]--- Error: dsa_core: Must first remove VLAN uppers having VIDs also present in bridge. Fix the bug by doing what we do on the normal bridge leave path as well, which is to wait until the deferred FDB entries complete executing, then exit. The placement of dsa_flush_workqueue() after switchdev_bridge_port_unoffload() guarantees that both the FDB additions and deletions on rollback are waited for. Fixes: `d7d0d423db` ("net: dsa: flush switchdev workqueue when leaving the bridge") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220507134550.1849834-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-09 18:08:04 -07:00
Jakub Kicinski	0e55546b18	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net include/linux/netdevice.h net/core/dev.c `6510ea973d` ("net: Use this_cpu_inc() to increment net->core_stats") `794c24e992` ("net-core: rx_otherhost_dropped to core_stats") https://lore.kernel.org/all/20220428111903.5f4304e0@canb.auug.org.au/ drivers/net/wan/cosa.c `d48fea8401` ("net: cosa: fix error check return value of register_chrdev()") `89fbca3307` ("net: wan: remove support for COSA and SRP synchronous serial boards") https://lore.kernel.org/all/20220428112130.1f689e5e@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-04-28 13:02:01 -07:00
Miaoqian Lin	fc06b2867f	net: dsa: Add missing of_node_put() in dsa_port_link_register_of The device_node pointer is returned by of_parse_phandle() with refcount incremented. We should use of_node_put() on it when done. of_node_put() will check for NULL value. Fixes: `a20f997010` ("net: dsa: Don't instantiate phylink for CPU/DSA ports unless needed") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-22 13:06:58 +01:00
Vladimir Oltean	be6ff9665d	net: dsa: don't emit targeted cross-chip notifiers for MTU change A cross-chip notifier with "targeted_match=true" is one that matches only the local port of the switch that emitted it. In other words, passing through the cross-chip notifier layer serves no purpose. Eliminate this concept by calling directly ds->ops->port_change_mtu instead of emitting a targeted cross-chip notifier. This leaves the DSA_NOTIFIER_MTU event being emitted only for MTU updates on the CPU port, which need to be reflected also across all DSA links. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-20 10:34:34 +01:00
Vladimir Oltean	726816a129	net: dsa: make cross-chip notifiers more efficient for host events To determine whether a given port should react to the port targeted by the notifier, dsa_port_host_vlan_match() and dsa_port_host_address_match() look at the positioning of the switch port currently executing the notifier relative to the switch port for which the notifier was emitted. To maintain stylistic compatibility with the other match functions from switch.c, the host address and host VLAN match functions take the notifier information about targeted port, switch and tree indices as argument. However, these functions only use that information to retrieve the struct dsa_port *targeted_dp, which is an invariant for the outer loop that calls them. So it makes more sense to calculate the targeted dp only once, and pass it to them as argument. But furthermore, the targeted dp is actually known at the time the call to dsa_port_notify() is made. It is just that we decide to only save the indices of the port, switch and tree in the notifier structure, just to retrace our steps and find the dp again using dsa_switch_find() and dsa_to_port(). But both the above functions are relatively expensive, since they need to iterate through lists. It appears more straightforward to make all notifiers just pass the targeted dp inside their info structure, and have the code that needs the indices to look at info->dp->index instead of info->port, or info->dp->ds->index instead of info->sw_index, or info->dp->ds->dst->index instead of info->tree_index. For the sake of consistency, all cross-chip notifiers are converted to pass the "dp" directly. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-20 10:34:34 +01:00
Vladimir Oltean	8e9e678e47	net: dsa: move reset of VLAN filtering to dsa_port_switchdev_unsync_attrs In dsa_port_switchdev_unsync_attrs() there is a comment that resetting the VLAN filtering isn't done where it is expected. And since commit `108dc8741c` ("net: dsa: Avoid cross-chip syncing of VLAN filtering"), there is no reason to handle this in switch.c either. Therefore, move the logic to port.c, and adapt it slightly to the data structures and naming conventions from there. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-20 10:34:34 +01:00
Tobias Waldekranz	7414af30b7	net: dsa: Handle MST state changes Add the usual trampoline functionality from the generic DSA layer down to the drivers for MST state changes. When a state changes to disabled/blocking/listening, make sure to fast age any dynamic entries in the affected VLANs (those controlled by the MSTI in question). Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-03-17 16:49:59 -07:00
Tobias Waldekranz	8e6598a7b0	net: dsa: Pass VLAN MSTI migration notifications to driver Add the usual trampoline functionality from the generic DSA layer down to the drivers for VLAN MSTI migrations. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-03-17 16:49:59 -07:00
Tobias Waldekranz	332afc4c8c	net: dsa: Validate hardware support for MST When joining a bridge where MST is enabled, we validate that the proper offloading support is in place, otherwise we fallback to software bridging. When then mode is changed on a bridge in which we are members, we refuse the change if offloading is not supported. At the moment we only check for configurable learning, but this will be further restricted as we support more MST related switchdev events. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-03-17 16:49:58 -07:00
Vladimir Oltean	5e8a1e03aa	net: dsa: install secondary unicast and multicast addresses as host FDB/MDB In preparation of disabling flooding towards the CPU in standalone ports mode, identify the addresses requested by upper interfaces and use the new API for DSA FDB isolation to request the hardware driver to offload these as FDB or MDB objects. The objects belong to the user port's database, and are installed pointing towards the CPU port. Because dev_uc_add()/dev_mc_add() is VLAN-unaware, we offload to the port standalone database addresses with VID 0 (also VLAN-unaware). So this excludes switches with global VLAN filtering from supporting unicast filtering, because there, it is possible for a port of a switch to join a VLAN-aware bridge, and this changes the VLAN awareness of standalone ports, requiring VLAN-aware standalone host FDB entries. For the same reason, hellcreek, which requires VLAN awareness in standalone mode, is also exempted from unicast filtering. We create "standalone" variants of dsa_port_host_fdb_add() and dsa_port_host_mdb_add() (and the _del coresponding functions). We also create a separate work item type for handling deferred standalone host FDB/MDB entries compared to the switchdev one. This is done for the purpose of clarity - the procedure for offloading a bridge FDB entry is different than offloading a standalone one, and the switchdev event work handles only FDBs anyway, not MDBs. Deferral is needed for standalone entries because ndo_set_rx_mode runs in atomic context. We could probably optimize things a little by first queuing up all entries that need to be offloaded, and scheduling the work item just once, but the data structures that we can pass through __dev_uc_sync() and __dev_mc_sync() are limiting (there is nothing like a void *priv), so we'd have to keep the list of queued events somewhere in struct dsa_switch, and possibly a lock for it. Too complicated for now. Adding the address to the master is handled by dev_uc_sync(), adding it to the hardware is handled by __dev_uc_sync(). So this is the reason why dsa_port_standalone_host_fdb_add() does not call dev_uc_add(). Not that it had the rtnl_mutex anyway - ndo_set_rx_mode has it, but is atomic. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-03-03 14:15:31 +00:00
Vladimir Oltean	68d6d71eaf	net: dsa: rename the host FDB and MDB methods to contain the "bridge" namespace We are preparing to add API in port.c that adds FDB and MDB entries that correspond to the port's standalone database. Rename the existing methods to make it clear that the FDB and MDB entries offloaded come from the bridge database. Since the function names lengthen in dsa_slave_switchdev_event_work(), we place "addr" and "vid" in temporary variables, to shorten those. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-03-03 14:15:31 +00:00
Vladimir Oltean	06b9cce426	net: dsa: pass extack to .port_bridge_join driver methods As FDB isolation cannot be enforced between VLAN-aware bridges in lack of hardware assistance like extra FID bits, it seems plausible that many DSA switches cannot do it. Therefore, they need to reject configurations with multiple VLAN-aware bridges from the two code paths that can transition towards that state: - joining a VLAN-aware bridge - toggling VLAN awareness on an existing bridge The .port_vlan_filtering method already propagates the netlink extack to the driver, let's propagate it from .port_bridge_join too, to make sure that the driver can use the same function for both. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:14 +00:00
Vladimir Oltean	c26933639b	net: dsa: request drivers to perform FDB isolation For DSA, to encourage drivers to perform FDB isolation simply means to track which bridge does each FDB and MDB entry belong to. It then becomes the driver responsibility to use something that makes the FDB entry from one bridge not match the FDB lookup of ports from other bridges. The top-level functions where the bridge is determined are: - dsa_port_fdb_{add,del} - dsa_port_host_fdb_{add,del} - dsa_port_mdb_{add,del} - dsa_port_host_mdb_{add,del} aka the pre-crosschip-notifier functions. Changing the API to pass a reference to a bridge is not superfluous, and looking at the passed bridge argument is not the same as having the driver look at dsa_to_port(ds, port)->bridge from the ->port_fdb_add() method. DSA installs FDB and MDB entries on shared (CPU and DSA) ports as well, and those do not have any dp->bridge information to retrieve, because they are not in any bridge - they are merely the pipes that serve the user ports that are in one or multiple bridges. The struct dsa_bridge associated with each FDB/MDB entry is encapsulated in a larger "struct dsa_db" database. Although only databases associated to bridges are notified for now, this API will be the starting point for implementing IFF_UNICAST_FLT in DSA. There, the idea is to install FDB entries on the CPU port which belong to the corresponding user port's port database. These are supposed to match only when the port is standalone. It is better to introduce the API in its expected final form than to introduce it for bridges first, then to have to change drivers which may have made one or more assumptions. Drivers can use the provided bridge.num, but they can also use a different numbering scheme that is more convenient. DSA must perform refcounting on the CPU and DSA ports by also taking into account the bridge number. So if two bridges request the same local address, DSA must notify the driver twice, once for each bridge. In fact, if the driver supports FDB isolation, DSA must perform refcounting per bridge, but if the driver doesn't, DSA must refcount host addresses across all bridges, otherwise it would be telling the driver to delete an FDB entry for a bridge and the driver would delete it for all bridges. So introduce a bool fdb_isolation in drivers which would make all bridge databases passed to the cross-chip notifier have the same number (0). This makes dsa_mac_addr_find() -> dsa_db_equal() say that all bridge databases are the same database - which is essentially the legacy behavior. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:14 +00:00
Vladimir Oltean	e212fa7c54	net: dsa: support FDB events on offloaded LAG interfaces This change introduces support for installing static FDB entries towards a bridge port that is a LAG of multiple DSA switch ports, as well as support for filtering towards the CPU local FDB entries emitted for LAG interfaces that are bridge ports. Conceptually, host addresses on LAG ports are identical to what we do for plain bridge ports. Whereas FDB entries _towards_ a LAG can't simply be replicated towards all member ports like we do for multicast, or VLAN. Instead we need new driver API. Hardware usually considers a LAG to be a "logical port", and sets the entire LAG as the forwarding destination. The physical egress port selection within the LAG is made by hashing policy, as usual. To represent the logical port corresponding to the LAG, we pass by value a copy of the dsa_lag structure to all switches in the tree that have at least one port in that LAG. To illustrate why a refcounted list of FDB entries is needed in struct dsa_lag, it is enough to say that: - a LAG may be a bridge port and may therefore receive FDB events even while it isn't yet offloaded by any DSA interface - DSA interfaces may be removed from a LAG while that is a bridge port; we don't want FDB entries lingering around, but we don't want to remove entries that are still in use, either For all the cases below to work, the idea is to always keep an FDB entry on a LAG with a reference count equal to the DSA member ports. So: - if a port joins a LAG, it requests the bridge to replay the FDB, and the FDB entries get created, or their refcount gets bumped by one - if a port leaves a LAG, the FDB replay deletes or decrements refcount by one - if an FDB is installed towards a LAG with ports already present, that entry is created (if it doesn't exist) and its refcount is bumped by the amount of ports already present in the LAG echo "Adding FDB entry to bond with existing ports" ip link del bond0 ip link add bond0 type bond mode 802.3ad ip link set swp1 down && ip link set swp1 master bond0 && ip link set swp1 up ip link set swp2 down && ip link set swp2 master bond0 && ip link set swp2 up ip link del br0 ip link add br0 type bridge ip link set bond0 master br0 bridge fdb add dev bond0 00:01:02:03:04:05 master static ip link del br0 ip link del bond0 echo "Adding FDB entry to empty bond" ip link del bond0 ip link add bond0 type bond mode 802.3ad ip link del br0 ip link add br0 type bridge ip link set bond0 master br0 bridge fdb add dev bond0 00:01:02:03:04:05 master static ip link set swp1 down && ip link set swp1 master bond0 && ip link set swp1 up ip link set swp2 down && ip link set swp2 master bond0 && ip link set swp2 up ip link del br0 ip link del bond0 echo "Adding FDB entry to empty bond, then removing ports one by one" ip link del bond0 ip link add bond0 type bond mode 802.3ad ip link del br0 ip link add br0 type bridge ip link set bond0 master br0 bridge fdb add dev bond0 00:01:02:03:04:05 master static ip link set swp1 down && ip link set swp1 master bond0 && ip link set swp1 up ip link set swp2 down && ip link set swp2 master bond0 && ip link set swp2 up ip link set swp1 nomaster ip link set swp2 nomaster ip link del br0 ip link del bond0 Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-24 21:31:44 -08:00
Vladimir Oltean	dedd6a009f	net: dsa: create a dsa_lag structure The main purpose of this change is to create a data structure for a LAG as seen by DSA. This is similar to what we have for bridging - we pass a copy of this structure by value to ->port_lag_join and ->port_lag_leave. For now we keep the lag_dev, id and a reference count in it. Future patches will add a list of FDB entries for the LAG (these also need to be refcounted to work properly). The LAG structure is created using dsa_port_lag_create() and destroyed using dsa_port_lag_destroy(), just like we have for bridging. Because now, the dsa_lag itself is refcounted, we can simplify dsa_lag_map() and dsa_lag_unmap(). These functions need to keep a LAG in the dst->lags array only as long as at least one port uses it. The refcounting logic inside those functions can be removed now - they are called only when we should perform the operation. dsa_lag_dev() is renamed to dsa_lag_by_id() and now returns the dsa_lag structure instead of the lag_dev net_device. dsa_lag_foreach_port() now takes the dsa_lag structure as argument. dst->lags holds an array of dsa_lag structures. dsa_lag_map() now also saves the dsa_lag->id value, so that linear walking of dst->lags in drivers using dsa_lag_id() is no longer necessary. They can just look at lag.id. dsa_port_lag_id_get() is a helper, similar to dsa_port_bridge_num_get(), which can be used by drivers to get the LAG ID assigned by DSA to a given port. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-24 21:31:43 -08:00
Vladimir Oltean	46a76724e4	net: dsa: rename references to "lag" as "lag_dev" In preparation of converting struct net_device dp->lag_dev into a struct dsa_lag dp->lag, we need to rename, for consistency purposes, all occurrences of the "lag" variable in the DSA core to "lag_dev". Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-24 21:31:42 -08:00
Jakub Kicinski	aaa25a2fa7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net tools/testing/selftests/net/mptcp/mptcp_join.sh `34aa6e3bcc` ("selftests: mptcp: add ip mptcp wrappers") `857898eb4b` ("selftests: mptcp: add missing join check") `6ef84b1517` ("selftests: mptcp: more robust signal race test") https://lore.kernel.org/all/20220221131842.468893-1-broonie@kernel.org/ drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/act.h drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/ct.c `fb7e76ea3f` ("net/mlx5e: TC, Skip redundant ct clear actions") `c63741b426` ("net/mlx5e: Fix MPLSoUDP encap to use MPLS action information") `09bf979232` ("net/mlx5e: TC, Move pedit_headers_action to parse_attr") `84ba8062e3` ("net/mlx5e: Test CT and SAMPLE on flow attr") `efe6f961cd` ("net/mlx5e: CT, Don't set flow flag CT for ct clear flow") `3b49a7edec` ("net/mlx5e: TC, Reject rules with multiple CT actions") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-24 17:54:25 -08:00
Hans Schultz	b9e8b58fd2	net: dsa: Include BR_PORT_LOCKED in the list of synced brport flags Ensures that the DSA switch driver gets notified of changes to the BR_PORT_LOCKED flag as well, for the case when a DSA port joins or leaves a LAG that is a bridge port. Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-23 12:52:34 +00:00
Alvin Šipraga	342b641919	net: dsa: fix panic when removing unoffloaded port from bridge If a bridged port is not offloaded to the hardware - either because the underlying driver does not implement the port_bridge_{join,leave} ops, or because the operation failed - then its dp->bridge pointer will be NULL when dsa_port_bridge_leave() is called. Avoid dereferncing NULL. This fixes the following splat when removing a port from a bridge: Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000000000000000 Internal error: Oops: 96000004 [#1] PREEMPT_RT SMP CPU: 3 PID: 1119 Comm: brctl Tainted: G O 5.17.0-rc4-rt4 #1 Call trace: dsa_port_bridge_leave+0x8c/0x1e4 dsa_slave_changeupper+0x40/0x170 dsa_slave_netdevice_event+0x494/0x4d4 notifier_call_chain+0x80/0xe0 raw_notifier_call_chain+0x1c/0x24 call_netdevice_notifiers_info+0x5c/0xac __netdev_upper_dev_unlink+0xa4/0x200 netdev_upper_dev_unlink+0x38/0x60 del_nbp+0x1b0/0x300 br_del_if+0x38/0x114 add_del_if+0x60/0xa0 br_ioctl_stub+0x128/0x2dc br_ioctl_call+0x68/0xb0 dev_ifsioc+0x390/0x554 dev_ioctl+0x128/0x400 sock_do_ioctl+0xb4/0xf4 sock_ioctl+0x12c/0x4e0 __arm64_sys_ioctl+0xa8/0xf0 invoke_syscall+0x4c/0x110 el0_svc_common.constprop.0+0x48/0xf0 do_el0_svc+0x28/0x84 el0_svc+0x1c/0x50 el0t_64_sync_handler+0xa8/0xb0 el0t_64_sync+0x17c/0x180 Code: f9402f00 f0002261 f9401302 913cc021 (a9401404) ---[ end trace 0000000000000000 ]--- Fixes: `d3eed0e57d` ("net: dsa: keep the bridge_dev and bridge_num as part of the same structure") Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220221203539.310690-1-alvin@pqrs.dk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-22 17:03:02 -08:00
Russell King (Oracle)	1054457006	net: phy: phylink: fix DSA mac_select_pcs() introduction Vladimir Oltean reports that probing on DSA drivers that aren't yet populating supported_interfaces now fails. Fix this by allowing phylink to detect whether DSA actually provides an underlying mac_select_pcs() implementation. Reported-by: Vladimir Oltean <olteanv@gmail.com> Fixes: `bde018222c` ("net: dsa: add support for phylink mac_select_pcs()") Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Tested-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/E1nMCD6-00A0wC-FG@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-22 16:57:08 -08:00
Vladimir Oltean	8940e6b669	net: dsa: avoid call to __dev_set_promiscuity() while rtnl_mutex isn't held If the DSA master doesn't support IFF_UNICAST_FLT, then the following call path is possible: dsa_slave_switchdev_event_work -> dsa_port_host_fdb_add -> dev_uc_add -> __dev_set_rx_mode -> __dev_set_promiscuity Since the blamed commit, dsa_slave_switchdev_event_work() no longer holds rtnl_lock(), which triggers the ASSERT_RTNL() from __dev_set_promiscuity(). Taking rtnl_lock() around dev_uc_add() is impossible, because all the code paths that call dsa_flush_workqueue() do so from contexts where the rtnl_mutex is already held - so this would lead to an instant deadlock. dev_uc_add() in itself doesn't require the rtnl_mutex for protection. There is this comment in __dev_set_rx_mode() which assumes so: /* Unicast addresses changes may only happen under the rtnl, * therefore calling __dev_set_promiscuity here is safe. */ but it is from commit `4417da668c` ("[NET]: dev: secondary unicast address support") dated June 2007, and in the meantime, commit `f1f28aa351` ("netdev: Add addr_list_lock to struct net_device."), dated July 2008, has added &dev->addr_list_lock to protect this instead of the global rtnl_mutex. Nonetheless, __dev_set_promiscuity() does assume rtnl_mutex protection, but it is the uncommon path of what we typically expect dev_uc_add() to do. So since only the uncommon path requires rtnl_lock(), just check ahead of time whether dev_uc_add() would result into a call to __dev_set_promiscuity(), and handle that condition separately. DSA already configures the master interface to be promiscuous if the tagger requires this. We can extend this to also cover the case where the master doesn't handle dev_uc_add() (doesn't support IFF_UNICAST_FLT), and on the premise that we'd end up making it promiscuous during operation anyway, either if a DSA slave has a non-inherited MAC address, or if the bridge notifies local FDB entries for its own MAC address, the address of a station learned on a foreign port, etc. Fixes: `0faf890fc5` ("net: dsa: drop rtnl_lock from dsa_slave_switchdev_event_work") Reported-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-19 21:18:33 +00:00
Russell King (Oracle)	ccfbf44d4c	net: dsa: remove pcs_poll With drivers converted over to using phylink PCS, there is no need for the struct dsa_switch member "pcs_poll" to exist anymore - there is a flag in the struct phylink_pcs which indicates whether this PCS needs to be polled which supersedes this. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-19 16:41:50 +00:00
Russell King (Oracle)	bde018222c	net: dsa: add support for phylink mac_select_pcs() Add DSA support for the phylink mac_select_pcs() method so DSA drivers can return provide phylink with the appropriate PCS for the PHY interface mode. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-18 11:28:32 +00:00
Vladimir Oltean	d2b1d186ce	net: dsa: delete unused exported symbols for ethtool PHY stats Introduced in commit `cf96357303` ("net: dsa: Allow providing PHY statistics from CPU port"), it appears these were never used. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220216193726.2926320-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-17 20:07:09 -08:00
Vladimir Oltean	134ef2388e	net: dsa: add explicit support for host bridge VLANs Currently, DSA programs VLANs on shared (DSA and CPU) ports each time it does so on user ports. This is good for basic functionality but has several limitations: - the VLAN group which must reach the CPU may be radically different from the VLAN group that must be autonomously forwarded by the switch. In other words, the admin may want to isolate noisy stations and avoid traffic from them going to the control processor of the switch, where it would just waste useless cycles. The bridge already supports independent control of VLAN groups on bridge ports and on the bridge itself, and when VLAN-aware, it will drop packets in software anyway if their VID isn't added as a 'self' entry towards the bridge device. - Replaying host FDB entries may depend, for some drivers like mv88e6xxx, on replaying the host VLANs as well. The 2 VLAN groups are approximately the same in most regular cases, but there are corner cases when timing matters, and DSA's approximation of replicating VLANs on shared ports simply does not work. - If a user makes the bridge (implicitly the CPU port) join a VLAN by accident, there is no way for the CPU port to isolate itself from that noisy VLAN except by rebooting the system. This is because for each VLAN added on a user port, DSA will add it on shared ports too, but for each VLAN deletion on a user port, it will remain installed on shared ports, since DSA has no good indication of whether the VLAN is still in use or not. Now that the bridge driver emits well-balanced SWITCHDEV_OBJ_ID_PORT_VLAN addition and removal events, DSA has a simple and straightforward task of separating the bridge port VLANs (these have an orig_dev which is a DSA slave interface, or a LAG interface) from the host VLANs (these have an orig_dev which is a bridge interface), and to keep a simple reference count of each VID on each shared port. Forwarding VLANs must be installed on the bridge ports and on all DSA ports interconnecting them. We don't have a good view of the exact topology, so we simply install forwarding VLANs on all DSA ports, which is what has been done until now. Host VLANs must be installed primarily on the dedicated CPU port of each bridge port. More subtly, they must also be installed on upstream-facing and downstream-facing DSA ports that are connecting the bridge ports and the CPU. This ensures that the mv88e6xxx's problem (VID of host FDB entry may be absent from VTU) is still addressed even if that switch is in a cross-chip setup, and it has no local CPU port. Therefore: - user ports contain only bridge port (forwarding) VLANs, and no refcounting is necessary - DSA ports contain both forwarding and host VLANs. Refcounting is necessary among these 2 types. - CPU ports contain only host VLANs. Refcounting is also necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-16 11:21:05 +00:00
Vladimir Oltean	a68dc7b938	net: dsa: remove cross-chip support for HSR The cross-chip notifiers for HSR are bypass operations, meaning that even though all switches in a tree are notified, only the switch specified in the info structure is targeted. We can eliminate the unnecessary complexity by deleting the cross-chip notifier logic and calling the ds->ops straight from port.c. Cc: George McCollister <george.mccollister@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: George McCollister <george.mccollister@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-05 15:04:51 +00:00

1 2 3 4

179 Commits