OpenCloudOS-Kernel/net/dsa
Vladimir Oltean dd17e1e682 net: mscc: ocelot: use ocelot_xmit_get_vlan_info() also for FDMA and register injection
[ Upstream commit 67c3ca2c5cfe6a50772514e3349b5e7b3b0fac03 ]

Problem description
-------------------

On an NXP LS1028A (felix DSA driver) with the following configuration:

- ocelot-8021q tagging protocol
- VLAN-aware bridge (with STP) spanning at least swp0 and swp1
- 8021q VLAN upper interfaces on swp0 and swp1: swp0.700, swp1.700
- ptp4l on swp0.700 and swp1.700

we see that the ptp4l instances do not see each other's traffic,
and they all go to the grand master state due to the
ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES condition.

Jumping to the conclusion for the impatient
-------------------------------------------

There is a zero-day bug in the ocelot switchdev driver in the way it
handles VLAN-tagged packet injection. The correct logic already exists in
the source code, in function ocelot_xmit_get_vlan_info() added by commit
5ca721c54d ("net: dsa: tag_ocelot: set the classified VLAN during xmit").
But it is used only for normal NPI-based injection with the DSA "ocelot"
tagging protocol. The other injection code paths (register-based and
FDMA-based) roll their own wrong logic. This affects and was noticed on
the DSA "ocelot-8021q" protocol because it uses register-based injection.

By moving ocelot_xmit_get_vlan_info() to a place that's common for both
the DSA tagger and the ocelot switch library, it can also be called from
ocelot_port_inject_frame() in ocelot.c.

We need to touch the lines with ocelot_ifh_port_set()'s prototype
anyway, so let's rename it to something clearer regarding what it does,
and add a kernel-doc. ocelot_ifh_set_basic() should do.

Investigation notes
-------------------

Debugging reveals that PTP event frames (i.e. those carrying timestamps,
like Sync) injected into swp0.700 (but also swp1.700) hit the wire
with two VLAN tags:

00000000: 01 1b 19 00 00 00 00 01 02 03 04 05 81 00 02 bc
                                              ~~~~~~~~~~~
00000010: 81 00 02 bc 88 f7 00 12 00 2c 00 00 02 00 00 00
          ~~~~~~~~~~~
00000020: 00 00 00 00 00 00 00 00 00 00 00 01 02 ff fe 03
00000030: 04 05 00 01 00 04 00 00 00 00 00 00 00 00 00 00
00000040: 00 00

The second (unexpected) VLAN tag makes felix_check_xtr_pkt() ->
ptp_classify_raw() fail to see these as PTP packets at the link
partner's receiving end, and return PTP_CLASS_NONE (because the BPF
classifier is not written to expect 2 VLAN tags).
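
To make the failure mode concrete, here is a minimal userspace sketch of a
classifier that tolerates at most one VLAN tag, run against the first bytes
of the dump above. It is not the kernel's BPF classifier: the model_*()
names and the double_tagged[] array are hypothetical, and only the
EtherType constants (0x8100, 0x88F7) are taken from the standards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_P_8021Q	0x8100	/* 802.1Q VLAN tag TPID */
#define ETH_P_1588	0x88F7	/* PTP over Ethernet */

/* First 24 bytes of the frame from the dump above */
const uint8_t double_tagged[] = {
	0x01, 0x1b, 0x19, 0x00, 0x00, 0x00, 0x00, 0x01,
	0x02, 0x03, 0x04, 0x05, 0x81, 0x00, 0x02, 0xbc,
	0x81, 0x00, 0x02, 0xbc, 0x88, 0xf7, 0x00, 0x12,
};

static uint16_t read_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Hypothetical model: count consecutive 802.1Q tags after the MAC addresses */
int model_count_vlan_tags(const uint8_t *frame, size_t len)
{
	size_t off = 12;
	int n = 0;

	assert(frame);
	while (len >= off + 4 && read_be16(frame + off) == ETH_P_8021Q) {
		n++;
		off += 4;
	}
	return n;
}

/*
 * Hypothetical model of a classifier that skips at most one VLAN tag.
 * Returns 1 if the frame looks like PTP over L2, 0 otherwise.
 */
int model_is_l2_ptp(const uint8_t *frame, size_t len)
{
	size_t off = 12;

	assert(frame);
	if (len < off + 2)
		return 0;
	if (read_be16(frame + off) == ETH_P_8021Q) {
		off += 4;	/* skip exactly one tag, never a second one */
		if (len < off + 2)
			return 0;
	}
	return read_be16(frame + off) == ETH_P_1588;
}
```

With the captured bytes, the model finds two tags and does not recognize
the frame as PTP, whereas the same frame with a single tag is accepted.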

Packets have 2 VLAN tags because the transmission code treats VLAN
incorrectly.

Neither ocelot switchdev, nor felix DSA, declare the NETIF_F_HW_VLAN_CTAG_TX
feature. Therefore, at xmit time, all VLANs should be in the skb head,
and none should be in the hwaccel area. This is done by:

static struct sk_buff *validate_xmit_vlan(struct sk_buff *skb,
					  netdev_features_t features)
{
	if (skb_vlan_tag_present(skb) &&
	    !vlan_hw_offload_capable(features, skb->vlan_proto))
		skb = __vlan_hwaccel_push_inside(skb);
	return skb;
}

But ocelot_port_inject_frame() handles things incorrectly:

	ocelot_ifh_port_set(ifh, port, rew_op, skb_vlan_tag_get(skb));

void ocelot_ifh_port_set(void *ifh, int port, u32 rew_op, u32 vlan_tag)
{
	(...)
	if (vlan_tag)
		ocelot_ifh_set_vlan_tci(ifh, vlan_tag);
	(...)
}

After __vlan_hwaccel_push_inside() pushes the tag inside the skb head,
it clears the hwaccel tag by calling:

static inline void __vlan_hwaccel_clear_tag(struct sk_buff *skb)
{
	skb->vlan_present = 0;
}

which does _not_ zero out skb->vlan_tci as seen by skb_vlan_tag_get().
This means that ocelot, when it calls skb_vlan_tag_get(), sees
(and uses) a residual skb->vlan_tci, while the same VLAN tag is
_already_ in the skb head.
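
The residual value can be reproduced with a small userspace model. This is
only a sketch: struct model_skb and the model_*() helpers are hypothetical
stand-ins for the two sk_buff fields and the accessors quoted above, not
kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two sk_buff fields involved */
struct model_skb {
	uint16_t vlan_tci;
	unsigned int vlan_present : 1;
};

/* Models the quoted __vlan_hwaccel_clear_tag(): only the flag is cleared */
void model_hwaccel_clear_tag(struct model_skb *skb)
{
	assert(skb);
	skb->vlan_present = 0;
}

/* Models skb_vlan_tag_present() */
int model_vlan_tag_present(const struct model_skb *skb)
{
	return skb->vlan_present;
}

/* Models skb_vlan_tag_get(): an unconditional read of vlan_tci */
uint16_t model_vlan_tag_get(const struct model_skb *skb)
{
	return skb->vlan_tci;
}

/* Old injection logic: trusts the TCI without checking the flag */
uint16_t model_ifh_tci_old(const struct model_skb *skb)
{
	return model_vlan_tag_get(skb);
}

/* Flag-checking variant: reads the TCI only if the tag is marked present */
uint16_t model_ifh_tci_checked(const struct model_skb *skb)
{
	return model_vlan_tag_present(skb) ? model_vlan_tag_get(skb) : 0;
}
```

Starting from vlan_tci = 700 and vlan_present = 1, calling
model_hwaccel_clear_tag() leaves model_ifh_tci_old() returning 700, while
model_ifh_tci_checked() returns 0.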

The trivial fix for double VLAN headers is to replace the content of
ocelot_ifh_port_set() with:

	if (skb_vlan_tag_present(skb))
		ocelot_ifh_set_vlan_tci(ifh, skb_vlan_tag_get(skb));

but this would not be correct either, because, as mentioned,
vlan_hw_offload_capable() is false for us, so we'd be inserting dead
code and we'd always transmit packets with VID=0 in the injection frame
header.

I can't actually test the ocelot switchdev driver and rely exclusively
on code inspection, but I don't think traffic from 8021q uppers has ever
been injected properly, without being double-tagged. Thus I'm blaming
the introduction of VLAN fields in the injection header, which dates
back to early driver code.

As hinted at in the early conclusion, what we _want_ to happen for
VLAN transmission was already described once in commit 5ca721c54d
("net: dsa: tag_ocelot: set the classified VLAN during xmit").

ocelot_xmit_get_vlan_info() intends to ensure that if the port through
which we're transmitting is under a VLAN-aware bridge, the outer VLAN
tag from the skb head is stripped from there and inserted into the
injection frame header (so that the packet is processed in hardware
through that actual VLAN). And in all other cases, the packet is sent
with VID=0 in the injection frame header, since the port is VLAN-unaware
and has logic to strip this VID on egress (making it invisible to the
wire).
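
The intent described above can be sketched in userspace as follows. This is
not the body of ocelot_xmit_get_vlan_info(): the function name, its
signature and the boolean vlan_aware argument are hypothetical, and the
sketch assumes an 802.1Q (0x8100) outer tag. It models only the two cases
named in the text and leaves the TCI at 0 for everything else, including
untagged frames on a VLAN-aware port.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN	6	/* bytes per MAC address */
#define VLAN_HLEN	4	/* bytes per VLAN tag (TPID + TCI) */
#define ETH_P_8021Q	0x8100

/*
 * Hypothetical model: decide the TCI for the injection frame header.
 * If the port is VLAN-aware and the frame carries an outer 802.1Q tag,
 * strip that tag from the frame in place and report its TCI.
 * Otherwise leave the frame alone and report TCI 0.
 * Returns the (possibly shortened) frame length.
 */
size_t model_xmit_get_vlan_info(uint8_t *frame, size_t len, int vlan_aware,
				uint16_t *ifh_tci)
{
	const size_t tag_off = 2 * ETH_ALEN;

	assert(frame && ifh_tci);
	*ifh_tci = 0;

	if (!vlan_aware || len < tag_off + VLAN_HLEN)
		return len;

	if (((frame[tag_off] << 8) | frame[tag_off + 1]) != ETH_P_8021Q)
		return len;

	*ifh_tci = (uint16_t)((frame[tag_off + 2] << 8) | frame[tag_off + 3]);

	/* Close the 4-byte gap so the frame carries one tag fewer */
	memmove(frame + tag_off, frame + tag_off + VLAN_HLEN,
		len - tag_off - VLAN_HLEN);

	return len - VLAN_HLEN;
}
```

For a frame tagged with VID 700 (TCI bytes 02 bc), the VLAN-aware case
reports TCI 700 and shortens the frame by 4 bytes, so the tag travels in
the injection header instead of appearing twice on the wire. The
VLAN-unaware case leaves the frame untouched and reports 0.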

Fixes: 08d02364b1 ("net: mscc: fix the injection header")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-29 17:33:45 +02:00
Kconfig net: dsa: modularize DSA_TAG_PROTO_NONE 2022-11-22 20:41:45 -08:00
Makefile net: dsa: add trace points for FDB/MDB operations 2023-04-12 08:36:07 +01:00
devlink.c net: dsa: move rest of devlink setup/teardown to devlink.c 2022-11-22 20:41:47 -08:00
devlink.h net: dsa: move rest of devlink setup/teardown to devlink.c 2022-11-22 20:41:47 -08:00
dsa.c net: dsa: avoid suspicious RCU usage for synced VLAN-aware MAC addresses 2023-06-27 09:37:41 -07:00
dsa.h net: dsa: rename dsa2.c back into dsa.c and create its header 2022-11-22 20:41:53 -08:00
master.c net: dsa: replace NETDEV_PRE_CHANGE_HWTSTAMP notifier with a stub 2023-04-09 15:35:49 +01:00
master.h net: dsa: replace NETDEV_PRE_CHANGE_HWTSTAMP notifier with a stub 2023-04-09 15:35:49 +01:00
netlink.c net: dsa: kill off dsa_priv.h 2022-11-22 20:41:54 -08:00
netlink.h net: dsa: kill off dsa_priv.h 2022-11-22 20:41:54 -08:00
port.c net: dsa: mark parsed interface mode for legacy switch drivers 2023-08-09 13:08:09 -07:00
port.h net: dsa: make dsa_port_supports_hwtstamp() construct a fake ifreq 2023-04-03 10:04:27 +01:00
slave.c net: dsa: remove deprecated strncpy 2023-07-23 11:45:46 +01:00
slave.h net: dsa: move headers exported by slave.c to slave.h 2022-11-22 20:41:49 -08:00
stubs.c net: dsa: replace NETDEV_PRE_CHANGE_HWTSTAMP notifier with a stub 2023-04-09 15:35:49 +01:00
switch.c net: dsa: avoid suspicious RCU usage for synced VLAN-aware MAC addresses 2023-06-27 09:37:41 -07:00
switch.h net: dsa: avoid suspicious RCU usage for synced VLAN-aware MAC addresses 2023-06-27 09:37:41 -07:00
tag.c net: dsa: report rx_bytes unadjusted for ETH_HLEN 2023-03-20 09:09:53 +00:00
tag.h net: dsa: update TX path comments to not mention skb_mac_header() 2023-04-23 14:16:45 +01:00
tag_8021q.c net: dsa: update TX path comments to not mention skb_mac_header() 2023-04-23 14:16:45 +01:00
tag_8021q.h net: dsa: move tag_8021q headers to their proper place 2022-11-22 20:41:53 -08:00
tag_ar9331.c net: dsa: move tagging protocol code to tag.{c,h} 2022-11-22 20:41:50 -08:00
tag_brcm.c net: dsa: tag_brcm: legacy: fix daisy-chained switches 2023-03-21 17:29:13 -07:00
tag_dsa.c net: dsa: move tagging protocol code to tag.{c,h} 2022-11-22 20:41:50 -08:00
tag_gswip.c net: dsa: move tagging protocol code to tag.{c,h} 2022-11-22 20:41:50 -08:00
tag_hellcreek.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-12-08 18:19:59 -08:00
tag_ksz.c net: dsa: tag_ksz: do not rely on skb_mac_header() in TX paths 2023-04-23 14:16:44 +01:00
tag_lan9303.c net: dsa: move tagging protocol code to tag.{c,h} 2022-11-22 20:41:50 -08:00
tag_mtk.c net: dsa: move tagging protocol code to tag.{c,h} 2022-11-22 20:41:50 -08:00
tag_none.c net: dsa: move tagging protocol code to tag.{c,h} 2022-11-22 20:41:50 -08:00
tag_ocelot.c net: mscc: ocelot: use ocelot_xmit_get_vlan_info() also for FDMA and register injection 2024-08-29 17:33:45 +02:00
tag_ocelot_8021q.c net: dsa: move tag_8021q headers to their proper place 2022-11-22 20:41:53 -08:00
tag_qca.c net: dsa: tag_qca: return early if dev is not found 2023-08-01 12:02:42 +02:00
tag_rtl4_a.c net: dsa: move tagging protocol code to tag.{c,h} 2022-11-22 20:41:50 -08:00
tag_rtl8_4.c net: dsa: move tagging protocol code to tag.{c,h} 2022-11-22 20:41:50 -08:00
tag_rzn1_a5psw.c net: dsa: move tagging protocol code to tag.{c,h} 2022-11-22 20:41:50 -08:00
tag_sja1105.c net: dsa: sja1105: always enable the send_meta options 2023-07-04 19:42:27 +01:00
tag_trailer.c net: dsa: move tagging protocol code to tag.{c,h} 2022-11-22 20:41:50 -08:00
tag_xrs700x.c net: dsa: move tagging protocol code to tag.{c,h} 2022-11-22 20:41:50 -08:00
trace.c net: dsa: add trace points for FDB/MDB operations 2023-04-12 08:36:07 +01:00
trace.h net: dsa: add trace points for VLAN operations 2023-04-12 08:36:07 +01:00