linux-sg2042

Commit Graph

Author	SHA1	Message	Date
YueHaibing	8aa69d3482	net: hns3: Remove unused inline function hclge_is_reset_pending() This is unused since commit `8e2288cad6` ("net: hns3: refactor PF cmdq init and uninit APIs with new common APIs"). Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Jie Wang <wangjie125@huawei.com> Link: https://lore.kernel.org/r/20220216113507.22368-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-16 18:34:48 -08:00
David S. Miller	f0ead99e62	Merge branch 'Replay-and-offload-host-VLAN-entries-in-DSA' Vladimir Oltean says: ==================== Replay and offload host VLAN entries in DSA v2->v3: - make the bridge stop notifying switchdev for !BRENTRY VLANs - create precommit and commit wrappers around __vlan_add_flags(). - special-case the BRENTRY transition from false to true, instead of treating it as a change of flags and letting drivers figure out that it really isn't. - avoid setting *changed unless we know that functions will not error out later. - drop "old_flags" from struct switchdev_obj_port_vlan, nobody needs it now, in v2 only DSA needed it to filter out BRENTRY transitions, that is now solved cleaner. - no BRIDGE_VLAN_INFO_BRENTRY flag checks and manipulations in DSA whatsoever, use the "bool changed" bit as-is after changing what it means. - merge dsa_slave_host_vlan_{add,del}() with dsa_slave_foreign_vlan_{add,del}(), since now they do the same thing, because the host_vlan functions no longer need to mangle the vlan BRENTRY flags and bool changed. 
v1->v2: - prune switchdev VLAN additions with no actual change differently - no longer need to revert struct net_bridge_vlan changes on error from switchdev - no longer need to first delete a changed VLAN before readding it - pass 'bool changed' and 'u16 old_flags' through switchdev_obj_port_vlan so that DSA can do some additional post-processing with the BRIDGE_VLAN_INFO_BRENTRY flag - support VLANs on foreign interfaces - fix the same -EOPNOTSUPP error in mv88e6xxx, this time on removal, due to VLAN deletion getting replayed earlier than FDB deletion The motivation behind these patches is that Rafael reported the following error with mv88e6xxx when the first switch port joins a bridge: mv88e6085 0x0000000008b96000:00: port 0 failed to add a6:ef:77:c8:5f:3d vid 1 to fdb: -95 (-EOPNOTSUPP) The FDB entry that's added is the MAC address of the bridge, in VID 1 (the default_pvid), being replayed as part of br_add_if() -> ... -> nbp_switchdev_sync_objs(). -EOPNOTSUPP is the mv88e6xxx driver's way of saying that VID 1 doesn't exist in the VTU, so it can't program the ATU with a FID, something which it needs. It appears to be a race, but it isn't, since we only end up installing VID 1 in the VTU by coincidence. DSA's approximation of programming VLANs on the CPU port together with the user ports breaks down with host FDB entries on mv88e6xxx, since that strictly requires the VTU to contain the VID. But the user may freely add VLANs pointing just towards the bridge, and FDB entries in those VLANs, and DSA will not be aware of them, because it only listens for VLANs on user ports. To create a solution that scales properly to cross-chip setups and doesn't leak entries behind, some changes in the bridge driver are required. I believe that these are for the better overall, but I may be wrong. 
Namely, the same refcounting procedure that DSA has in place for host FDB and MDB entries can be replicated for VLANs, except that it's garbage in, garbage out: the VLAN addition and removal notifications from switchdev aren't balanced. So the first 2 patches attempt to deal with that. This patch set has been superficially tested on a board with 3 mv88e6xxx switches in a daisy chain and appears to produce the primary desired effect - the driver no longer returns -EOPNOTSUPP when the first port joins a bridge, and is successful in performing local termination under a VLAN-aware bridge. As an additional side effect, it silences the annoying "p%d: already a member of VLAN %d\n" warning messages that the mv88e6xxx driver produces when coupled with systemd-networkd, and a few VLANs are configured. Furthermore, it advances Florian's idea from a few years back, which never got merged: https://lore.kernel.org/lkml/20180624153339.13572-1-f.fainelli@gmail.com/ v2 has also been tested on the NXP LS1028A felix switch. 
Some testing: root@debian:~# bridge vlan add dev br0 vid 101 pvid self [ 100.709220] mv88e6085 d0032004.mdio-mii:10: mv88e6xxx_port_vlan_add: port 9 vlan 101 [ 100.873426] mv88e6085 d0032004.mdio-mii:10: mv88e6xxx_port_vlan_add: port 10 vlan 101 [ 100.892314] mv88e6085 d0032004.mdio-mii:11: mv88e6xxx_port_vlan_add: port 9 vlan 101 [ 101.053392] mv88e6085 d0032004.mdio-mii:11: mv88e6xxx_port_vlan_add: port 10 vlan 101 [ 101.076994] mv88e6085 d0032004.mdio-mii:12: mv88e6xxx_port_vlan_add: port 9 vlan 101 root@debian:~# bridge vlan add dev br0 vid 101 pvid self root@debian:~# bridge vlan add dev br0 vid 101 pvid self root@debian:~# bridge vlan port vlan-id eth0 1 PVID Egress Untagged lan9 1 PVID Egress Untagged lan10 1 PVID Egress Untagged lan11 1 PVID Egress Untagged lan12 1 PVID Egress Untagged lan13 1 PVID Egress Untagged lan14 1 PVID Egress Untagged lan15 1 PVID Egress Untagged lan16 1 PVID Egress Untagged lan17 1 PVID Egress Untagged lan18 1 PVID Egress Untagged lan19 1 PVID Egress Untagged lan20 1 PVID Egress Untagged lan21 1 PVID Egress Untagged lan22 1 PVID Egress Untagged lan23 1 PVID Egress Untagged lan24 1 PVID Egress Untagged sfp 1 PVID Egress Untagged lan1 1 PVID Egress Untagged lan2 1 PVID Egress Untagged lan3 1 PVID Egress Untagged lan4 1 PVID Egress Untagged lan5 1 PVID Egress Untagged lan6 1 PVID Egress Untagged lan7 1 PVID Egress Untagged lan8 1 PVID Egress Untagged br0 1 Egress Untagged 101 PVID root@debian:~# bridge vlan del dev br0 vid 101 pvid self [ 108.340487] mv88e6085 d0032004.mdio-mii:11: mv88e6xxx_port_vlan_del: port 9 vlan 101 [ 108.379167] mv88e6085 d0032004.mdio-mii:11: mv88e6xxx_port_vlan_del: port 10 vlan 101 [ 108.402319] mv88e6085 d0032004.mdio-mii:12: mv88e6xxx_port_vlan_del: port 9 vlan 101 [ 108.425866] mv88e6085 d0032004.mdio-mii:10: mv88e6xxx_port_vlan_del: port 9 vlan 101 [ 108.452280] mv88e6085 d0032004.mdio-mii:10: mv88e6xxx_port_vlan_del: port 10 vlan 101 root@debian:~# bridge vlan del dev br0 vid 101 pvid self 
root@debian:~# bridge vlan del dev br0 vid 101 pvid self root@debian:~# bridge vlan port vlan-id eth0 1 PVID Egress Untagged lan9 1 PVID Egress Untagged lan10 1 PVID Egress Untagged lan11 1 PVID Egress Untagged lan12 1 PVID Egress Untagged lan13 1 PVID Egress Untagged lan14 1 PVID Egress Untagged lan15 1 PVID Egress Untagged lan16 1 PVID Egress Untagged lan17 1 PVID Egress Untagged lan18 1 PVID Egress Untagged lan19 1 PVID Egress Untagged lan20 1 PVID Egress Untagged lan21 1 PVID Egress Untagged lan22 1 PVID Egress Untagged lan23 1 PVID Egress Untagged lan24 1 PVID Egress Untagged sfp 1 PVID Egress Untagged lan1 1 PVID Egress Untagged lan2 1 PVID Egress Untagged lan3 1 PVID Egress Untagged lan4 1 PVID Egress Untagged lan5 1 PVID Egress Untagged lan6 1 PVID Egress Untagged lan7 1 PVID Egress Untagged lan8 1 PVID Egress Untagged br0 1 Egress Untagged root@debian:~# bridge vlan del dev br0 vid 101 pvid self ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-16 11:21:05 +00:00
Vladimir Oltean	164f861bd4	net: dsa: offload bridge port VLANs on foreign interfaces DSA now explicitly handles VLANs installed with the 'self' flag on the bridge as host VLANs, instead of just replicating every bridge port VLAN also on the CPU port and never deleting it, which is what it did before. However, this leaves a corner case uncovered, as explained by Tobias Waldekranz: https://patchwork.kernel.org/project/netdevbpf/patch/20220209213044.2353153-6-vladimir.oltean@nxp.com/#24735260 Forwarding towards a bridge port VLAN installed on a bridge port foreign to DSA (separate NIC, Wi-Fi AP) used to work by virtue of the fact that DSA itself needed to have at least one port in that VLAN (therefore, it also had the CPU port in said VLAN). However, now that the CPU port may not be member of all VLANs that user ports are members of, we need to ensure this isn't the case if software forwarding to a foreign interface is required. The solution is to treat bridge port VLANs on standalone interfaces in the exact same way as host VLANs. From DSA's perspective, there is no difference between local termination and software forwarding; packets in that VLAN must reach the CPU in both cases. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-16 11:21:05 +00:00
Vladimir Oltean	134ef2388e	net: dsa: add explicit support for host bridge VLANs Currently, DSA programs VLANs on shared (DSA and CPU) ports each time it does so on user ports. This is good for basic functionality but has several limitations: - the VLAN group which must reach the CPU may be radically different from the VLAN group that must be autonomously forwarded by the switch. In other words, the admin may want to isolate noisy stations and avoid traffic from them going to the control processor of the switch, where it would just waste useless cycles. The bridge already supports independent control of VLAN groups on bridge ports and on the bridge itself, and when VLAN-aware, it will drop packets in software anyway if their VID isn't added as a 'self' entry towards the bridge device. - Replaying host FDB entries may depend, for some drivers like mv88e6xxx, on replaying the host VLANs as well. The 2 VLAN groups are approximately the same in most regular cases, but there are corner cases when timing matters, and DSA's approximation of replicating VLANs on shared ports simply does not work. - If a user makes the bridge (implicitly the CPU port) join a VLAN by accident, there is no way for the CPU port to isolate itself from that noisy VLAN except by rebooting the system. This is because for each VLAN added on a user port, DSA will add it on shared ports too, but for each VLAN deletion on a user port, it will remain installed on shared ports, since DSA has no good indication of whether the VLAN is still in use or not. Now that the bridge driver emits well-balanced SWITCHDEV_OBJ_ID_PORT_VLAN addition and removal events, DSA has a simple and straightforward task of separating the bridge port VLANs (these have an orig_dev which is a DSA slave interface, or a LAG interface) from the host VLANs (these have an orig_dev which is a bridge interface), and to keep a simple reference count of each VID on each shared port. 
Forwarding VLANs must be installed on the bridge ports and on all DSA ports interconnecting them. We don't have a good view of the exact topology, so we simply install forwarding VLANs on all DSA ports, which is what has been done until now. Host VLANs must be installed primarily on the dedicated CPU port of each bridge port. More subtly, they must also be installed on upstream-facing and downstream-facing DSA ports that are connecting the bridge ports and the CPU. This ensures that the mv88e6xxx's problem (VID of host FDB entry may be absent from VTU) is still addressed even if that switch is in a cross-chip setup, and it has no local CPU port. Therefore: - user ports contain only bridge port (forwarding) VLANs, and no refcounting is necessary - DSA ports contain both forwarding and host VLANs. Refcounting is necessary among these 2 types. - CPU ports contain only host VLANs. Refcounting is also necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-16 11:21:05 +00:00
Vladimir Oltean	c4076cdd21	net: switchdev: introduce switchdev_handle_port_obj_{add,del} for foreign interfaces The switchdev_handle_port_obj_add() helper is good for replicating a port object on the lower interfaces of @dev, if that object was emitted on a bridge, or on a bridge port that is a LAG. However, drivers that use this helper limit themselves to a box from which they can no longer intercept port objects notified on neighbor ports ("foreign interfaces"). One such driver is DSA, where software bridging with foreign interfaces such as standalone NICs or Wi-Fi APs is an important use case. There, a VLAN installed on a neighbor bridge port roughly corresponds to a forwarding VLAN installed on the DSA switch's CPU port. To support this use case while also making use of the benefits of the switchdev_handle_* replication helper for port objects, introduce a new variant of these functions that crawls through the neighbor ports of @dev, in search of potentially compatible switchdev ports that are interested in the event. The strategy is identical to switchdev_handle_fdb_event_to_device(): if @dev wasn't a switchdev interface, then go one step upper, and recursively call this function on the bridge that this port belongs to. At the next recursion step, __switchdev_handle_port_obj_add() will iterate through the bridge's lower interfaces. Among those, some will be switchdev interfaces, and one will be the original @dev that we came from. To prevent infinite recursion, we must suppress reentry into the original @dev, and just call the @add_cb for the switchdev_interfaces. It looks like this: br0 / \| \ / \| \ / \| \ swp0 swp1 eth0 1. __switchdev_handle_port_obj_add(eth0) -> check_cb(eth0) returns false -> eth0 has no lower interfaces -> eth0's bridge is br0 -> switchdev_lower_dev_find(br0, check_cb, foreign_dev_check_cb)) finds br0 2. 
__switchdev_handle_port_obj_add(br0) -> check_cb(br0) returns false -> netdev_for_each_lower_dev -> check_cb(swp0) returns true, so we don't skip this interface 3. __switchdev_handle_port_obj_add(swp0) -> check_cb(swp0) returns true, so we call add_cb(swp0) (back to netdev_for_each_lower_dev from 2) -> check_cb(swp1) returns true, so we don't skip this interface 4. __switchdev_handle_port_obj_add(swp1) -> check_cb(swp1) returns true, so we call add_cb(swp1) (back to netdev_for_each_lower_dev from 2) -> check_cb(eth0) returns false, so we skip this interface to avoid infinite recursion Note: eth0 could have been a LAG, and we don't want to suppress the recursion through its lowers if those exist, so when check_cb() returns false, we still call switchdev_lower_dev_find() to estimate whether there's anything worth a recursion beneath that LAG. Using check_cb() and foreign_dev_check_cb(), switchdev_lower_dev_find() not only figures out whether the lowers of the LAG are switchdev, but also whether they actively offload the LAG or not (whether the LAG is "foreign" to the switchdev interface or not). The port_obj_info->orig_dev is preserved across recursive calls, so switchdev drivers still know on which device was this notification originally emitted. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-16 11:21:04 +00:00
Vladimir Oltean	7b465f4cf3	net: switchdev: rename switchdev_lower_dev_find to switchdev_lower_dev_find_rcu switchdev_lower_dev_find() assumes RCU read-side critical section calling context, since it uses netdev_walk_all_lower_dev_rcu(). Rename it appropriately, in preparation of adding a similar iterator that assumes writer-side rtnl_mutex protection. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-16 11:21:04 +00:00
Vladimir Oltean	b28d580e29	net: bridge: switchdev: replay all VLAN groups The major user of replayed switchdev objects is DSA, and so far it hasn't needed information about anything other than bridge port VLANs, so this is all that br_switchdev_vlan_replay() knows to handle. DSA has managed to get by through replicating every VLAN addition on a user port such that the same VLAN is also added on all DSA and CPU ports, but there is a corner case where this does not work. The mv88e6xxx DSA driver currently prints this error message as soon as the first port of a switch joins a bridge: mv88e6085 0x0000000008b96000:00: port 0 failed to add a6:ef:77:c8:5f:3d vid 1 to fdb: -95 where a6:ef:77:c8:5f:3d vid 1 is a local FDB entry corresponding to the bridge MAC address in the default_pvid. The -EOPNOTSUPP is returned by mv88e6xxx_port_db_load_purge() because it tries to map VID 1 to a FID (the ATU is indexed by FID not VID), but fails to do so. This is because ->port_fdb_add() is called before ->port_vlan_add() for VID 1. The abridged timeline of the calls is: br_add_if -> netdev_master_upper_dev_link -> dsa_port_bridge_join -> switchdev_bridge_port_offload -> br_switchdev_vlan_replay (*) -> br_switchdev_fdb_replay -> mv88e6xxx_port_fdb_add -> nbp_vlan_init -> nbp_vlan_add -> mv88e6xxx_port_vlan_add and the issue is that at the time of (*), the bridge port isn't in VID 1 (nbp_vlan_init hasn't been called), therefore br_switchdev_vlan_replay() won't have anything to replay, therefore VID 1 won't be in the VTU by the time mv88e6xxx_port_fdb_add() is called. This happens only when the first port of a switch joins. For further ports, the initial mv88e6xxx_port_vlan_add() is sufficient for VID 1 to be loaded in the VTU (which is switch-wide, not per port). 
The problem is somewhat unique to mv88e6xxx by chance, because most other drivers offload an FDB entry by VID, so FDBs and VLANs can be added asynchronously with respect to each other, but addressing the issue at the bridge layer makes sense, since what mv88e6xxx requires isn't absurd. To fix this problem, we need to recognize that it isn't the VLAN group of the port that we're interested in, but the VLAN group of the bridge itself (so it isn't a timing issue, but rather insufficient information being passed from switchdev to drivers). As mentioned, currently nbp_switchdev_sync_objs() only calls br_switchdev_vlan_replay() for VLANs corresponding to the port, but the VLANs corresponding to the bridge itself, for local termination, also need to be replayed. In this case, VID 1 is not (yet) present in the port's VLAN group but is present in the bridge's VLAN group. So to fix this bug, DSA is now obligated to explicitly handle VLANs pointing towards the bridge in order to "close this race" (which isn't really a race). As Tobias Waldekranz notices, this also implies that it must explicitly handle port VLANs on foreign interfaces, something that worked implicitly before: https://patchwork.kernel.org/project/netdevbpf/patch/20220209213044.2353153-6-vladimir.oltean@nxp.com/#24735260 So in the end, br_switchdev_vlan_replay() must replay all VLANs from all VLAN groups: all the ports, and the bridge itself. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-16 11:21:04 +00:00
Vladimir Oltean	263029ae31	net: bridge: make nbp_switchdev_unsync_objs() follow reverse order of sync() There may be switchdev drivers that can add/remove a FDB or MDB entry only as long as the VLAN it's in has been notified and offloaded first. The nbp_switchdev_sync_objs() method satisfies this requirement on addition, but nbp_switchdev_unsync_objs() first deletes VLANs, then deletes MDBs and FDBs. Reverse the order of the function calls to cater to this requirement. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-16 11:21:04 +00:00
Vladimir Oltean	8d23a54f5b	net: bridge: switchdev: differentiate new VLANs from changed ones br_switchdev_port_vlan_add() currently emits a SWITCHDEV_PORT_OBJ_ADD event with a SWITCHDEV_OBJ_ID_PORT_VLAN for 2 distinct cases: - a struct net_bridge_vlan got created - an existing struct net_bridge_vlan was modified This makes it impossible for switchdev drivers to properly balance PORT_OBJ_ADD with PORT_OBJ_DEL events, so if we want to allow that to happen, we must provide a way for drivers to distinguish between a VLAN with changed flags and a new one. Annotate struct switchdev_obj_port_vlan with a "bool changed" that distinguishes the 2 cases above. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-16 11:21:04 +00:00
Vladimir Oltean	27c5f74c7b	net: bridge: vlan: notify switchdev only when something changed Currently, when a VLAN entry is added multiple times in a row to a bridge port, nbp_vlan_add() calls br_switchdev_port_vlan_add() each time, even if the VLAN already exists and nothing about it has changed: bridge vlan add dev lan12 vid 100 master static Similarly, when a VLAN is added multiple times in a row to a bridge, br_vlan_add_existing() doesn't filter at all the calls to br_switchdev_port_vlan_add(): bridge vlan add dev br0 vid 100 self This behavior makes driver-level accounting of VLANs impossible, since it is enough for a single deletion event to remove a VLAN, but the addition event can be emitted an unlimited number of times. The cause for this can be identified as follows: we rely on __vlan_add_flags() to retroactively tell us whether it has changed anything about the VLAN flags or VLAN group pvid. So we'd first have to call __vlan_add_flags() before calling br_switchdev_port_vlan_add(), in order to have access to the "bool *changed" information. But we don't want to change the event ordering, because we'd have to revert the struct net_bridge_vlan changes we've made if switchdev returns an error. So to solve this, we need another function that tells us whether any change is going to occur in the VLAN or VLAN group, _prior_ to calling __vlan_add_flags(). Split __vlan_add_flags() into a precommit and a commit stage, and rename it to __vlan_flags_update(). The precommit stage, __vlan_flags_would_change(), will determine whether there is any reason to notify switchdev due to a change of flags (note: the BRENTRY flag transition from false to true is treated separately: as a new switchdev entry, because we skipped notifying the master VLAN when it wasn't a brentry yet, and therefore not as a change of flags). With this lookahead/precommit function in place, we can avoid notifying switchdev if nothing changed for the VLAN and VLAN group. 
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-16 11:21:04 +00:00
Vladimir Oltean	cab2cd7700	net: bridge: vlan: make __vlan_add_flags react only to PVID and UNTAGGED Currently there is a very subtle aspect to the behavior of __vlan_add_flags(): it changes the struct net_bridge_vlan flags and pvid, yet it returns true ("changed") even if none of those changed, just a transition of br_vlan_is_brentry(v) took place from false to true. This can be seen in br_vlan_add_existing(), however we do not actually rely on this subtle behavior, since the "if" condition that checks that the vlan wasn't a brentry before had a useless (until now) assignment: *changed = true; Make things more obvious by actually making __vlan_add_flags() do what's written on the box, and be more specific about what is actually written on the box. This is needed because further transformations will be done to __vlan_add_flags(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-16 11:21:04 +00:00
Vladimir Oltean	3116ad0696	net: bridge: vlan: don't notify to switchdev master VLANs without BRENTRY flag When a VLAN is added to a bridge port and it doesn't exist on the bridge device yet, it gets created for the multicast context, but it is 'hidden', since it doesn't have the BRENTRY flag yet: ip link add br0 type bridge && ip link set swp0 master br0 bridge vlan add dev swp0 vid 100 # the master VLAN 100 gets created bridge vlan add dev br0 vid 100 self # that VLAN becomes brentry just now All switchdev drivers ignore switchdev notifiers for VLAN entries which have the BRENTRY unset, and for good reason: these are merely private data structures used by the bridge driver. So we might just as well not notify those at all. Cleanup in the switchdev drivers that check for the BRENTRY flag is now possible, and will be handled separately, since those checks just became dead code. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-16 11:21:04 +00:00
Vladimir Oltean	b2bc58d41f	net: bridge: vlan: check early for lack of BRENTRY flag in br_vlan_add_existing When a VLAN is added to a bridge port, a master VLAN gets created on the bridge for context, but it doesn't have the BRENTRY flag. Then, when the same VLAN is added to the bridge itself, that enters through the br_vlan_add_existing() code path and gains the BRENTRY flag, thus it becomes "existing". It seems natural to check for this condition early, because the current code flow is to notify switchdev of the addition of a VLAN that isn't a brentry, just to delete it immediately afterwards. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-16 11:21:04 +00:00
Haiyue Wang	b0471c2610	gve: enhance no queue page list detection The commit `a5886ef4f4` ("gve: Introduce per netdev `enum gve_queue_format`") introduces three queue format types; only the GVE_GQI_QPL_FORMAT queue has a page list. So it should use the queue page list number to detect the zero size queue page list. Correct the design logic. Using the 'queue_format == GVE_GQI_RDA_FORMAT' check may lead to a request for a zero sized memory allocation, for example if the queue format is GVE_DQO_RDA_FORMAT. The kernel memory subsystem will return ZERO_SIZE_PTR, which is not a NULL address, so the driver can run successfully. Also the code still checks the queue page list number first, then accesses the allocated memory, so a zero number queue page list allocation will not lead to an access fault. Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Reviewed-by: Bailey Forrest <bcf@google.com> Link: https://lore.kernel.org/r/20220215051751.260866-1-haiyue.wang@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-15 18:01:06 -08:00
Colin Ian King	2c955856da	net: dm9051: Fix spelling mistake "eror" -> "error" There are spelling mistakes in debug messages. Fix them. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-15 14:56:57 +00:00
Yang Li	99cd6a64e1	dpaa2-eth: Simplify bool conversion Fix the following coccicheck warnings: ./drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c:1199:42-47: WARNING: conversion to bool not needed here ./drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c:1218:54-59: WARNING: conversion to bool not needed here Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-15 14:39:12 +00:00
Vladimir Oltean	5454f5c28e	net: bridge: vlan: check for errors from __vlan_del in __vlan_flush If the following call path returns an error from switchdev: nbp_vlan_flush -> __vlan_del -> __vlan_vid_del -> br_switchdev_port_vlan_del -> __vlan_group_free -> WARN_ON(!list_empty(&vg->vlan_list)); then the deletion of the net_bridge_vlan is silently halted, which will trigger the WARN_ON from __vlan_group_free(). The WARN_ON is rather unhelpful, because nothing about the source of the error is printed. Add a print to catch errors from __vlan_del. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-15 14:37:28 +00:00
Christophe JAILLET	25ce79db80	net: hso: Use GFP_KERNEL instead of GFP_ATOMIC when possible hso_create_device() is only called from functions that already use GFP_KERNEL. And all the callers are called from the probe function. So there is no need here to explicitly require GFP_ATOMIC when allocating memory. Use GFP_KERNEL instead. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-15 14:34:29 +00:00
Michael Catanzaro	4f50ef152e	virtio_net: Fix code indent error This patch fixes the checkpatch.pl warning: ERROR: code indent should use tabs where possible #3453: FILE: drivers/net/virtio_net.c:3453: ret = register_virtio_driver(&virtio_net_driver);$ An unnecessary newline was also removed, making line 3453 now 3452. Signed-off-by: Michael Catanzaro <mcatanzaro.kernel@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-15 14:27:17 +00:00
David S. Miller	9b3e446cd0	mlx5-updates-2022-02-14 mlx5 TX routines improvements 1) From Aya and Tariq, first 3 patches, Use the Max size of the TX descriptor as advertised by the device and not the fixed value of 16 that the driver always assumed, this is not a bug fix as all existing devices have Max value larger than 16, but the series is necessary for future proofing the driver. 2) TX Synchronization improvements from Maxim, last 12 patches Maxim Mikityanskiy Says: ======================= mlx5e: Synchronize ndo_select_queue with configuration changes The kernel can call ndo_select_queue at any time, and there is no direct way to block it. The implementation of ndo_select_queue in mlx5e expects the parameters to be consistent and may crash (invalid pointer, division by zero) if they aren't. There were attempts to partially fix some of the most frequent crashes, see commit `846d6da1fc` ("net/mlx5e: Fix division by 0 in mlx5e_select_queue") and commit `84c8a87402` ("net/mlx5e: Fix division by 0 in mlx5e_select_queue for representors"). However, they don't address the issue completely. This series introduces the proper synchronization mechanism between mlx5e configuration and TX data path: 1. txq2sq updates are synchronized properly with ndo_start_xmit (mlx5e_xmit). The TX queue is stopped when its configuration is being updated, and memory barriers ensure the changes are visible before restarting. 2. The set of parameters needed for mlx5e_select_queue is reduced, and synchronization using RCU is implemented. This way, changes are atomic, and the state in mlx5e_select_queue is always consistent. 3. A few optimizations are applied to the new implementation of mlx5e_select_queue. 
======================= -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmILSJwACgkQSD+KveBX +j6BjwgAj4tGNrIuR8wByLq7gFG+h8t/A80TkqxgNHFhYuF3u6bS6UnmymCAsM3F G8jFott6Pu/4d7SQvksrtVQJvTTj+TinZFeLGNxYXLwm/Syk6Avbs2c02eDL1CbQ fUE4HavgwpS+hp8xvdWOUH+i5b+e7I+iF6JC/K0LULHUhXaJ7BXiQi3qCm7DKwLm gVdWzDHNBj73Xmxak71CH0b79qdIBMs0JlZjdizE++8vg/AZ8CY2G51UUTmjJk5h YASXBigbmGAP5o30IWxjYaj51UQTmWmR5A1XhNIC4iVdWjFOPcz6k8M5EfB0ADq+ eXkQKGDiyMDddKAaor611K0n9H4Ogw== =SDns -----END PGP SIGNATURE----- Merge tag 'mlx5-updates-2022-02-14' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2022-02-14 mlx5 TX routines improvements 1) From Aya and Tariq, first 3 patches, Use the Max size of the TX descriptor as advertised by the device and not the fixed value of 16 that the driver always assumed, this is not a bug fix as all existing devices have Max value larger than 16, but the series is necessary for future proofing the driver. 2) TX Synchronization improvements from Maxim, last 12 patches Maxim Mikityanskiy Says: ======================= mlx5e: Synchronize ndo_select_queue with configuration changes The kernel can call ndo_select_queue at any time, and there is no direct way to block it. The implementation of ndo_select_queue in mlx5e expects the parameters to be consistent and may crash (invalid pointer, division by zero) if they aren't. There were attempts to partially fix some of the most frequent crashes, see commit `846d6da1fc` ("net/mlx5e: Fix division by 0 in mlx5e_select_queue") and commit `84c8a87402` ("net/mlx5e: Fix division by 0 in mlx5e_select_queue for representors"). However, they don't address the issue completely. This series introduces the proper synchronization mechanism between mlx5e configuration and TX data path: 1. txq2sq updates are synchronized properly with ndo_start_xmit (mlx5e_xmit). 
The TX queue is stopped when it configuration is being updated, and memory barriers ensure the changes are visible before restarting. 2. The set of parameters needed for mlx5e_select_queue is reduced, and synchronization using RCU is implemented. This way, changes are atomic, and the state in mlx5e_select_queue is always consistent. 3. A few optimizations are applied to the new implementation of mlx5e_select_queue. ======================= ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-15 10:35:09 +00:00
Maxim Mikityanskiy	71753b8ec1	net/mlx5e: Optimize the common case condition in mlx5e_select_queue Check all booleans for special queues at once, when deciding whether to go to the fast path in mlx5e_select_queue. Pack them into bitfields to have some room for extensibility. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-14 22:30:52 -08:00
Maxim Mikityanskiy	3a9e5fff2a	net/mlx5e: Optimize modulo in mlx5e_select_queue To improve the performance of the modulo operation (%), it's replaced by subtracting the divisor in a loop. The modulo is used to fix up an out-of-bounds value that might be returned by netdev_pick_tx or to convert the queue number to the channel number when num_tcs > 1. Both situations are unlikely, because XPS is configured not to pick higher queues (qid >= num_channels) by default, so under normal circumstances the flow won't go inside the loop, and it will be faster than %. num_tcs == 8 adds at most 7 iterations to the loop. PTP adds at most 1 iteration to the loop. HTB would add at most 256 iterations (when num_channels == 1), so there is an additional boundary check in the HTB flow, which falls back to % if more than 7 iterations are expected. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-14 22:30:51 -08:00
Maxim Mikityanskiy	3c87aedd48	net/mlx5e: Optimize mlx5e_select_queue This commit optimizes mlx5e_select_queue for HTB and PTP cases by short-cutting some checks, without sacrificing performance of the common non-HTB non-PTP flow. 1. The HTB flow uses the fact that num_tcs == 1 to drop these checks (it's not possible to attach both mqprio and htb as the root qdisc). It's also enough to calculate `txq_ix % num_channels` only once, instead of twice. 2. The PTP flow drops the check for HTB and the second calculation of `txq_ix % num_channels`. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-14 22:30:51 -08:00
Maxim Mikityanskiy	ed5f9cf06b	net/mlx5e: Use READ_ONCE/WRITE_ONCE for DCBX trust state trust_state can be written while mlx5e_select_queue() is reading it. To avoid inconsistencies, use READ_ONCE and WRITE_ONCE for access and updates, and touch the variable only once per operation. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-14 22:30:51 -08:00
Maxim Mikityanskiy	62f7991fea	net/mlx5e: Move repeating code that gets TC prio into a function Both mlx5e_select_queue and mlx5e_select_ptpsq contain the same logic to get user priority of a packet, according to the current trust state settings. This commit moves this repeating code to its own function. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-14 22:30:51 -08:00
Maxim Mikityanskiy	3ab45777a2	net/mlx5e: Use select queue parameters to sync with control flow Start using the select queue parameters introduced in the previous commit to have proper synchronization with changing the configuration (such as number of channels and queues). It ensures that the state that mlx5e_select_queue() sees is always consistent and stays the same while the function is running. Also it allows mlx5e_select_queue to stop using data structures that weren't synchronized properly: txq2sq, channel_tc2realtxq, port_ptp_tc2realtxq. The last two are removed completely, as they were used only in mlx5e_select_queue. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-14 22:30:50 -08:00
Maxim Mikityanskiy	6b23f6ab86	net/mlx5e: Move mlx5e_select_queue to en/selq.c This commit moves mlx5e_select_queue and all stuff related to ndo_select_queue to en/selq.c to put all stuff working with selq into a separate file. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-14 22:30:50 -08:00
Maxim Mikityanskiy	8bf30be750	net/mlx5e: Introduce select queue parameters ndo_select_queue can be called at any time, and there is no way to stop the kernel from calling it to synchronize with configuration changes (real_num_tx_queues, num_tc). This commit introduces an internal way in mlx5e to sync mlx5e_select_queue() with these changes. The configuration needed by this function is stored in a struct mlx5e_selq_params, which is modified and accessed in an atomic way using RCU methods. The whole ndo_select_queue is called under an RCU lock, providing the necessary guarantees. The parameters stored in the new struct mlx5e_selq_params should only be used from inside mlx5e_select_queue. It's the minimal set of parameters needed for mlx5e_select_queue to do its job efficiently, derived from parameters stored elsewhere. That means that when the configuration changes, mlx5e_selq_params may need to be updated. In such cases, the mlx5e_selq_prepare/mlx5e_selq_apply API should be used. struct mlx5e_selq contains two slots for the params: active and standby. mlx5e_selq_prepare updates the standby slot, and mlx5e_selq_apply swaps the slots in a safe atomic way using the RCU API. It integrates well with the open/activate stages of the configuration change flow. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-14 22:30:50 -08:00
Maxim Mikityanskiy	17c84cb46e	net/mlx5e: Sync txq2sq updates with mlx5e_xmit for HTB queues This commit makes necessary changes to guarantee that txq2sq remains stable while mlx5e_xmit is running. Proper synchronization is added for HTB TX queues. All updates to txq2sq are performed while the corresponding queue is disabled (i.e. mlx5e_xmit doesn't run on that queue). smp_wmb after each change guarantees that mlx5e_xmit can see the updated value after the queue is enabled. Comments explaining this mechanism are added to mlx5e_xmit. When an HTB SQ can be deleted (after deleting an HTB node), synchronize with RCU to wait for mlx5e_select_queue to finish and stop selecting that queue, before we re-enable it to avoid TX timeout watchdog alarms. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-14 22:30:49 -08:00
Maxim Mikityanskiy	6ce204eac3	net/mlx5e: Use a barrier after updating txq2sq mlx5e_build_txq_maps updates txq2sq while TX queues are stopped. Add a barrier to ensure that these changes are visible before the queues are started and mlx5e_xmit reads from txq2sq. This commit handles regular TX queues. Synchronization between HTB TX queues and mlx5e_xmit is handled in the following commit. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-14 22:30:49 -08:00
Maxim Mikityanskiy	d08c6e2a4d	net/mlx5e: Disable TX queues before registering the netdev Normally, the queues are disabled when the channels are deactivated, and enabled when the channels are activated. However, on register, the channels are not active, but the queues are enabled by default. This change fixes it, preventing mlx5e_xmit from running when the channels are deactivated in the beginning. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-14 22:30:49 -08:00
Maxim Mikityanskiy	befa41771f	net/mlx5e: Cleanup of start/stop all queues mlx5e_activate_priv_channels() and mlx5e_deactivate_priv_channels() start and stop all netdev TX queues. This commit removes the unneeded call to netif_tx_stop_all_queues and adds explanatory comments why these operations are needed. netif_tx_disable() does the same thing as netif_tx_stop_all_queues(), but takes the TX lock, thus guaranteeing that ndo_start_xmit is not running after return. That means that the netif_tx_stop_all_queues() call is not really necessary. The comments are improved: the TX watchdog timeout explanation is moved to the start stage where it really belongs (it used to be in both places, but was lost during some old refactoring) and rephrased in more detail; the explanation for stopping all TX queues is added. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-14 22:30:49 -08:00
Aya Levin	76c31e5f75	net/mlx5e: Use FW limitation for max MPW WQEBBs Calculate maximal count of MPW WQEBBs on SQ's creation and store it there. Remove MLX5E_TX_MPW_MAX_NUM_DS and MLX5E_TX_MPW_MAX_WQEBBS. Update mlx5e_tx_mpwqe_is_full() and mlx5e_xdp_mpqwe_is_full(). Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-14 22:30:48 -08:00
Aya Levin	c27bd1718c	net/mlx5e: Read max WQEBBs on the SQ from firmware Prior to this patch the maximal value for max WQEBBs (WQE Basic Blocks, where WQE is a Work Queue Element) on the TX side was assumed to be 16 (fixed value). All firmware versions till today comply with this. In order to be more flexible and resilient, read from FW the corresponding: max_wqe_sz_sq. This value describes the maximum WQE size given in bytes, thus max WQEBBs is given by dividing it by the WQEBB size in bytes. The driver uses the top between 16 and the division result. This ensures synchronization between driver and firmware and avoids unexpected behavior. Store this value on the different SQs (Send Queues) for easy access. Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-14 22:30:48 -08:00
Tariq Toukan	9536923d3f	net/mlx5e: Remove unused tstamp SQ field Remove tstamp pointer in mlx5e_txqsq as it's no longer used after commit `7c39afb394` ("net/mlx5: PTP code migration to driver core section"). Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-14 22:30:48 -08:00
Tobias Waldekranz	d0b78ab1ca	net: dsa: mv88e6xxx: Fix validation of built-in PHYs on 6095/6097 These chips have 8 built-in FE PHYs and 3 SERDES interfaces that can run at 1G. With the blamed commit, the built-in PHYs could no longer be connected to, using an MII PHY interface mode. Create a separate .phylink_get_caps callback for these chips, which takes the FE/GE split into consideration. Fixes: `2ee84cfefb` ("net: dsa: mv88e6xxx: convert to phylink_generic_validate()") Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/20220213185154.3262207-1-tobias@waldekranz.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-14 21:13:54 -08:00
Colin Ian King	12d8c11198	selftests: net: cmsg_sender: Fix spelling mistake "MONOTINIC" -> "MONOTONIC" There is a spelling mistake in an error message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-14 14:12:06 +00:00
Volodymyr Mytnyk	fa5d824ce5	net: prestera: acl: add multi-chain support offload Add support of rule offloading added to the non-zero index chain, which was previously forbidden. Also, the goto action is offloaded, allowing a jump to the desired chain for processing. Note that only implicit chain 0 is bound to the device port(s) for processing. The rest of the chains have to be jumped to by actions. Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-14 14:11:43 +00:00
David S. Miller	e81f1e0de8	Merge branch 'wwan-debugfs' M Chetan Kumar says: ==================== net: wwan: debugfs dev reference not dropped This patch series contains WWAN subsystem & IOSM Driver changes to drop dev reference obtained as part of wwan debugfs dir entry retrieval. PATCH1: A new debugfs interface is introduced in wwan subsystem so that wwan driver can drop the obtained dev reference post debugfs use. PATCH2: IOSM Driver uses new debugfs interface to drop dev reference. Please refer to commit messages for details. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-14 14:09:59 +00:00
M Chetan Kumar	163f69ae22	net: wwan: iosm: drop debugfs dev reference Post debugfs use call wwan_put_debugfs_dir() to drop debugfs dev reference. Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-14 14:09:59 +00:00
M Chetan Kumar	76f05d8862	net: wwan: debugfs obtained dev reference not dropped WWAN driver calls wwan_get_debugfs_dir() to obtain WWAN debugfs dir entry. As part of this procedure it returns a reference to a found device. Since there is no debugfs interface available at WWAN subsystem, it is not possible to drop dev reference post debugfs use. This leads to side effects like post wwan driver load and reload the wwan instance gets incremented from wwanX to wwanX+1. A new debugfs interface is added in wwan subsystem so that wwan driver can drop the obtained dev reference post debugfs use. void wwan_put_debugfs_dir(struct dentry *dir) Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-14 14:09:59 +00:00
David S. Miller	1e997d040a	Merge branch 'dsa-realtek-next' Luiz Angelo Daros de Luca says: ==================== net: dsa: realtek: realtek-mdio: reset before setup This patch series cleans the realtek-smi reset code and copies it to realtek-mdio. v1-v2) - do not run reset code block if GPIO is missing. It was printing "RESET deasserted" even when there is no GPIO configured. - reset switch after dsa_unregister_switch() - demote reset messages to debug v2-v3) - do not assert the reset on gpiod_get. Do it explicitly afterwards. - split the commit into two (one for each module) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-14 14:06:11 +00:00
Luiz Angelo Daros de Luca	05f7b042c5	net: dsa: realtek: realtek-mdio: reset before setup Some devices, like the switch in Banana Pi BPI R64, only start to answer after a HW reset. It is the same reset code from realtek-smi. Reported-by: Frank Wunderlich <frank-w@public-files.de> Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com> Tested-by: Frank Wunderlich <frank-w@public-files.de> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk> Acked-by: Arınç ÜNAL <arinc.unal@arinc9.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-14 14:06:11 +00:00
Luiz Angelo Daros de Luca	9a236b543f	net: dsa: realtek: realtek-smi: clean-up reset When reset GPIO was missing, the driver was still printing an info message and still trying to assert the reset. Although gpiod_set_value() will silently ignore calls with NULL gpio_desc, it is better to make it clear the driver might allow gpio_desc to be NULL. The initial value for the reset pin was changed to GPIOD_OUT_LOW, followed by a gpiod_set_value() asserting the reset. This way, it will be easier to spot if and where the reset really happens. A new "asserted RESET" message was added just after the reset is asserted, similar to the existing "deasserted RESET" message. Both messages were demoted to dbg. The code comment is not needed anymore. Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com> Acked-by: Arınç ÜNAL <arinc.unal@arinc9.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-14 14:06:11 +00:00
Ido Schimmel	dd263a8cb1	ipv6: blackhole_netdev needs snmp6 counters Whenever rt6_uncached_list_flush_dev() swaps rt->rt6_idev to the blackhole device, parts of IPv6 stack might still need to increment one SNMP counter. Root cause, patch from Ido, changelog from Eric :) This bug suggests that we need to audit rt->rt6_idev usages and make sure they are properly using RCU protection. Fixes: `e5f80fcf86` ("ipv6: give an IPv6 dev to blackhole_netdev") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-14 14:04:27 +00:00
Luiz Angelo Daros de Luca	7db45f8d95	net: dsa: realtek: rename macro to match filename The macro was missed while renaming realtek-smi.h to realtek.h. Fixes: `f5f119077b` (net: dsa: realtek: rename realtek_smi to) Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk> Acked-by: Arınç ÜNAL <arinc.unal@arinc9.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-14 13:46:04 +00:00
David S. Miller	da54d75beb	Merge branch 'netdev-RT' Sebastian Andrzej Siewior says: ==================== net: dev: PREEMPT_RT fixups. this series removes or replaces preempt_disable() and local_irq_save() sections which are problematic on PREEMPT_RT. Patch 2 makes netif_rx() work from any context after I found suggestions for it in an old thread. Should that work, then the context-specific variants could be removed. v2…v3: - #2 - Export __netif_rx() so it can be used by everyone. - Add a lockdep assert to check for interrupt context. - Update the kernel doc and mention that the skb is posted to backlog NAPI. - Use __netif_rx() also in drivers/net/*.c. - Added Toke's review tag and kept Eric's despite the changes made. v1…v2: - #1 and #2 - merge patch 1 and 2 from the series (as per Toke). - updated patch description and corrected the first commit number (as per Eric). - #2 - Provide netif_rx() as in v1 and additionally __netif_rx() without local_bh disable()+enable() for the loopback driver. __netif_rx() is not exported (loopback is built-in only) so it won't be used by drivers. If this doesn't work then we can still export/ define a wrapper as Eric suggested. - Added a comment that netif_rx() is considered legacy. - #3 - Moved ____napi_schedule() into rps_ipi_queued() and renamed it napi_schedule_rps(). https://lore.kernel.org/all/20220204201259.1095226-1-bigeasy@linutronix.de/ v1: https://lore.kernel.org/all/20220202122848.647635-1-bigeasy@linutronix.de ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-14 13:38:36 +00:00
Sebastian Andrzej Siewior	e722db8de6	net: dev: Make rps_lock() disable interrupts. Disabling interrupts and in the RPS case locking input_pkt_queue is split into local_irq_disable() and optional spin_lock(). This breaks on PREEMPT_RT because the spinlock_t typed lock can not be acquired with disabled interrupts. The sections in which the lock is acquired are usually short in the sense that they do not cause long and unbounded latencies. One exception is the skb_flow_limit() invocation which may invoke a BPF program (and may require sleeping locks). By moving local_irq_disable() + spin_lock() into rps_lock(), we can keep interrupts disabled on !PREEMPT_RT and enabled on PREEMPT_RT kernels. Without RPS on a PREEMPT_RT kernel, the needed synchronisation happens as part of local_bh_disable() on the local CPU. ____napi_schedule() is only invoked if sd is from the local CPU. Replace it with __napi_schedule_irqoff() which already disables interrupts on PREEMPT_RT as needed. Move this call to rps_ipi_queued() and rename the function to napi_schedule_rps as suggested by Jakub. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-14 13:38:35 +00:00
Sebastian Andrzej Siewior	baebdf48c3	net: dev: Makes sure netif_rx() can be invoked in any context. Dave suggested a while ago (eleven years by now) "Let's make netif_rx() work in all contexts and get rid of netif_rx_ni()". Eric agreed and pointed out that modern devices should use netif_receive_skb() to avoid the overhead. In the meantime someone added another variant, netif_rx_any_context(), which behaves as suggested. netif_rx() must be invoked with disabled bottom halves to ensure that pending softirqs, which were raised within the function, are handled. netif_rx_ni() can be invoked only from process context (bottom halves must be enabled) because the function handles pending softirqs without checking if bottom halves were disabled or not. netif_rx_any_context() invokes one of the former functions by checking in_interrupts(). netif_rx() could be taught to handle both cases (disabled and enabled bottom halves) by simply disabling bottom halves while invoking netif_rx_internal(). The local_bh_enable() invocation will then invoke pending softirqs only if the BH-disable counter drops to zero. Eric is concerned about the overhead of BH-disable+enable especially in regard to the loopback driver. As critical as this driver is, it will receive a shortcut to avoid the additional overhead which is not needed. Add a local_bh_disable() section in netif_rx() to ensure softirqs are handled if needed. Provide __netif_rx() which does not disable BH and has a lockdep assert to ensure that interrupts are disabled. Use this shortcut in the loopback driver and in drivers/net/*.c. Make netif_rx_ni() and netif_rx_any_context() invoke netif_rx() so they can be removed once there are no more users left. Link: https://lkml.kernel.org/r/20100415.020246.218622820.davem@davemloft.net Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-14 13:38:35 +00:00
Sebastian Andrzej Siewior	f234ae2947	net: dev: Remove preempt_disable() and get_cpu() in netif_rx_internal(). The preempt_disable() section was introduced in commit `cece1945bf` ("net: disable preemption before call smp_processor_id()") and adds it in case this function is invoked from preemptible context and because get_cpu() later on has been added. The get_cpu() usage was added in commit `b0e28f1eff` ("net: netif_rx() must disable preemption") because ip_dev_loopback_xmit() invoked netif_rx() with enabled preemption causing a warning in smp_processor_id(). The function netif_rx() should only be invoked from an interrupt context which implies disabled preemption. The commit `e30b38c298` ("ip: Fix ip_dev_loopback_xmit()") was addressing this and replaced netif_rx() with netif_rx_ni() in ip_dev_loopback_xmit(). Based on the discussion on the list, the former patch (`b0e28f1eff`) should not have been applied, only the latter (`e30b38c298`). Remove get_cpu() and preempt_disable() since the function is supposed to be invoked from context with stable per-CPU pointers. Bottom halves have to be disabled at this point because the function may raise softirqs which need to be processed. Link: https://lkml.kernel.org/r/20100415.013347.98375530.davem@davemloft.net Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-14 13:38:35 +00:00
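
Commit `3a9e5fff2a` in the list above describes replacing `%` with repeated subtraction, plus a boundary check that falls back to `%` when more than 7 iterations would be needed. The following is a rough userspace sketch of that idea only, not the driver's code: the helper name `reduce_by_subtraction` and the constant `MAX_SUB_ITERS` are hypothetical, and the fallback check here is one possible way to bound the loop.

```c
#include <stdint.h>

/* Hypothetical bound, taken from the "more than 7 iterations" wording. */
#define MAX_SUB_ITERS 7u

/*
 * Reduce ix into the range [0, n) without a division in the common case.
 * n must be nonzero. Illustrative helper, not part of mlx5e.
 */
static inline uint32_t reduce_by_subtraction(uint32_t ix, uint32_t n)
{
	/* Common case: the index is already in range, nothing to do. */
	if (ix < n)
		return ix;

	/* More than MAX_SUB_ITERS subtractions would be needed: use %. */
	if ((uint64_t)ix >= (uint64_t)n * (MAX_SUB_ITERS + 1))
		return ix % n;

	/* At most MAX_SUB_ITERS iterations remain at this point. */
	while (ix >= n)
		ix -= n;

	return ix;
}
```

With `n == 8`, an input of 23 takes two subtractions and yields 7, while an input of 64 takes the `%` fallback and yields 0. The trade-off is that the in-range case costs a single comparison, at the price of a slower path for inputs far outside the range.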

1074154 Commits, All Branches