Saeed Mahameed says:
====================
mlx5-updates-2017-11-09
This series introduces vlan offloads related improvements for mlx5
ethernet netdev driver, from Gal Pressman.
- Add support for 802.1ad vlan filter
- Add support for 802.1ad vlan insertion
- Add vlan offloads statistics to ethtool (inserted/stripped vlans)
- CHECKSUM_COMPLETE support for vlan traffic when vlan stripping is off! (Finally)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn says:
====================
IGMP snooping for local traffic
The linux bridge supports IGMP snooping. It will listen to IGMP
reports on bridge ports and keep track of which groups have been
joined on an interface. It will then forward multicast based on this
group membership.
When the bridge adds or removes groups from an interface, it uses
switchdev to request the hardware add an mdb to a port, so the
hardware can perform the selective forwarding between ports.
What is not covered by the current bridge code, is IGMP joins/leaves
from the host on the brX interface. These are not reported via
switchdev so that hardware knows the local host is interested in the
multicast frames.
Luckily, the bridge does track joins/leaves on the brX interface. The
code is obfuscated, which is why I missed it with my first attempt.
So the first patch tries to remove this obfuscation. Currently,
there are no notifications sent when the bridge interface joins a
group. The second patch adds them. bridge monitor then shows
joins/leaves in the same way as for other ports of the bridge.
Then starts the work passing down to the hardware that the host has
joined/left a group. The existing switchdev mdb object cannot be used,
since the semantics are different. The existing
SWITCHDEV_OBJ_ID_PORT_MDB is used to indicate a specific multicast
group should be forwarded out that port of the switch. However here we
require the exact opposite. We want multicast frames for the group
received on the port to be forwarded to the host. Hence add a new
object SWITCHDEV_OBJ_ID_HOST_MDB, a multicast database entry to
forward to the host. This new object is then propagated through the
DSA layers. No DSA driver changes should be needed, this should just
work...
This version fixes up the nitpick from Nikolay, removes an unrelated
white space change, and adds in a patch adding a few const attributes
to a couple of functions taking a port parameter, in order to stop the
following patch from producing warnings.
====================
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that the host indicates when a multicast group should be forwarded
from the switch to the host, don't do it by default.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The notify mechanism does not need to modify the port it is notifying.
So make the parameter const.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add code to handle switchdev host mdb add/del. Since DSA uses one of
the switch ports as a transport to the host, we just need to add an
MDB on this port.
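The idea above can be sketched in userspace C as follows. This is only a toy model, not DSA code: the `toy_*` names, the fixed-size table, and the choice of port 0 as the CPU port are all assumptions made for illustration. It shows that a "host MDB" request reduces to an ordinary per-port MDB request on the port that leads to the host.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_GROUPS 8
#define TOY_CPU_PORT   0	/* assumption: port 0 is the link to the host */

/* One multicast database entry: group address plus bitmap of egress ports. */
struct toy_mdb_entry {
	uint8_t  addr[6];
	uint32_t port_mask;
	bool     in_use;
};

struct toy_switch {
	struct toy_mdb_entry mdb[TOY_MAX_GROUPS];
};

/* Return the entry for addr, or NULL if the group is not in the table. */
struct toy_mdb_entry *toy_mdb_find(struct toy_switch *sw, const uint8_t *addr)
{
	for (int i = 0; i < TOY_MAX_GROUPS; i++)
		if (sw->mdb[i].in_use && memcmp(sw->mdb[i].addr, addr, 6) == 0)
			return &sw->mdb[i];
	return NULL;
}

/* Forward the group out of the given port, creating the entry if needed. */
int toy_port_mdb_add(struct toy_switch *sw, int port, const uint8_t *addr)
{
	struct toy_mdb_entry *e = toy_mdb_find(sw, addr);

	for (int i = 0; !e && i < TOY_MAX_GROUPS; i++) {
		if (!sw->mdb[i].in_use) {
			e = &sw->mdb[i];
			memcpy(e->addr, addr, 6);
			e->port_mask = 0;
			e->in_use = true;
		}
	}
	if (!e)
		return -1;	/* table full */
	e->port_mask |= 1u << port;
	return 0;
}

/* Stop forwarding the group out of the port; free the entry when unused. */
int toy_port_mdb_del(struct toy_switch *sw, int port, const uint8_t *addr)
{
	struct toy_mdb_entry *e = toy_mdb_find(sw, addr);

	if (!e)
		return -1;
	e->port_mask &= ~(1u << port);
	if (!e->port_mask)
		e->in_use = false;
	return 0;
}

/* Host add/del: the same operation, applied to the CPU port. */
int toy_host_mdb_add(struct toy_switch *sw, const uint8_t *addr)
{
	return toy_port_mdb_add(sw, TOY_CPU_PORT, addr);
}

int toy_host_mdb_del(struct toy_switch *sw, const uint8_t *addr)
{
	return toy_port_mdb_del(sw, TOY_CPU_PORT, addr);
}

/* Egress port bitmap for a group, or 0 if the group is unknown. */
uint32_t toy_mdb_lookup(struct toy_switch *sw, const uint8_t *addr)
{
	struct toy_mdb_entry *e = toy_mdb_find(sw, addr);

	return e ? e->port_mask : 0;
}
```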
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the host joins or leaves a multicast group, use switchdev to add
an object to the hardware to forward traffic for the group to the
host.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The host can join or leave a multicast group on the brX interface, as
indicated by IGMP snooping. This is tracked within the bridge
multicast code. Send a notification when this happens, in the same way
a notification is sent when a port of the bridge joins/leaves a group
because of IGMP snooping.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The boolean mglist indicates the host has joined a particular
multicast group on the bridge interface. It is badly named, obscuring
what it means. Rename it.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simple cases of overlapping changes in the packet scheduler.
Much easier to resolve this time.
Which probably means that I screwed it up somehow.
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'pm-final-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull final power management fixes from Rafael Wysocki:
"These fix a regression in the schedutil cpufreq governor introduced by
a recent change and blacklist Dell XPS13 9360 from using the Low Power
S0 Idle _DSM interface which triggers serious problems on one of these
machines.
Specifics:
- Prevent the schedutil cpufreq governor from using the utilization
of a wrong CPU in some cases which started to happen after one of
the recent changes in it (Chris Redpath).
- Blacklist Dell XPS13 9360 from using the Low Power S0 Idle _DSM
interface as that causes serious issues (related to NVMe) to appear
on one of these machines, even though other Dell XPS13 9360 machines
in somewhat different HW configurations behave correctly (Rafael
Wysocki)"
* tag 'pm-final-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / PM: Blacklist Low Power S0 Idle _DSM for Dell XPS13 9360
cpufreq: schedutil: Examine the correct CPU when we update util
Merge tag 'sound-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"The amount of the changes isn't as quite small as wished, nevertheless
they are straight fixes that deserve merging to 4.14 final.
Most of the fixes are about ALSA core bugs spotted by a fuzzer: a follow-up
fix for the previous nested rwsem patch, a fix to avoid resource
hogs due to too many concurrent ALSA timer invocations, and a fix for
a crash with SYSEX MIDI transfer over OSS sequencer emulation that is
used by none but the fuzzer.
The rest are the usual HD-audio and USB-audio device-specific quirks,
which are safe to apply"
* tag 'sound-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - fix headset mic problem for Dell machines with alc274
ALSA: seq: Fix OSS sysex delivery in OSS emulation
ALSA: seq: Avoid invalid lockdep class warning
ALSA: timer: Limit max instances per timer
ALSA: usb-audio: support new Amanero Combo384 firmware version
Pull networking fixes from David Miller:
1) Fix use-after-free in IPSEC input parsing, destination address
pointer was loaded before pskb_may_pull() which can change the SKB
data pointers. From Florian Westphal.
2) Stack out-of-bounds read in xfrm_state_find(), from Steffen
Klassert.
3) IPVS state of SKB is not properly reset when moving between
namespaces, from Ye Yin.
4) Fix crash in asix driver suspend and resume, from Andrey Konovalov.
5) Don't deliver ipv6 l2tp tunnel packets to ipv4 l2tp tunnels, and
vice versa, from Guillaume Nault.
6) Fix DSACK undo on non-dup ACKs, from Priyaranjan Jha.
7) Fix regression in bond_xmit_hash()'s behavior after the TCP port
selection changes back in 4.2, from Hangbin Liu.
8) Two divide by zero bugs in USB networking drivers when parsing
descriptors, from Bjorn Mork.
9) Fix bonding slaves being stuck in BOND_LINK_FAIL state, from Jay
Vosburgh.
10) Missing skb_reset_mac_header() in qmi_wwan, from Kristian Evensen.
11) Fix the destruction of tc action object races properly, from Cong
Wang.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (31 commits)
cls_u32: use tcf_exts_get_net() before call_rcu()
cls_tcindex: use tcf_exts_get_net() before call_rcu()
cls_rsvp: use tcf_exts_get_net() before call_rcu()
cls_route: use tcf_exts_get_net() before call_rcu()
cls_matchall: use tcf_exts_get_net() before call_rcu()
cls_fw: use tcf_exts_get_net() before call_rcu()
cls_flower: use tcf_exts_get_net() before call_rcu()
cls_flow: use tcf_exts_get_net() before call_rcu()
cls_cgroup: use tcf_exts_get_net() before call_rcu()
cls_bpf: use tcf_exts_get_net() before call_rcu()
cls_basic: use tcf_exts_get_net() before call_rcu()
net_sched: introduce tcf_exts_get_net() and tcf_exts_put_net()
Revert "net_sched: hold netns refcnt for each action"
net: usb: asix: fill null-ptr-deref in asix_suspend
Revert "net: usb: asix: fill null-ptr-deref in asix_suspend"
qmi_wwan: Add missing skb_reset_mac_header-call
bonding: fix slave stuck in BOND_LINK_FAIL state
qrtr: Move to postcore_initcall
net: qmi_wwan: fix divide by 0 on bad descriptors
net: cdc_ether: fix divide by 0 on bad descriptors
...
Confirmed with Kailang of Realtek, the pin 0x19 is for Headset Mic, and
the pin 0x1a is for Headphone Mic, he suggested to apply
ALC269_FIXUP_DELL1_MIC_NO_PRESENCE to fix this problem. And we
verified applying this FIXUP can fix this problem.
Cc: <stable@vger.kernel.org>
Cc: Kailang Yang <kailang@realtek.com>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
When the VLAN tag is present in the packet buffer (i.e. VLAN stripping disabled, QinQ)
the driver will currently report CHECKSUM_UNNECESSARY.
Instead of using CHECKSUM_COMPLETE offload for packets with first
ethertype of IPv4/6, use it for packets with last ethertype of IPv4/6 to
cover the former cases as well.
The checksum field present in the CQE is calculated from the IP header
until the end of the packet. When the first ethertype is different than
IPv4/6 (e.g. 802.1Q VLAN) a checksum of the VLAN header/s should be
added. The small header/s checksum calculation will allow us to use
CHECKSUM_COMPLETE instead of CHECKSUM_UNNECESSARY.
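The arithmetic behind this can be illustrated with a small userspace sketch. It is not the driver code: the `toy_*` helpers are hypothetical, and the sketch assumes the hardware value is a plain 16-bit one's-complement sum over the bytes from the IP header to the end of the packet. Because one's-complement sums are additive over even-length pieces, summing the few VLAN header bytes in software and adding them to the hardware value gives the same result as summing the whole range.

```c
#include <stddef.h>
#include <stdint.h>

/* Add the 16-bit one's-complement sum of buf (big-endian words) to sum. */
uint32_t toy_csum_add(uint32_t sum, const uint8_t *buf, size_t len)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (i < len)		/* odd trailing byte is padded with zero */
		sum += (uint32_t)buf[i] << 8;
	return sum;
}

/* Fold the carries back in until the sum fits in 16 bits. */
uint16_t toy_csum_fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * Extend hw_csum (assumed to cover IP header .. end of packet) so that it
 * also covers the vlan_len bytes of VLAN header(s) just before the IP header.
 * vlan_len must be even, which holds for 4-byte VLAN headers.
 */
uint16_t toy_csum_with_vlan(uint16_t hw_csum, const uint8_t *vlan_hdrs,
			    size_t vlan_len)
{
	return toy_csum_fold(toy_csum_add(hw_csum, vlan_hdrs, vlan_len));
}
```

Only the 4 bytes per VLAN header are summed on the CPU, so the extra cost per packet is small compared with checksumming the payload.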
Testing bandwidth of one and 8 TCP streams to a single RQ,
LRO and VLAN stripping offloads disabled:
CPU: Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz
NIC: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
Before:
+--------------+--------------------+---------------------+----------------------+
| Traffic type | 1 Stream BW [Mbps] | 8 Streams BW [Mbps] | Checksum offload |
+--------------+--------------------+---------------------+----------------------+
| Untagged | 28,247.35 | 24,716.88 | CHECKSUM_COMPLETE |
| VLAN | 27,516.69 | 23,752.26 | CHECKSUM_UNNECESSARY |
| QinQ | 6,961.30 | 20,667.04 | CHECKSUM_UNNECESSARY |
+--------------+--------------------+---------------------+----------------------+
Now:
+--------------+--------------------+---------------------+-------------------+
| Traffic type | 1 Stream BW [Mbps] | 8 Streams BW [Mbps] | Checksum offload |
+--------------+--------------------+---------------------+-------------------+
| Untagged | 28,521.28 | 24,926.32 | CHECKSUM_COMPLETE |
| VLAN | 27,389.37 | 23,715.34 | CHECKSUM_COMPLETE |
| QinQ | 6,901.77 | 20,845.73 | CHECKSUM_COMPLETE |
+--------------+--------------------+---------------------+-------------------+
No performance degradation observed.
Signed-off-by: Gal Pressman <galp@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The following counters are now exposed through ethtool -S:
rx[i]_removed_vlan_packets (per channel)
rx_removed_vlan_packets
tx[i]_added_vlan_packets (per channel)
tx_added_vlan_packets
rx_removed_vlan_packets: The number of packets that had their
outer VLAN header stripped to the CQE by the hardware.
tx_added_vlan_packets: The number of packets that had their
outer VLAN header inserted by the hardware.
Signed-off-by: Gal Pressman <galp@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Report VLAN insertion support for S-tagged packets and add support by
choosing the correct VLAN type in the WQE.
Signed-off-by: Gal Pressman <galp@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When a user chooses to use 802.1ad VLAN the proper steering rules will
be added to the VLAN flow table (matching the specific S-tag VID).
Due to current hardware limitation, when using 802.1ad, we must disable
C-tag VLAN stripping on the RQs.
Signed-off-by: Gal Pressman <galp@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Extend the net device error logging with netdev_*_once macros.
netdev_*_once are the equivalents of the dev_*_once macros which are
useful for messages that should only be logged once.
Also add netdev_WARN_ONCE, which is the "once" extension for the already
existing netdev_WARN macro.
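For readers unfamiliar with the "once" pattern, here is a minimal userspace sketch. It is not the kernel implementation: `toy_warn_once`, `toy_rx_error`, and the counter are hypothetical names used only for illustration. Each expansion of the macro gets its own static flag, so a given call site prints at most one message no matter how often it runs.

```c
#include <stdbool.h>
#include <stdio.h>

int toy_messages_emitted;	/* counts messages that were actually printed */

/* Print on the first pass through this call site, then stay silent. */
#define toy_warn_once(...)					\
	do {							\
		static bool warned_;				\
		if (!warned_) {					\
			warned_ = true;				\
			toy_messages_emitted++;			\
			fprintf(stderr, __VA_ARGS__);		\
		}						\
	} while (0)

/* Hypothetical error path that may be hit for every received packet. */
void toy_rx_error(int queue)
{
	toy_warn_once("rx error on queue %d\n", queue);
}
```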
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When adding a VLAN rule fails, the active VLAN bit should be cleared.
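A minimal sketch of the set-then-roll-back pattern, in userspace C. The `toy_*` names and the `hw_ok` flag that simulates a hardware failure are assumptions for illustration and do not reflect the driver's actual structures.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_VLAN_N_VID 4096

struct toy_vlan_table {
	uint64_t active[TOY_VLAN_N_VID / 64];	/* one bit per VID */
};

/* Stand-in for programming a steering rule; hw_ok simulates the outcome. */
int toy_add_rule(uint16_t vid, bool hw_ok)
{
	(void)vid;
	return hw_ok ? 0 : -1;
}

int toy_vlan_add(struct toy_vlan_table *t, uint16_t vid, bool hw_ok)
{
	int err;

	if (vid >= TOY_VLAN_N_VID)
		return -1;
	t->active[vid / 64] |= 1ULL << (vid % 64);
	err = toy_add_rule(vid, hw_ok);
	if (err)	/* roll back so the bitmap matches the hardware state */
		t->active[vid / 64] &= ~(1ULL << (vid % 64));
	return err;
}

bool toy_vlan_is_active(const struct toy_vlan_table *t, uint16_t vid)
{
	return (t->active[vid / 64] >> (vid % 64)) & 1;
}
```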
Fixes: afb736e933 ("net/mlx5: Ethernet resource handling files")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Rename VLAN related symbols to better reflect the fact that they
are associated with C-tag VLAN.
Signed-off-by: Gal Pressman <galp@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Steffen Klassert says:
====================
pull request (net): ipsec 2017-11-09
1) Fix a use after free due to a reallocated skb head.
From Florian Westphal.
2) Fix sporadic lookup failures on labeled IPSEC.
From Florian Westphal.
3) Fix a stack out of bounds when a socket policy is applied
to an IPv6 socket that sends IPv4 packets.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang says:
====================
net_sched: close the race between call_rcu() and cleanup_net()
This patchset tries to fix the race between call_rcu() and
cleanup_net() again. Without holding the netns refcnt the
tc_action_net_exit() in netns workqueue could be called before
filter destroy works in tc filter workqueue. This patchset
moves the netns refcnt from tc actions to tcf_exts, without
breaking per-netns tc actions.
Patch 1 reverts the previous fix, patch 2 introduces two new
APIs to help address the bug, and the remaining patches switch
to the new APIs. Please see each patch for details.
I was not able to reproduce this bug at first, but after adding
some delay in the filter destroy work I managed to trigger the
crash. After this patchset, the crash is no longer reproducible
and the debugging printk's show that the order is as expected
too.
====================
Fixes: ddf97ccdd7 ("net_sched: add network namespace support for tc actions")
Reported-by: Lucas Bates <lucasb@mojatatu.com>
Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.
Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.
Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.
Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.
Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.
Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.
Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.
Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.
Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.
Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.
Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.
Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.
Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.
Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.
Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.
Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.
Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.
Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.
Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.
Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.
Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.
Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of holding netns refcnt in tc actions, we can minimize
the holding time by saving it in struct tcf_exts instead. This
means we can just hold netns refcnt right before call_rcu() and
release it after tcf_exts_destroy() is done.
However, because on netns cleanup path we call tcf_proto_destroy()
too, obviously we can not hold netns for a zero refcnt, in this
case we have to do cleanup synchronously. It is fine for RCU too,
the caller cleanup_net() already waits for a grace period.
For other cases, refcnt is non-zero and we can safely grab it as
normal and release it after we are done.
This patch provides two new APIs for each filter to use:
tcf_exts_get_net() and tcf_exts_put_net(). And all filters now can
use the following pattern:
    void __destroy_filter() {
        tcf_exts_destroy();
        tcf_exts_put_net();  // <== release netns refcnt
        kfree();
    }

    void some_work() {
        rtnl_lock();
        __destroy_filter();
        rtnl_unlock();
    }

    void some_rcu_callback() {
        tcf_queue_work(some_work);
    }

    if (tcf_exts_get_net())  // <== hold netns refcnt
        call_rcu(some_rcu_callback);
    else
        __destroy_filter();
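The pattern can be modelled in userspace C as a rough sketch. Everything here is hypothetical: `toy_net` stands in for the netns refcount, the `deferred` flag stands in for call_rcu() plus the workqueue, and the way a failed get clears the `net` pointer so that the later put becomes a no-op is an assumption about how one might make the unconditional put in the destroy function safe.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_net {
	int count;		/* stand-in for the netns refcount */
};

struct toy_filter {
	struct toy_net *net;
	bool destroyed;
	bool deferred;		/* stand-in for a pending RCU callback */
};

/* Take a reference only if the namespace is still alive. */
bool toy_get_net(struct toy_filter *f)
{
	if (f->net->count == 0) {
		f->net = NULL;	/* nothing held, so nothing to put later */
		return false;
	}
	f->net->count++;
	return true;
}

void toy_put_net(struct toy_filter *f)
{
	if (f->net)
		f->net->count--;
}

/* Common teardown, used by both the deferred and the synchronous path. */
void toy_destroy_filter(struct toy_filter *f)
{
	f->destroyed = true;
	toy_put_net(f);		/* release the reference, if one was taken */
}

/* Defer if the netns could be pinned, otherwise clean up synchronously. */
void toy_delete_filter(struct toy_filter *f)
{
	if (toy_get_net(f))
		f->deferred = true;
	else
		toy_destroy_filter(f);
}

/* Runs later, as the deferred work would. */
void toy_run_deferred(struct toy_filter *f)
{
	if (f->deferred) {
		f->deferred = false;
		toy_destroy_filter(f);
	}
}
```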
Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit ceffcc5e25.
If we hold that refcnt, the netns can never be destroyed until
all actions are destroyed by the user. This breaks our netns design,
in which we expect all actions to be destroyed when we destroy the
whole netns.
Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot says:
====================
net: dsa: setup stage
When probing a DSA switch, there are basically two stages.
The first stage is the parsing of the switch device, from either device
tree or platform data. It fetches the DSA tree to which it belongs, and
validates its ports. The switch device is then added to the tree, and
the second stage is called if this was the last switch of the tree.
The second stage is the setup of the tree, which validates that the tree
is complete, sets up the routing tables, the default CPU port for user
ports, sets up the switch drivers and finally the master interfaces,
which makes the whole switch fabric functional.
This patch series covers the second setup stage. The setup and teardown
of a switch tree have been separated into logical steps, and the probing
of a switch now simply parses and adds a switch to a tree.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit brings no functional changes. It gets rid of the underscore
prefixed _dsa_register_switch and _dsa_unregister_switch functions in
favor of dsa_switch_probe() which parses and adds a switch to a tree and
dsa_switch_remove() which removes a switch from a tree.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that the tree setup is centralized, we can simplify the code a bit
more by setting up or tearing down the tree directly when adding or
removing a switch to/from it.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The *_complete() functions take too many arguments to do only one thing:
they try to fetch the dsa_port structures corresponding to device nodes
under the "link" list property of DSA ports, and use them to setup the
routing table of switches.
This patch simplifies them by providing instead simpler
dsa_{port,switch,tree}_setup_routing_table functions which return a
boolean value, true if the tree is complete.
dsa_tree_setup_routing_table is called inside dsa_tree_setup which
simplifies the switch registering function as well.
A switch's routing table is now initialized before its setup.
This also makes dsa_port_is_valid obsolete, remove it.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The OF code provides a of_for_each_phandle() helper to iterate over
phandles. Use it instead of arbitrary iterating ourselves over the list
of phandles hanging to the "link" property of the port's device node.
The of_phandle_iterator_next() helper calls of_node_put() itself on
it.node. Thus we must only do it ourselves if we break out of the loop.
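The reference-counting rule can be illustrated with a userspace model. This is not the OF API: `toy_node`, `toy_iter`, and the helpers are hypothetical stand-ins that mimic an iterator which drops its reference to the previous node each time it advances. Leaving the loop early skips that implicit put, so the caller has to do it.

```c
#include <stddef.h>

struct toy_node {
	int refcount;
};

struct toy_iter {
	struct toy_node **list;
	size_t n;
	size_t pos;
	struct toy_node *node;	/* current node, holds one reference */
};

void toy_node_put(struct toy_node *node)
{
	node->refcount--;
}

/* Advance: put the previous node, then get the next. Returns -1 at the end. */
int toy_iter_next(struct toy_iter *it)
{
	if (it->node) {
		toy_node_put(it->node);
		it->node = NULL;
	}
	if (it->pos >= it->n)
		return -1;
	it->node = it->list[it->pos++];
	it->node->refcount++;
	return 0;
}

/* Return the index of target in list, or -1 if it is not present. */
int toy_find(struct toy_node **list, size_t n, const struct toy_node *target)
{
	struct toy_iter it = { list, n, 0, NULL };
	int idx = 0;

	while (toy_iter_next(&it) == 0) {
		if (it.node == target) {
			/* Early exit: the iterator will not put this node. */
			toy_node_put(it.node);
			return idx;
		}
		idx++;
	}
	return -1;	/* ran to the end; the iterator dropped every ref */
}
```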
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of having two dsa_ds_find_port_dn (which returns a bool) and
dsa_dst_find_port_dn (which returns a switch) functions, provide a more
explicit dsa_tree_find_port_by_node function which returns a matching
port.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The dsa_dsa_port_apply and dsa_cpu_port_apply functions do exactly the
same thing. The dsa_user_port_apply function does not try to register a
fixed link but tries to create a slave.
This commit factorizes and scopes all that in two convenient
dsa_port_setup and dsa_port_teardown functions.
It won't hurt to register a devlink_port for unused port as well.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch brings no functional changes. It removes the unused dst
argument from the dsa_ds_apply and dsa_ds_unapply functions and renames
them to dsa_switch_setup and dsa_switch_teardown for a more explicit scope.
This clarifies the steps of the setup or teardown of a switch fabric.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit provides better scope for the DSA tree setup and teardown
functions. It renames the "applied" bool to "setup" and prints a message
when the tree is set up, as is done during teardown.
At the same time, check dst->setup in dsa_tree_setup, where it is set to
true.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add DSA helpers to setup and teardown a master net device wired to its
CPU port. This centralizes the dsa_ptr assignment.
This also makes the master ethtool helpers static at the same time.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The dsa_dst_parse function called just before dsa_dst_apply does not
parse the tree but does only one thing: it assigns the default CPU port
to dst->cpu_dp and to each user port.
This patch simplifies this by calling a dsa_tree_setup_default_cpu
function at the beginning of dsa_dst_apply directly.
A dsa_port_is_user helper is added for convenience.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A DSA port has a dedicated CPU port assigned to it, stored in the cpu_dp
member. It is not meant to be modified by a port, thus make it const.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit baedf68a06.
There is an updated version of this fix which covers
the problem more thoroughly.
Signed-off-by: David S. Miller <davem@davemloft.net>