Christian Marangi says:
====================
Add MTU change with stmmac interface running
This series is to permit MTU change while the interface is running.
Major rework are needed to permit to allocate a new dma conf based on
the new MTU before applying it. This is to make sure there is enough
space to allocate all the DMA queue before releasing the stmmac driver.
This was tested with a simple way to stress the network while the
interface is running.
2 ssh connection to the device:
- One generating simple traffic with while true; do free; done
- The other making the mtu change with a delay of 1 second
The connection is correctly stopped and recovered after the MTU is changed.
The first 2 patch of this series are minor fixup that fix problems
presented while testing this. One fix a problem when we renable a queue
while we are generating a new dma conf. The other is a corner case that
was notice while stressing the driver and turning down the interface while
there was some traffic.
(this is a follow-up of a simpler patch that wanted to add the same
feature. It was suggested to first try to check if it was possible to
apply the new configuration. Posting as RFC as it does major rework for
the new concept of DMA conf)
====================
Link: https://lore.kernel.org/r/20220723142933.16030-1-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There are sleep in atomic context bugs in timer handlers of sctp
such as sctp_generate_t3_rtx_event(), sctp_generate_probe_event(),
sctp_generate_t1_init_event(), sctp_generate_timeout_event(),
sctp_generate_t3_rtx_event() and so on.
The root cause is sctp_sched_prio_init_sid() with GFP_KERNEL parameter
that may sleep could be called by different timer handlers which is in
interrupt context.
One of the call paths that could trigger bug is shown below:
(interrupt context)
sctp_generate_probe_event
sctp_do_sm
sctp_side_effects
sctp_cmd_interpreter
sctp_outq_teardown
sctp_outq_init
sctp_sched_set_sched
n->init_sid(..,GFP_KERNEL)
sctp_sched_prio_init_sid //may sleep
This patch changes gfp_t parameter of init_sid in sctp_sched_set_sched()
from GFP_KERNEL to GFP_ATOMIC in order to prevent sleep in atomic
context bugs.
Fixes: 5bbbbe32a4 ("sctp: introduce stream scheduler foundations")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Link: https://lore.kernel.org/r/20220723015809.11553-1-duoming@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Remove the limitation where the interface needs to be down to change
MTU by releasing and opening the stmmac driver to set the new MTU.
Also call the set_filter function to correctly init the port.
This permits to remove the EBUSY error while the ethernet port is
running permitting a correct MTU change if for example a DSA request
a MTU change for a switch CPU port.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rework the driver to generate the stmmac dma_conf before stmmac_open.
This permits a function to first check if it's possible to allocate a
new dma_config and then pass it directly to __stmmac_open and "open" the
interface with the new configuration.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move dma buf conf to dedicated struct. This in preparation for code
rework that will permit to allocate separate dma_conf without affecting
the priv struct.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Disable all queues and disconnect before tx_disable in stmmac_release to
prevent a corner case where packet may be still queued at the same time
tx_disable is called resulting in kernel panic if some packet still has
to be processed.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move queue reset to dedicated functions. This aside from a simple
cleanup is also required to allocate a dma conf without resetting the tx
queue while the device is temporarily detached as now the reset is not
part of the dma init function and can be done later in the code flow.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
delete extra space and tab in blank line, there is no functional change.
Reported-by: Hacash Robot <hacashRobot@santino.com>
Signed-off-by: William Dean <williamsukatube@gmail.com>
Link: https://lore.kernel.org/r/20220723073222.2961602-1-williamsukatube@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Due to an invalid conflict resolution on my side while working on 2
different series (LAG FDBs and FDB isolation), dsa_switch_do_lag_fdb_add()
does not store the database associated with a dsa_mac_addr structure.
So after adding an FDB entry associated with a LAG, dsa_mac_addr_find()
fails to find it while deleting it, because &a->db is zeroized memory
for all stored FDB entries of lag->fdbs, and dsa_switch_do_lag_fdb_del()
returns -ENOENT rather than deleting the entry.
Fixes: c26933639b ("net: dsa: request drivers to perform FDB isolation")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220723012411.1125066-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix the inability to bring an interface up on a setup with
only MSI interrupts enabled (no MSI-X).
Solution is to add a default number of QPs = 1. This is enough,
since without MSI-X support driver enables only a basic feature set.
Fixes: bc6d33c8d9 ("i40e: Fix the number of queues available to be mapped for use")
Signed-off-by: Dawid Lukwinski <dawid.lukwinski@intel.com>
Signed-off-by: Michal Maloszewski <michal.maloszewski@intel.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20220722175401.112572-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Third set of patches for v5.20. MLO work continues and we have a lot
of stack changes due to that, including driver API changes. Not much
driver patches except on mt76.
Major changes:
cfg80211/mac80211
* more prepartion for Wi-Fi 7 Multi-Link Operation (MLO) support,
works with one link now
* align with IEEE Draft P802.11be_D2.0
* hardware timestamps for receive and transmit
mt76
* preparation for new chipset support
* ACPI SAR support
-----BEGIN PGP SIGNATURE-----
iQFFBAABCgAvFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmLe1k4RHGt2YWxvQGtl
cm5lbC5vcmcACgkQbhckVSbrbZvxlQf8DrZIllhF0q/7Wry3JuG0gbNA+V2eI/lc
OYrephsDBm/dvvyjcFWcdUzxoNk0k1+aOrx/09JijHFgCGKVwuK1+hxYVfjW2q43
9mHxJBo4NcMk1RDDM3paVuZ8QMHuYugbv2mQOZeAEq2XloAaqEM7wVE+bb4Mgtgx
VAKS5du2igrSt83wl8BRMFb9MPAM1EQ3Cw7Ro5T4y+1Qm/hrBm6qWizSpqh9CXYx
pDLR3pvQxiD4Axa0Uq3rUbyF4hLwciqSFOJvr2sI3q7b9YElA7wIi6NQzMkYJH6Z
7HW5K6UIQbblAaQkv2BLqpU1N6puTHUOAf5Md31vOAaOcGbSI5hbUA==
=Cnxg
-----END PGP SIGNATURE-----
Merge tag 'wireless-next-2022-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Kalle Valo says:
====================
wireless-next patches for v5.20
Third set of patches for v5.20. MLO work continues and we have a lot
of stack changes due to that, including driver API changes. Not much
driver patches except on mt76.
Major changes:
cfg80211/mac80211
- more prepartion for Wi-Fi 7 Multi-Link Operation (MLO) support,
works with one link now
- align with IEEE Draft P802.11be_D2.0
- hardware timestamps for receive and transmit
mt76
- preparation for new chipset support
- ACPI SAR support
* tag 'wireless-next-2022-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (254 commits)
wifi: mac80211: fix link data leak
wifi: mac80211: mlme: fix disassoc with MLO
wifi: mac80211: add macros to loop over active links
wifi: mac80211: remove erroneous sband/link validation
wifi: mac80211: mlme: transmit assoc frame with address translation
wifi: mac80211: verify link addresses are different
wifi: mac80211: rx: track link in RX data
wifi: mac80211: optionally implement MLO multicast TX
wifi: mac80211: expand ieee80211_mgmt_tx() for MLO
wifi: nl80211: add MLO link ID to the NL80211_CMD_FRAME TX API
wifi: mac80211: report link ID to cfg80211 on mgmt RX
wifi: cfg80211: report link ID in NL80211_CMD_FRAME
wifi: mac80211: add hardware timestamps for RX and TX
wifi: cfg80211: add hardware timestamps to frame RX info
wifi: cfg80211/nl80211: move rx management data into a struct
wifi: cfg80211: add a function for reporting TX status with hardware timestamps
wifi: nl80211: add RX and TX timestamp attributes
wifi: ieee80211: add helper functions for detecting TM/FTM frames
wifi: mac80211_hwsim: handle links for wmediumd/virtio
wifi: mac80211: sta_info: fix link_sta insertion
...
====================
Link: https://lore.kernel.org/r/20220725174547.EA465C341C6@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
These properties are inherited from ethernet-controller.yaml.
This fixes the dt_binding_check warning:
imx8mm-tqma8mqml-mba8mx.dt.yaml: ethernet@30be0000: 'nvmem-cell-names',
'nvmem-cells' do not match any of the regexes: 'pinctrl-[0-9]+'
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220720063924.1412799-1-alexander.stein@ew.tq-group.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tony Nguyen says:
====================
40GbE Intel Wired LAN Driver Updates 2022-07-22
This series contains updates to i40e and iavf drivers.
Przemyslaw adds a helper function for determining whether TC MQPRIO is
enabled for i40e.
Avinash utilizes the driver's bookkeeping of filters to check for
duplicate filter before sending the request to the PF for iavf.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
A typical flow offload scenario for OpenWrt users is routed traffic
received by the wan interface that is redirected to a wlan device
belonging to the lan bridge. Current implementation fails to
fill wdma offload info in mtk_flow_get_wdma_info() since odev device is
the local bridge. Fix the issue running dev_fill_forward_path routine in
mtk_flow_get_wdma_info in order to identify the wlan device.
Tested-by: Paolo Valerio <pvalerio@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel says:
====================
mlxsw: Spectrum-2 PTP preparations
This patchset includes various preparations required for Spectrum-2 PTP
support.
Most of the changes are non-functional (e.g., renaming, adding
registers). The only intentional user visible change is in patch #10
where the PHC time is initialized to zero in accordance with the
recommendation of the PTP maintainer.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The function mlxsw_sp_ptp_phc_adjfreq() configures MTUTC register to adjust
hardware frequency by a given value.
This configuration will be same for Spectrum-2. In preparation for
Spectrum-2 PTP support, rename the function to not be Spectrum-1 specific.
Later, it will be used for Spectrum-2 also.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Spectrum-1 and Spectrum-2 differ in their time stamping capabilities.
The former can be configured to time stamp only a subset of received PTP
events (e.g., only Sync), whereas the latter will time stamp all PTP
events or none.
In preparation for Spectrum-2 PTP support, rename the function that
parses the hardware time stamping configuration upon %SIOCSHWTSTAMP to
be Spectrum-1 specific.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, there is one shared structure that holds the required
structures for PTP clock. Most of the existing fields are relevant only
for Spectrum-1 (cycles, timecounter, and more). Rename the structure to
be specific for Spectrum-1 and align the existing code. Add a common
structure which includes the structures which will be used also for
Spectrum-2. This structure will be returned from clock_init() operation,
as the definition is shared between all ASICs' operations.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, there is one shared structure that holds the required
structures and details for PTP. Most of the existing fields are relevant
only for Spectrum-1 (hash table, lock for hash table, delayed work, and
more). Rename the structure to be specific for Spectrum-1 and align the
existing code. Add a common structure which includes
'struct mlxsw_sp *mlxsw_sp' and will be returned from ptp_init()
operation, as the definition is shared between all ASICs' operations.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the reading of FRC values (high and low) is done using macro
which calls to a function. In addition, to calculate the offset of FRC,
a simple macro is used. This code can be simplified by adding an helper
function and calculating the offset explicitly instead of using an
additional macro for that.
Add the helper function and convert the existing code. This helper will be
used later to read UTC clock.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As lately recommended in the mailing list[1], set the clock to zero time as
part of initialization.
The idea is that when the clock reads 'Jan 1, 1970', then it is clearly
wrong and user will not mistakenly think that the clock is set correctly.
If as part of initialization, the driver sets the clock, user might see
correct date and time (maybe with a small shift) and assume that there
is no need to sync the clock.
Fix the existing code of Spectrum-1 to set the 'timecounter' to zero.
[1]:
https://lore.kernel.org/netdev/20220201191041.GB7009@hoboy.vegasvil.org/
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename the 'read_frc_capable' bit to 'read_clock_capable' since now it can
be both the FRC and UTC clocks.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a resource identifier for maximum number of FIDs so that it could be
later used to query the information from firmware.
In Spectrum-2 and Spectrum-3, the correction field of PTP packets which are
sent as control packets is not updated at egress port. To overcome this
limitation, some packets will be sent as data packets. The header should
include FID, which is supposed to be 'Max FID + port - 1'. As preparation,
add the required resource, to be able to query the value from firmware
later.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the field FID in TX header is defined, but is not used as it is
relevant only for data packets. mlxsw driver currently sends all
host-generated traffic as control packets and not as data packets.
In Spectrum-2 and Spectrum-3, the correction field of PTP packets which
are sent as control packets is not updated at egress port. To overcome this
limitation while adding support for PTP, some packets will be sent as data
packets.
Fix the wrong shift in the definition, to allow using the field later.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The type of time stamp field in the CQE is configured via the
CONFIG_PROFILE command during driver initialization. Add the definition
of the relevant fields to the command's payload and set the type to UTC
for Spectrum-2 and above. This configuration can be done as part of the
preparations to PTP support, as the type of the time stamp will not break
any existing behavior.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add UTC sec and nsec PCI BAR and offset to query firmware command for a
future use.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Completion Queue Element version 2 (CQEv2) includes various metadata
fields of packets.
Add 'time_stamp' and 'time_stamp_type' fields along with functions to
extract the seconds and nanoseconds for a future use.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In Spectrum-2, all the packets are time stamped, the MTPCPC register is
used to configure the types of packets that will adjust the correction
field and which port will trap PTP packets.
If ingress correction is set on a port for a given packet type, then
when such a packet is received via the port, the current time stamp is
subtracted from the correction field.
If egress correction is set on a port for a given packet type, then when
such a packet is transmitted via the port, the current time stamp is
added to the correction field.
Assuming the systems is configured correctly, the above means that the
correction field will contain the transient delay between the ports.
Add this register for a future use in order to support PTP in Spectrum-2.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MTUTC register configures the HW UTC counter.
Add the relevant fields and operations to support PTP in Spectrum-2 and
update mlxsw_reg_mtutc_pack() with the new fields for a future use.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The right name of the register is MTPTPT, which refers to Monitoring
Precision Time Protocol Trap Register.
Therefore, rename the function mlxsw_reg_mtptptp_pack() to
mlxsw_reg_mtptpt_pack().
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2022-07-20
1) Don't set DST_NOPOLICY in IPv4, a recent patch made this
superfluous. From Eyal Birger.
2) Convert alg_key to flexible array member to avoid an iproute2
compile warning when built with gcc-12.
From Stephen Hemminger.
3) xfrm_register_km and xfrm_unregister_km do always return 0
so change the type to void. From Zhengchao Shao.
4) Fix spelling mistake in esp6.c
From Zhang Jiaming.
5) Improve the wording of comment above XFRM_OFFLOAD flags.
From Petr Vaněk.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Kuniyuki Iwashima says:
====================
sysctl: Fix data-races around ipv4_net_table (Round 6, Final).
This series fixes data-races around 11 knobs after tcp_pacing_ss_ratio
ipv4_net_table, and this is the final round for ipv4_net_table.
While at it, other data-races around these related knobs are fixed.
- decnet_mem
- decnet_rmem
- tipc_rmem
There are still 58 tables possibly missing some fixes under net/.
$ grep -rnE "struct ctl_table.*?\[\] =" net/ | wc -l
60
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
While reading sysctl_fib_notify_on_flag_change, it can be changed
concurrently. Thus, we need to add READ_ONCE() to its readers.
Fixes: 680aea08e7 ("net: ipv4: Emit notification when fib hardware flags are changed")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While reading sysctl_tcp_reflect_tos, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.
Fixes: ac8f1710c1 ("tcp: reflect tos value received in SYN to the socket")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Wei Wang <weiwan@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While reading sysctl_tcp_comp_sack_nr, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.
Fixes: 9c21d2fc41 ("tcp: add tcp_comp_sack_nr sysctl")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While reading sysctl_tcp_comp_sack_slack_ns, it can be changed
concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: a70437cc09 ("tcp: add hrtimer slack to sack compression")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While reading sysctl_tcp_comp_sack_delay_ns, it can be changed
concurrently. Thus, we need to add READ_ONCE() to its reader.
Fixes: 6d82aa2420 ("tcp: add tcp_comp_sack_delay_ns sysctl")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While reading these sysctl variables, they can be changed concurrently.
Thus, we need to add READ_ONCE() to their readers.
- .sysctl_rmem
- .sysctl_rwmem
- .sysctl_rmem_offset
- .sysctl_wmem_offset
- sysctl_tcp_rmem[1, 2]
- sysctl_tcp_wmem[1, 2]
- sysctl_decnet_rmem[1]
- sysctl_decnet_wmem[1]
- sysctl_tipc_rmem[1]
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While reading sysctl_tcp_pacing_(ss|ca)_ratio, they can be changed
concurrently. Thus, we need to add READ_ONCE() to their readers.
Fixes: 43e122b014 ("tcp: refine pacing rate determination")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mld_{query | report}_work() processes queued events.
If there are too many events in the queue, it re-queue a work.
And then, it returns without in6_dev_put().
But if queuing is failed, it should call in6_dev_put(), but it doesn't.
So, a reference count leak would occur.
THREAD0 THREAD1
mld_report_work()
spin_lock_bh()
if (!mod_delayed_work())
in6_dev_hold();
spin_unlock_bh()
spin_lock_bh()
schedule_delayed_work()
spin_unlock_bh()
Script to reproduce(by Hangbin Liu):
ip netns add ns1
ip netns add ns2
ip netns exec ns1 sysctl -w net.ipv6.conf.all.force_mld_version=1
ip netns exec ns2 sysctl -w net.ipv6.conf.all.force_mld_version=1
ip -n ns1 link add veth0 type veth peer name veth0 netns ns2
ip -n ns1 link set veth0 up
ip -n ns2 link set veth0 up
for i in `seq 50`; do
for j in `seq 100`; do
ip -n ns1 addr add 2021:${i}::${j}/64 dev veth0
ip -n ns2 addr add 2022:${i}::${j}/64 dev veth0
done
done
modprobe -r veth
ip -a netns del
splat looks like:
unregister_netdevice: waiting for veth0 to become free. Usage count = 2
leaked reference.
ipv6_add_dev+0x324/0xec0
addrconf_notify+0x481/0xd10
raw_notifier_call_chain+0xe3/0x120
call_netdevice_notifiers+0x106/0x160
register_netdevice+0x114c/0x16b0
veth_newlink+0x48b/0xa50 [veth]
rtnl_newlink+0x11a2/0x1a40
rtnetlink_rcv_msg+0x63f/0xc00
netlink_rcv_skb+0x1df/0x3e0
netlink_unicast+0x5de/0x850
netlink_sendmsg+0x6c9/0xa90
____sys_sendmsg+0x76a/0x780
__sys_sendmsg+0x27c/0x340
do_syscall_64+0x43/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Tested-by: Hangbin Liu <liuhangbin@gmail.com>
Fixes: f185de28d9 ("mld: add new workqueues for process mld events")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Harini Katakam says:
====================
macb: Add Versal compatible string to Macb driver
Add Versal device support.
v2:
- Sort compatible strings alphabetically in DT bindings.
- Reorganize new config and CAPS order to be cleaner.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
On Versal TSU clock cannot be disabled irrespective of whether PTP is
used. Hence introduce a new Versal config structure with a "need tsu"
caps flag and check the same in runtime_suspend/resume before cutting
off clocks.
More information on this for future reference:
This is an IP limitation on versions 1p11 and 1p12 when Qbv is enabled
(See designcfg1, bit 3). However it is better to rely on an SoC specific
check rather than the IP version because tsu clk property itself may not
represent actual HW tsu clock on some chip designs.
Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sort capability flags by the bit position set.
Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
init_rx_sa() allocates relevant resource for rx_sa->stats and rx_sa->
key.tfm with alloc_percpu() and macsec_alloc_tfm(). When some error
occurs after init_rx_sa() is called in macsec_add_rxsa(), the function
released rx_sa with kfree() without releasing rx_sa->stats and rx_sa->
key.tfm, which will lead to a resource leak.
We should call macsec_rxsa_put() instead of kfree() to decrease the ref
count of rx_sa and release the relevant resource if the refcount is 0.
The same bug exists in macsec_add_txsa() for tx_sa as well. This patch
fixes the above two bugs.
Fixes: 3cf3227a21 ("net: macsec: hardware offloading infrastructure")
Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sabrina Dubroca says:
====================
macsec: fix config issues
The patch adding netlink support for XPN (commit 48ef50fa86
("macsec: Netlink support of XPN cipher suites (IEEE 802.1AEbw)"))
introduced several issues, including a kernel panic reported at [1].
Reproducing those bugs with upstream iproute is limited, since iproute
doesn't currently support XPN. I'm also working on this.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=208315
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, MACSEC_SA_ATTR_PN is handled inconsistently, sometimes as a
u32, sometimes forced into a u64 without checking the actual length of
the attribute. Instead, we can use nla_get_u64 everywhere, which will
read up to 64 bits into a u64, capped by the actual length of the
attribute coming from userspace.
This fixes several issues:
- the check in validate_add_rxsa doesn't work with 32-bit attributes
- the checks in validate_add_txsa and validate_upd_sa incorrectly
reject X << 32 (with X != 0)
Fixes: 48ef50fa86 ("macsec: Netlink support of XPN cipher suites (IEEE 802.1AEbw)")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
IEEE 802.1AEbw-2013 (section 10.7.8) specifies that the maximum value
of the replay window is 2^30-1, to help with recovery of the upper
bits of the PN.
To avoid leaving the existing macsec device in an inconsistent state
if this test fails during changelink, reuse the cleanup mechanism
introduced for HW offload. This wasn't needed until now because
macsec_changelink_common could not fail during changelink, as
modifying the cipher suite was not allowed.
Finally, this must happen after handling IFLA_MACSEC_CIPHER_SUITE so
that secy->xpn is set.
Fixes: 48ef50fa86 ("macsec: Netlink support of XPN cipher suites (IEEE 802.1AEbw)")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The expected length is MACSEC_SALT_LEN, not MACSEC_SA_ATTR_SALT.
Fixes: 48ef50fa86 ("macsec: Netlink support of XPN cipher suites (IEEE 802.1AEbw)")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>