Commit Graph

1031329 Commits

Author SHA1 Message Date
Jakub Kicinski 396492b4c5 docs: networking: netdevsim rules
There are aspects of netdevsim which are commonly
misunderstood and pointed out in review. Cong
suggest we document them.

Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 12:43:27 +01:00
Peilin Ye 625af9f029 tc-testing: Add control-plane selftests for sch_mq
Recently we added multi-queue support to netdevsim in commit d4861fc6be
("netdevsim: Add multi-queue support"); add a few control-plane selftests
for sch_mq using this new feature.

Use nsPlugin.py to avoid network interface name collisions.

Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 12:42:27 +01:00
Vladimir Oltean a54182b2a5 Revert "net: build all switchdev drivers as modules when the bridge is a module"
This reverts commit b0e8181762. Explicit
driver dependency on the bridge is no longer needed since
switchdev_bridge_port_{,un}offload() is no longer implemented by the
bridge driver but by switchdev.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 12:35:07 +01:00
Vladimir Oltean 957e2235e5 net: make switchdev_bridge_port_{,unoffload} loosely coupled with the bridge
With the introduction of explicit offloading API in switchdev in commit
2f5dc00f7a ("net: bridge: switchdev: let drivers inform which bridge
ports are offloaded"), we started having Ethernet switch drivers calling
directly into a function exported by net/bridge/br_switchdev.c, which is
a function exported by the bridge driver.

This means that drivers that did not have an explicit dependency on the
bridge before, like cpsw and am65-cpsw, now do - otherwise it is not
possible to call a symbol exported by a driver that can be built as
module unless you are a module too.

There was an attempt to solve the dependency issue in the form of commit
b0e8181762 ("net: build all switchdev drivers as modules when the
bridge is a module"). Grygorii Strashko, however, says about it:

| In my opinion, the problem is a bit bigger here than just fixing the
| build :(
|
| In case, of ^cpsw the switchdev mode is kinda optional and in many
| cases (especially for testing purposes, NFS) the multi-mac mode is
| still preferable mode.
|
| There were no such tight dependency between switchdev drivers and
| bridge core before and switchdev serviced as independent, notification
| based layer between them, so ^cpsw still can be "Y" and bridge can be
| "M". Now for mostly every kernel build configuration the CONFIG_BRIDGE
| will need to be set as "Y", or we will have to update drivers to
| support build with BRIDGE=n and maintain separate builds for
| networking vs non-networking testing.  But is this enough?  Wouldn't
| it cause 'chain reaction' required to add more and more "Y" options
| (like CONFIG_VLAN_8021Q)?
|
| PS. Just to be sure we on the same page - ARM builds will be forced
| (with this patch) to have CONFIG_TI_CPSW_SWITCHDEV=m and so all our
| automation testing will just fail with omap2plus_defconfig.

In the light of this, it would be desirable for some configurations to
avoid dependencies between switchdev drivers and the bridge, and have
the switchdev mode as completely optional within the driver.

Arnd Bergmann also tried to write a patch which better expressed the
build time dependency for Ethernet switch drivers where the switchdev
support is optional, like cpsw/am65-cpsw, and this made the drivers
follow the bridge (compile as module if the bridge is a module) only if
the optional switchdev support in the driver was enabled in the first
place:
https://patchwork.kernel.org/project/netdevbpf/patch/20210802144813.1152762-1-arnd@kernel.org/

but this still did not solve the fact that cpsw and am65-cpsw now must
be built as modules when the bridge is a module - it just expressed
correctly that optional dependency. But the new behavior is an apparent
regression from Grygorii's perspective.

So to support the use case where the Ethernet driver is built-in,
NET_SWITCHDEV (a bool option) is enabled, and the bridge is a module, we
need a framework that can handle the possible absence of the bridge from
the running system, i.e. runtime bloatware as opposed to build-time
bloatware.

Luckily we already have this framework, since switchdev has been using
it extensively. Events from the bridge side are transmitted to the
driver side using notifier chains - this was originally done so that
unrelated drivers could snoop for events emitted by the bridge towards
ports that are implemented by other drivers (think of a switch driver
with LAG offload that listens for switchdev events on a bonding/team
interface that it offloads).

There are also events which are transmitted from the driver side to the
bridge side, which again are modeled using notifiers.
SWITCHDEV_FDB_ADD_TO_BRIDGE is an example of this, and deals with
notifying the bridge that a MAC address has been dynamically learned.
So there is a precedent we can use for modeling the new framework.

The difference compared to SWITCHDEV_FDB_ADD_TO_BRIDGE is that the work
that the bridge needs to do when a port becomes offloaded is blocking in
its nature: replay VLANs, MDBs etc. The calling context is indeed
blocking (we are under rtnl_mutex), but the existing switchdev
notification chain that the bridge is subscribed to is only the atomic
one. So we need to subscribe the bridge to the blocking switchdev
notification chain too.

This patch:
- keeps the driver-side perception of the switchdev_bridge_port_{,un}offload
  unchanged
- moves the implementation of switchdev_bridge_port_{,un}offload from
  the bridge module into the switchdev module.
- makes everybody that is subscribed to the switchdev blocking notifier
  chain "hear" offload & unoffload events
- makes the bridge driver subscribe and handle those events
- moves the bridge driver's handling of those events into 2 new
  functions called br_switchdev_port_{,un}offload. These functions
  contain in fact the core of the logic that was previously in
  switchdev_bridge_port_{,un}offload, just that now we go through an
  extra indirection layer to reach them.

Unlike all the other switchdev notification structures, the structure
used to carry the bridge port information, struct
switchdev_notifier_brport_info, does not contain a "bool handled".
This is because in the current usage pattern, we always know that a
switchdev bridge port offloading event will be handled by the bridge,
because the switchdev_bridge_port_offload() call was initiated by a
NETDEV_CHANGEUPPER event in the first place, where info->upper_dev is a
bridge. So if the bridge wasn't loaded, then the CHANGEUPPER event
couldn't have happened.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 12:35:07 +01:00
David S. Miller 9c0532f9cc linux-can-next-for-5.15-20210804
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEK3kIWJt9yTYMP3ehqclaivrt76kFAmEKaBUTHG1rbEBwZW5n
 dXRyb25peC5kZQAKCRCpyVqK+u3vqSvgCACpR64hydl7/qt9QGnm9Ym6/v/L9y9v
 aBfZMQsedP1GSuev5PpxghXU4GF0LXiDr6ryr0hhu7w2ojjlLNl9sVHCF9qdAJKz
 x2D4YTlxct2KuPBdhWllQr/KWFbJh2IzarHEWzdo+QoU5A8jDlsK2kLeeikFECzT
 fVUe3mu1k66/DvHsetsfzIvbUkuHk2SPpK/pwrUC6Siw6wQZBHlSoUEtBNwEPlyH
 8+ZQJPqtrjr2v3mZUOkgHrlXEOZRu6OM3i1Yv2bn2x4VI+3KQHEw/cA1WNE2AOzN
 CfMp4sS98QdCrAboX4VJZpGAbziTFHedqFjjIP9ultCfH9ROHhQj4Zsl
 =37wt
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-next-for-5.15-20210804' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2021-08-04

this is a pull request of 5 patches for net-next/master.

The first patch is by me and fixes a typo in a comment in the CAN
J1939 protocol.

The next 2 patches are by Oleksij Rempel and update the CAN J1939
protocol to send RX status updates via the error queue mechanism.

The next patch is by me and adds a missing variable initialization to
the flexcan driver (the problem was introduced in the current net-next
cycle).

The last patch is by Aswath Govindraju and adds power-domains to the
Bosch m_can DT binding documentation.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 11:30:09 +01:00
Aswath Govindraju d85165b238 dt-bindings: net: can: Document power-domains property
Document power-domains property for adding the Power domain provider.

Link: https://lore.kernel.org/r/20210802091822.16407-1-a-govindraju@ti.com
Signed-off-by: Aswath Govindraju <a-govindraju@ti.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-08-04 12:11:57 +02:00
Marc Kleine-Budde 3362666972 can: flexcan: flexcan_clks_enable(): add missing variable initialization
This patch adds the missing initialization of the "err" variable in
the flexcan_clks_enable() function.

Fixes: d9cead75b1 ("can: flexcan: add mcf5441x support")
Link: https://lore.kernel.org/r/20210728075428.1493568-1-mkl@pengutronix.de
Reported-by: kernel test robot <lkp@intel.com>
Cc: Angelo Dureghello <angelo@kernel-space.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-08-04 12:11:56 +02:00
Oleksij Rempel 5b9272e93f can: j1939: extend UAPI to notify about RX status
To be able to create applications with user friendly feedback, we need be
able to provide receive status information.

Typical ETP transfer may take seconds or even hours. To give user some
clue or show a progress bar, the stack should push status updates.
Same as for the TX information, the socket error queue will be used with
following new signals:
- J1939_EE_INFO_RX_RTS   - received and accepted request to send signal.
- J1939_EE_INFO_RX_DPO   - received data package offset signal
- J1939_EE_INFO_RX_ABORT - RX session was aborted

Instead of completion signal, user will get data package.
To activate this signals, application should set
SOF_TIMESTAMPING_RX_SOFTWARE to the SO_TIMESTAMPING socket option. This
will avoid unpredictable application behavior for the old software.

Link: https://lore.kernel.org/r/20210707094854.30781-3-o.rempel@pengutronix.de
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-08-04 12:11:52 +02:00
Oleksij Rempel cd85d3aed5 can: j1939: rename J1939_ERRQUEUE_* to J1939_ERRQUEUE_TX_*
Prepare the world for the J1939_ERRQUEUE_RX_ version

Link: https://lore.kernel.org/r/20210707094854.30781-2-o.rempel@pengutronix.de
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-08-04 12:11:48 +02:00
Sean Christopherson 179c6c27bf KVM: SVM: Fix off-by-one indexing when nullifying last used SEV VMCB
Use the raw ASID, not ASID-1, when nullifying the last used VMCB when
freeing an SEV ASID.  The consumer, pre_sev_run(), indexes the array by
the raw ASID, thus KVM could get a false negative when checking for a
different VMCB if KVM manages to reallocate the same ASID+VMCB combo for
a new VM.

Note, this cannot cause a functional issue _in the current code_, as
pre_sev_run() also checks which pCPU last did VMRUN for the vCPU, and
last_vmentry_cpu is initialized to -1 during vCPU creation, i.e. is
guaranteed to mismatch on the first VMRUN.  However, prior to commit
8a14fe4f0c ("kvm: x86: Move last_cpu into kvm_vcpu_arch as
last_vmentry_cpu"), SVM tracked pCPU on its own and zero-initialized the
last_cpu variable.  Thus it's theoretically possible that older versions
of KVM could miss a TLB flush if the first VMRUN is on pCPU0 and the ASID
and VMCB exactly match those of a prior VM.

Fixes: 70cd94e60c ("KVM: SVM: VMRUN should use associated ASID when SEV is enabled")
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-04 06:02:09 -04:00
Paolo Bonzini 85cd39af14 KVM: Do not leak memory for duplicate debugfs directories
KVM creates a debugfs directory for each VM in order to store statistics
about the virtual machine.  The directory name is built from the process
pid and a VM fd.  While generally unique, it is possible to keep a
file descriptor alive in a way that causes duplicate directories, which
manifests as these messages:

  [  471.846235] debugfs: Directory '20245-4' with parent 'kvm' already present!

Even though this should not happen in practice, it is more or less
expected in the case of KVM for testcases that call KVM_CREATE_VM and
close the resulting file descriptor repeatedly and in parallel.

When this happens, debugfs_create_dir() returns an error but
kvm_create_vm_debugfs() goes on to allocate stat data structs which are
later leaked.  The slow memory leak was spotted by syzkaller, where it
caused OOM reports.

Since the issue only affects debugfs, do a lookup before calling
debugfs_create_dir, so that the message is downgraded and rate-limited.
While at it, ensure kvm->debugfs_dentry is NULL rather than an error
if it is not created.  This fixes kvm_destroy_vm_debugfs, which was not
checking IS_ERR_OR_NULL correctly.

Cc: stable@vger.kernel.org
Fixes: 536a6f88c4 ("KVM: Create debugfs dir and stat files for each VM")
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-04 06:02:03 -04:00
David S. Miller d00551b402 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2021-08-04

1) Fix a sysbot reported memory leak in xfrm_user_rcv_msg.
   From Pavel Skripkin.

2) Revert "xfrm: policy: Read seqcount outside of rcu-read side
   in xfrm_policy_lookup_bytype". This commit tried to fix a
   lockin bug, but only cured some of the symptoms. A proper
   fix is applied on top of this revert.

3) Fix a locking bug on xfrm state hash resize. A recent change
   on sequence counters accidentally repaced a spinlock by a mutex.
   Fix from Frederic Weisbecker.

4) Fix possible user-memory-access in xfrm_user_rcv_msg_compat().
   From Dmitry Safonov.

5) Add initialiation sefltest fot xfrm_spdattr_type_t.
   From Dmitry Safonov.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:45:41 +01:00
David S. Miller ff0ee9dfe8 Merge branch 'pegasus-errors'
Petko Manolov says:

====================
net: usb: pegasus: better error checking and DRIVER_VERSION removal

v3:

Pavel Skripkin again: make sure -ETIMEDOUT is returned by __mii_op() on timeout
condition;

v2:

Special thanks to Pavel Skripkin for the review and who caught a few bugs.
setup_pegasus_II() would not print an erroneous message on the success path.

v1:

Add error checking for get_registers() and derivatives.  If the usb transfer
fail then just don't use the buffer where the legal data should have been
returned.

Remove DRIVER_VERSION per Greg KH request.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:38:42 +01:00
Petko Manolov bc65bacf23 net: usb: pegasus: Remove the changelog and DRIVER_VERSION.
These are now deemed redundant.

Signed-off-by: Petko Manolov <petkan@nucleusys.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:38:42 +01:00
Petko Manolov 8a160e2e9a net: usb: pegasus: Check the return value of get_geristers() and friends;
Certain call sites of get_geristers() did not do proper error handling.  This
could be a problem as get_geristers() typically return the data via pointer to a
buffer.  If an error occurred the code is carelessly manipulating the wrong data.

Signed-off-by: Petko Manolov <petkan@nucleusys.com>
Reviewed-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:38:42 +01:00
Eric Dumazet 51b8f812e5 ipv6: exthdrs: get rid of indirect calls in ip6_parse_tlv()
As presented last month in our "BIG TCP" talk at netdev 0x15,
we plan using IPv6 jumbograms.

One of the minor problem we talked about is the fact that
ip6_parse_tlv() is currently using tables to list known tlvs,
thus using potentially expensive indirect calls.

While we could mitigate this cost using macros from
indirect_call_wrapper.h, we also can get rid of the tables
and let the compiler emit optimized code.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Justin Iurman <justin.iurman@uliege.be>
Cc: Coco Li <lixiaoyan@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:34:40 +01:00
David S. Miller d851798584 Merge branch 'm7530-sw-fallback'
DENG Qingfang says:

====================
mt7530 software fallback bridging fix

DSA core has gained software fallback support since commit 2f5dc00f7a,
but it does not work properly on mt7530. This patch series fixes the
issues.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:30:00 +01:00
DENG Qingfang 73c447cacb net: dsa: mt7530: always install FDB entries with IVL and FID 1
This reverts commit 7e77702178 ("mt7530 mt7530_fdb_write only set ivl
bit vid larger than 1").

Before this series, the default value of all ports' PVID is 1, which is
copied into the FDB entry, even if the ports are VLAN unaware. So
`bridge fdb show` will show entries like `dev swp0 vlan 1 self` even on
a VLAN-unaware bridge.

The blamed commit does not solve that issue completely, instead it may
cause a new issue that FDB is inaccessible in a VLAN-aware bridge with
PVID 1.

This series sets PVID to 0 on VLAN-unaware ports, so `bridge fdb show`
will no longer print `vlan 1` on VLAN-unaware bridges, and that special
case in fdb_write is not required anymore.

Set FDB entries' filter ID to 1 to match the VLAN table.

Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:30:00 +01:00
DENG Qingfang a9e3f62dff net: dsa: mt7530: set STP state on filter ID 1
As filter ID 1 is the only one used for bridges, set STP state on it.

Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:30:00 +01:00
DENG Qingfang 6087175b79 net: dsa: mt7530: use independent VLAN learning on VLAN-unaware bridges
Consider the following bridge configuration, where bond0 is not
offloaded:

         +-- br0 --+
        / /   |     \
       / /    |      \
      /  |    |     bond0
     /   |    |     /   \
   swp0 swp1 swp2 swp3 swp4
     .        .       .
     .        .       .
     A        B       C

Ideally, when the switch receives a packet from swp3 or swp4, it should
forward the packet to the CPU, according to the port matrix and unknown
unicast flood settings.

But packet loss will happen if the destination address is at one of the
offloaded ports (swp0~2). For example, when client C sends a packet to
A, the FDB lookup will indicate that it should be forwarded to swp0, but
the port matrix of swp3 and swp4 is configured to only allow the CPU to
be its destination, so it is dropped.

However, this issue does not happen if the bridge is VLAN-aware. That is
because VLAN-aware bridges use independent VLAN learning, i.e. use VID
for FDB lookup, on offloaded ports. As swp3 and swp4 are not offloaded,
shared VLAN learning with default filter ID of 0 is used instead. So the
lookup for A with filter ID 0 never hits and the packet can be forwarded
to the CPU.

In the current code, only two combinations were used to toggle user
ports' VLAN awareness: one is PCR.PORT_VLAN set to port matrix mode with
PVC.VLAN_ATTR set to transparent port, the other is PCR.PORT_VLAN set to
security mode with PVC.VLAN_ATTR set to user port.

It turns out that only PVC.VLAN_ATTR contributes to VLAN awareness, and
port matrix mode just skips the VLAN table lookup. The reference manual
is somehow misleading when describing PORT_VLAN modes. It states that
PORT_MEM (VLAN port member) is used for destination if the VLAN table
lookup hits, but actually **PORT_MEM & PORT_MATRIX** (bitwise AND of
VLAN port member and port matrix) is used instead, which means we can
have two or more separate VLAN-aware bridges with the same PVID and
traffic won't leak between them.

Therefore, to solve this, enable independent VLAN learning with PVID 0
on VLAN-unaware bridges, by setting their PCR.PORT_VLAN to fallback
mode, while leaving standalone ports in port matrix mode. The CPU port
is always set to fallback mode to serve those bridges.

During testing, it is found that FDB lookup with filter ID of 0 will
also hit entries with VID 0 even with independent VLAN learning. To
avoid that, install all VLANs with filter ID of 1.

Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:30:00 +01:00
DENG Qingfang 0b69c54c74 net: dsa: mt7530: enable assisted learning on CPU port
Consider the following bridge configuration, where bond0 is not
offloaded:

         +-- br0 --+
        / /   |     \
       / /    |      \
      /  |    |     bond0
     /   |    |     /   \
   swp0 swp1 swp2 swp3 swp4
     .        .       .
     .        .       .
     A        B       C

Address learning is enabled on offloaded ports (swp0~2) and the CPU
port, so when client A sends a packet to C, the following will happen:

1. The switch learns that client A can be reached at swp0.
2. The switch probably already knows that client C can be reached at the
   CPU port, so it forwards the packet to the CPU.
3. The bridge core knows client C can be reached at bond0, so it
   forwards the packet back to the switch.
4. The switch learns that client A can be reached at the CPU port.
5. The switch forwards the packet to either swp3 or swp4, according to
   the packet's tag.

That makes client A's MAC address flap between swp0 and the CPU port. If
client B sends a packet to A, it is possible that the packet is
forwarded to the CPU. With offload_fwd_mark = 1, the bridge core won't
forward it back to the switch, resulting in packet loss.

As we have the assisted_learning_on_cpu_port in DSA core now, enable
that and disable hardware learning on the CPU port.

Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Vladimir Oltean <oltean@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:30:00 +01:00
David S. Miller 8eceea4134 Merge branch 'ipa-pm-irqs'
Alex Elder says:

====================
net: ipa: prepare GSI interrupts for runtime PM

The last patch in this series arranges for GSI interrupts to be
disabled when the IPA hardware is suspended.  This ensures the clock
is always operational when a GSI interrupt fires.  Leading up to
that are patches that rearrange the code a bit to allow this to
be done.

The first two patches aren't *directly* related.  They remove some
flag arguments to some GSI suspend/resume related functions, using
the version field now present in the GSI structure.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:12:05 +01:00
Alex Elder 45a42a3c50 net: ipa: disable GSI interrupts while suspended
Introduce new functions gsi_suspend() and gsi_resume(), which will
disable the GSI interrupt handler after all endpoints are suspended
and re-enable it before endpoints are resumed.  This will ensure no
GSI interrupt handler will fire when the hardware is suspended.

Here's a little further explanation.  There are seven GSI interrupt
types, and most are disabled except when needed.
  - These two are not used (never enabled):
      GSI_INTER_EE_CH_CTRL
      GSI_INTER_EE_EV_CTRL
  - These two are only used to implement channel and event ring
    commands, and are only enabled while a command is underway:
      GSI_CH_CTRL
      GSI_EV_CTRL
  - The IEOB interrupt signals I/O completion.  It will not fire
    when a channel is stopped (or "suspended").
      GSI_IEOB
  - This interrupt is used to allocate or halt modem channels,
    and is only enabled while such a command is underway.
      GSI_GLOB_EE
    However it also is used to signal certain errors, and this could
    occur at any time.
  - The general interrupt signals general errors, and could occur at
    any time.
      GSI_GENERAL

The purpose for this change is to ensure no global or general
interrupts fire due to errors while the hardware is suspended.
We enable the clock on resume, and at that time we can "handle"
(at least report) these error conditions.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:12:05 +01:00
Alex Elder b176f95b57 net: ipa: move gsi_irq_init() code into setup
The GSI IRQ handler could be triggered as soon as it is registered
with request_irq().  The handler function, gsi_isr(), touches
hardware, meaning the IPA clock must be operational.  The IPA clock
is not operating when the handler is registered (in gsi_irq_init()),
so this is a problem.

Move the call to request_irq() for the GSI interrupt handler into
gsi_irq_setup(), which is called when the IPA clock is known to be
operational (and furthermore, the GSI firmware will have been
loaded).  Request the IRQ at the end of that function, after all
interrupt types have been disabled and masked.

Move the matching free_irq() call into gsi_irq_teardown(), and get
rid of the now empty gsi_irq_exit(),

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:12:05 +01:00
Alex Elder 1657d8a458 net: ipa: have gsi_irq_setup() return an error code
Change gsi_irq_setup() so it returns an error value, and introduce
gsi_irq_teardown() as its inverse.  Set the interrupt type (IRQ
rather than MSI) in gsi_irq_setup().

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:12:05 +01:00
Alex Elder a7860a5f89 net: ipa: move some GSI setup functions
Move gsi_irq_setup() and gsi_ring_setup() so they're defined right
above gsi_setup() where they're called.  This is a trivial movement
of code to prepare for upcoming patches.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:12:05 +01:00
Alex Elder 4a4ba483e4 net: ipa: move version check for channel suspend/resume
Change the Boolean flags passed to __gsi_channel_start() and
__gsi_channel_stop() so they represent whether the request is being
made to implement suspend (versus stop) or resume (versus start).

Then stop or start the channel for suspend/resume requests only if
the hardware version indicates it should be done.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:12:05 +01:00
Alex Elder decfef0fa6 net: ipa: use gsi->version for channel suspend/resume
The GSI layer has the IPA version now, so there's no need for
version-specific flags to be passed from IPA.  One instance of
this is in gsi_channel_suspend() and gsi_channel_resume(), which
indicate whether or not the endpoint suspend is implemented by
GSI stopping the channel.  We can make that determination based
on gsi->version, eliminating the need for a Boolean flag in those
functions.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:12:05 +01:00
David S. Miller 93bbcfee05 Merge branch 'mhi-mbim'
Loic Poulain says:

====================
net: mhi: move MBIM to WWAN

Implement a proper WWAN driver for MBIM network protocol, with multi link
management supported through the WWAN framework (wwan rtnetlink).

Until now, MBIM over MHI was supported directly in the mhi_net driver, via
some protocol rx/tx fixup callbacks, but with only one session supported
(no multilink muxing). We can then remove that part from mhi_net and restore
the driver to a simpler version for 'raw' ip transfer (or QMAP via rmnet link).

Note that a wwan0 link is created by default for session-id 0. Additional links
can be managed via ip tool:

    $ ip link add dev wwan0mms parentdev wwan0 type wwan linkid 1
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:10:12 +01:00
Loic Poulain 7ffa7542ec net: mhi: Remove MBIM protocol
The MBIM protocol has now been integrated in a proper WWAN driver. We
can then revert back to a simpler driver for mhi_net, which is used
for raw IP or QMAP protocol (via rmnet link).

- Remove protocol management
- Remove WWAN framework usage (only valid for mbim)
- Remove net/mhi directory for simpler mhi_net.c file

Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:10:12 +01:00
Loic Poulain aa730a9905 net: wwan: Add MHI MBIM network driver
Add new wwan driver for MBIM over MHI. MBIM is a transport protocol
for IP packets, allowing packet aggregation and muxing. Initially
designed for USB bus, it is also exposed through MHI bus for QCOM
based PCIe wwan modems.

This driver supports the new wwan rtnetlink interface for multi-link
management and has been tested with Quectel EM120R-GL M2 module.

Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:10:12 +01:00
David S. Miller 8730379ee0 Merge branch 'queues'
Jakub Kicinski says:

====================
net: add netif_set_real_num_queues() for device reconfig

This short set adds a helper to make the implementation of
two-phase NIC reconfig easier.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:05:13 +01:00
Jakub Kicinski e874f4557b nfp: use netif_set_real_num_queues()
Avoid reconfig problems due to failures in netif_set_real_num_tx_queues()
by using netif_set_real_num_queues().

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:05:13 +01:00
Jakub Kicinski 271e5b7d00 net: add netif_set_real_num_queues() for device reconfig
netif_set_real_num_rx_queues() and netif_set_real_num_tx_queues()
can fail which breaks drivers trying to implement reconfiguration
in a way that can't leave the device half-broken. In other words
those functions are incompatible with prepare/commit approach.

Luckily setting real number of queues can fail only if the number
is increased, meaning that if we order operations correctly we
can guarantee ending up with either new config (success), or
the old one (on error).

Provide a helper implementing such logic so that drivers don't
have to duplicate it.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:05:13 +01:00
Rocco Yue 8679c31e02 net: add extack arg for link ops
Pass extack arg to validate_linkmsg and validate_link_af callbacks.
If a netlink attribute has a reject_message, use the extended ack
mechanism to carry the message back to user space.

Signed-off-by: Rocco Yue <rocco.yue@mediatek.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:01:26 +01:00
Leon Romanovsky 13a9c4ac31 net/prestera: Fix devlink groups leakage in error flow
Devlink trap group is registered but not released in error flow,
add the missing devlink_trap_groups_unregister() call.

Fixes: 0a9003f45e ("net: marvell: prestera: devlink: add traps/groups implementation")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 10:00:36 +01:00
Yunsheng Lin 06f5553e0f net: sched: fix lockdep_set_class() typo error for sch->seqlock
According to comment in qdisc_alloc(), sch->seqlock's lockdep
class key should be set to qdisc_tx_busylock, due to possible
type error, sch->busylock's lockdep class key is set to
qdisc_tx_busylock, which is duplicated because sch->busylock's
lockdep class key is already set in qdisc_alloc().

So fix it by replacing sch->busylock with sch->seqlock.

Fixes: 96009c7d50 ("sched: replace __QDISC_STATE_RUNNING bit with a spin lock")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 09:59:24 +01:00
Rao Shoaib 314001f0bf af_unix: Add OOB support
This patch adds OOB support for AF_UNIX sockets.
The semantics is same as TCP.

The last byte of a message with the OOB flag is
treated as the OOB byte. The byte is separated into
a skb and a pointer to the skb is stored in unix_sock.
The pointer is used to enforce OOB semantics.

Signed-off-by: Rao Shoaib <rao.shoaib@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 09:55:52 +01:00
David S. Miller 7e89350c90 Merge branch 'dpaa2-switch-next'
Ioana Ciornei says:

====================
dpaa2-switch: integrate the MAC endpoint support

This patch set integrates the already available MAC support into the
dpaa2-switch driver as well.

The first 4 patches are fixing up some minor problems or optimizing the
code, while the remaining ones are actually integrating the dpaa2-mac
support into the switch driver by calling the dpaa2_mac_* provided
functions. While at it, we also export the MAC statistics in ethtool
like we do for dpaa2-eth.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 09:53:48 +01:00
Ioana Ciornei f0653a8920 dpaa2-switch: export MAC statistics in ethtool
If a switch port is connected to a MAC, use the common dpaa2-mac support
for exporting the available MAC statistics.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 09:53:34 +01:00
Ioana Ciornei 8581362d9c dpaa2-switch: add a prefix to HW ethtool stats
In the next patch, we'll add support for also exporting the MAC
statistics in the ethtool stats. Annotate already present HW stats with
a suggestive prefix.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 09:53:34 +01:00
Ioana Ciornei 84cba72956 dpaa2-switch: integrate the MAC endpoint support
Integrate the common MAC endpoint management support into the
dpaa2-switch driver as well. Nothing special happens here, just that the
already available dpaa2-mac functions are also called from dpaa2-switch.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 09:53:34 +01:00
Ioana Ciornei 27cfdadd68 bus: fsl-mc: extend fsl_mc_get_endpoint() to pass interface ID
In case of a switch DPAA2 object, the interface ID is also needed when
querying for the object endpoint. Extend fsl_mc_get_endpoint() so that
users can also pass the interface ID that are interested in.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 09:53:34 +01:00
Ioana Ciornei 2b24ffd83e dpaa2-switch: no need to check link state right after ndo_open
The call to dpaa2_switch_port_link_state_update is a leftover from the
time when on DPAA2 platforms the PHYs were started at boot time so when
an ifconfig was issued on the associated interface, the link status
needed to be checked directly from the ndo_open() callback.  This is not
needed anymore since we are now properly integrated with the PHY layer
thus a link interrupt will come directly from the PHY eventually without
the need to call the sync function.
Fix this up by removing the call to dpaa2_switch_port_link_state_update.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 09:53:34 +01:00
Ioana Ciornei 042ad90ca7 dpaa2-switch: do not enable the DPSW at probe time
We should not enable the switch interfaces at probe time since this is
trigged by the open callback. Remove the call dpsw_enable() which does
exactly this.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 09:53:34 +01:00
Ioana Ciornei 24ab724f8a dpaa2-switch: use the port index in the IRQ handler
The MC firmware supplies us the switch interface index for which an
interrupt was triggered. Use this to our advantage instead of looping
through all the switch ports and doing unnecessary work.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 09:53:34 +01:00
Ioana Ciornei 1ca6cf5ecb dpaa2-switch: request all interrupts sources on the DPSW
Request all interrupt sources to be read and then cleared on the DPSW
object. In the next patches we'll also add support for treating other
interrupts.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-04 09:53:33 +01:00
Oleksij Rempel d1a58c013a net: dsa: qca: ar9331: reorder MDIO write sequence
In case of this switch we work with 32bit registers on top of 16bit
bus. Some registers (for example access to forwarding database) have
trigger bit on the first 16bit half of request and the result +
configuration of request in the second half. Without this patch, we would
trigger database operation and overwrite result in one run.

To make it work properly, we should do the second part of transfer
before the first one is done.

So far, this rule seems to work for all registers on this switch.

Fixes: ec6698c272 ("net: dsa: add support for Atheros AR9331 built-in switch")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20210803063746.3600-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-03 14:35:28 -07:00
Joakim Zhang b820c114eb net: fec: fix MAC internal delay doesn't work
This patch intends to fix MAC internal delay doesn't work, due to use
of_property_read_u32() incorrectly, and improve this feature a bit:
1) check the delay value if valid, only program register when it's 2000ps.
2) only enable "enet_2x_txclk" clock when require MAC internal delay.

Fixes: fc539459e9 ("net: fec: add MAC internal delayed clock feature support")
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Link: https://lore.kernel.org/r/20210803052424.19008-1-qiangqing.zhang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-03 14:34:32 -07:00
Vladimir Oltean 421297efe6 net: dsa: tag_sja1105: consistently fail with arbitrary input
Dan Carpenter's smatch tests report that the "vid" variable, populated
by sja1105_vlan_rcv when an skb is received by the tagger that has a
VLAN ID which cannot be decoded by tag_8021q, may be uninitialized when
used here:

	if (source_port == -1 || switch_id == -1)
		skb->dev = dsa_find_designated_bridge_port_by_vid(netdev, vid);

The sja1105 driver, by construction, sets up the switch in a way that
all data plane packets sent towards the CPU port are VLAN-tagged. So it
is practically impossible, in a functional system, for a packet to be
processed by sja1110_rcv() which is not a control packet and does not
have a VLAN header either.

However, it would be nice if the sja1105 tagging driver could
consistently do something valid, for example fail, even if presented with
packets that do not hold valid sja1105 tags. Currently it is a bit hard
to argue that it does that, given the fact that a data plane packet with
no VLAN tag will trigger a call to dsa_find_designated_bridge_port_by_vid
with a vid argument that is an uninitialized stack variable.

To fix this, we can initialize the u16 vid variable with 0, a value that
can never be a bridge VLAN, so dsa_find_designated_bridge_port_by_vid
will always return a NULL skb->dev.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20210802195137.303625-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-03 14:31:54 -07:00