Commit Graph

814979 Commits

Author SHA1 Message Date
Andy Shevchenko 29ca1c5a4b net-sysfs: Switch to bitmap_zalloc()
Switch to bitmap_zalloc() to show clearly what we are allocating.
Besides that it returns pointer of bitmap type instead of opaque void *.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-04 10:20:39 -08:00
Andy Shevchenko 214fa1c437 mellanox: Switch to bitmap_zalloc()
Switch to bitmap_zalloc() to show clearly what we are allocating.
Besides that it returns pointer of bitmap type instead of opaque void *.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-04 10:19:33 -08:00
David S. Miller f7fb7c1a1c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2019-03-04

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Add AF_XDP support to libbpf. Rationale is to facilitate writing
   AF_XDP applications by offering higher-level APIs that hide many
   of the details of the AF_XDP uapi. Sample programs are converted
   over to this new interface as well, from Magnus.

2) Introduce a new cant_sleep() macro for annotation of functions
   that cannot sleep and use it in BPF_PROG_RUN() to assert that
   BPF programs run under preemption disabled context, from Peter.

3) Introduce per BPF prog stats in order to monitor the usage
   of BPF; this is controlled by kernel.bpf_stats_enabled sysctl
   knob where monitoring tools can make use of this to efficiently
   determine the average cost of programs, from Alexei.

4) Split up BPF selftest's test_progs similarly as we already
   did with test_verifier. This allows to further reduce merge
   conflicts in future and to get more structure into our
   quickly growing BPF selftest suite, from Stanislav.

5) Fix a bug in BTF's dedup algorithm which can cause an infinite
   loop in some circumstances; also various BPF doc fixes and
   improvements, from Andrii.

6) Various BPF sample cleanups and migration to libbpf in order
   to further isolate the old sample loader code (so we can get
   rid of it at some point), from Jakub.

7) Add a new BPF helper for BPF cgroup skb progs that allows
   to set ECN CE code point and a Host Bandwidth Manager (HBM)
   sample program for limiting the bandwidth used by v2 cgroups,
   from Lawrence.

8) Enable write access to skb->queue_mapping from tc BPF egress
   programs in order to let BPF pick TX queue, from Jesper.

9) Fix a bug in BPF spinlock handling for map-in-map which did
   not propagate spin_lock_off to the meta map, from Yonghong.

10) Fix a bug in the new per-CPU BPF prog counters to properly
    initialize stats for each CPU, from Eric.

11) Add various BPF helper prototypes to selftest's bpf_helpers.h,
    from Willem.

12) Fix various BPF samples bugs in XDP and tracing progs,
    from Toke, Daniel and Yonghong.

13) Silence preemption splat in test_bpf after BPF_PROG_RUN()
    enforces it now everywhere, from Anders.

14) Fix a signedness bug in libbpf's btf_dedup_ref_type() to
    get error handling working, from Dan.

15) Fix bpftool documentation and auto-completion with regards
    to stream_{verdict,parser} attach types, from Alban.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-04 10:14:31 -08:00
Daniel Borkmann 87dab7c3d5 bpf: add test cases for non-pointer sanitiation logic
Add two additional tests for further asserting the
BPF_ALU_NON_POINTER logic with cases that were missed
previously.

Cc: Marek Majkowski <marek@cloudflare.com>
Cc: Arthur Fabre <afabre@cloudflare.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-03-04 10:54:20 +01:00
David S. Miller 8c4238df4d Merge branch 'mlxsw-minimal-Add-ethtool-and-resource-query-support'
Ido Schimmel says:

====================
mlxsw: minimal: Add ethtool and resource query support

Vadim says:

The minimal driver is chip independent and uses I2C bus for chip access.
Its purpose is to support chassis management on systems equipped with
Mellanox switch ASICs. For example, from a BMC (Board Management
Controller) device.

Patches #1-#3 add ethtool support to the minimal driver so that QSFP/SFP
module info could be retrieved by the driver. This is done by exposing a
dummy netdev for each front panel port and implementing the required
ethtool operations.

Patches #4-#8 add resource query support. This allows the driver to
query the firmware about values of certain resources (e.g., maximum
number of ports). It is required on systems where the maximum number of
ports is larger than the hard coded default (64).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 22:23:00 -08:00
Vadim Pasternak 6a986993e4 mlxsw: i2c: Extend initialization by querying resources data
Extend initialization flow by query requests for chip resources data in
order to obtain chip's specific capabilities, like the number of ports.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 22:23:00 -08:00
Vadim Pasternak 95b75cbd1b mlxsw: i2c: Extend input parameters list of command API
Extend input parameters list of command API in mlxsw_i2c_cmd() in order
to support initialization commands. Up until now, only access commands
were supported by I2C driver.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 22:23:00 -08:00
Vadim Pasternak f43d9d9b4e mlxsw: i2c: Modify input parameter name in initialization API
Change input parameter name "resource" to "res" in mlxsw_i2c_init() in
order to align it with mlxsw_pci_init().

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 22:23:00 -08:00
Vadim Pasternak 27758c8016 mlxsw: i2c: Fix comment misspelling
Fix comment for mlxsw_i2c_write_cmd().

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 22:23:00 -08:00
Vadim Pasternak e5ba7803ba mlxsw: core: Move resource query API to common location
Move mlxsw_pci_resources_query() to a common location to allow reuse by
the different drivers and over all the supported physical buses. Rename
it to mlxsw_core_resources_query().

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 22:23:00 -08:00
Vadim Pasternak c100e47caa mlxsw: minimal: Add ethtool support
The minimal driver is chip independent and uses I2C bus for chip access.
Its purpose is to support chassis management on systems equipped with
Mellanox switch ASICs. For example from BMC (Board Management
Controller) device.

Expose a dummy netdev for each front panel port and implement basic
ethtool operations to obtain QSFP/SFP module info through ethtool.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 22:23:00 -08:00
Vadim Pasternak 1ded391df0 mlxsw: minimal: Make structures and variables names shorter
Replace "mlxsw_minimal" by "mlxsw_m" in order to improve code
readability.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 22:23:00 -08:00
Vadim Pasternak 1b1c6c1a38 mlxsw: core: Move ethtool module callbacks to a common location
Move the implementation of ethtool module callbacks - .get_module_info()
and .get_module_eeprom() - to a common location to allow reuse by the
different mlxsw drivers.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 22:23:00 -08:00
David S. Miller a9836336dd Merge branch 'tls-Fix-issues-in-tls_device'
Boris Pismenny says:

====================
tls: Fix issues in tls_device

This series fixes issues encountered in tls_device code paths,
which were introduced recently.

Additionally, this series includes a fix for tls software only receive flow,
which causes corruption of payload received by user space applications.

This series was tested using the OpenSSL integration of KTLS -
https://github.com/mellan
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 22:10:16 -08:00
Boris Pismenny d069b780e3 tls: Fix tls_device receive
Currently, the receive function fails to handle records already
decrypted by the device due to the commit mentioned below.

This commit advances the TLS record sequence number and prepares the context
to handle the next record.

Fixes: fedf201e12 ("net: tls: Refactor control message handling on recv")
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 22:10:16 -08:00
Eran Ben Elisha 7754bd63ed tls: Fix mixing between async capable and async
Today, tls_sw_recvmsg is capable of using asynchronous mode to handle
application data TLS records. Moreover, it assumes that if the cipher
can be handled asynchronously, then all packets will be processed
asynchronously.

However, this assumption is not always true. Specifically, for AES-GCM
in TLS1.2, it causes data corruption, and breaks user applications.

This patch fixes this problem by separating the async capability from
the decryption operation result.

Fixes: c0ab4732d4 ("net/tls: Do not use async crypto for non-data records")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 22:10:16 -08:00
Boris Pismenny 7463d3a2db tls: Fix write space handling
TLS device cannot use the sw context. This patch returns the original
tls device write space handler and moves the sw/device specific portions
to the relevant files.

Also, we remove the write_space call for the tls_sw flow, because it
handles partial records in its delayed tx work handler.

Fixes: a42055e8d2 ("net/tls: Add support for async encryption of records for performance")
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 22:10:16 -08:00
Boris Pismenny 94850257cf tls: Fix tls_device handling of partial records
Cleanup the handling of partial records while fixing a bug where the
tls_push_pending_closed_record function is using the software tls
context instead of the hardware context.

The bug resulted in the following crash:
[   88.791229] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[   88.793271] #PF error: [normal kernel read fault]
[   88.794449] PGD 800000022a426067 P4D 800000022a426067 PUD 22a156067 PMD 0
[   88.795958] Oops: 0000 [#1] SMP PTI
[   88.796884] CPU: 2 PID: 4973 Comm: openssl Not tainted 5.0.0-rc4+ #3
[   88.798314] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[   88.800067] RIP: 0010:tls_tx_records+0xef/0x1d0 [tls]
[   88.801256] Code: 00 02 48 89 43 08 e8 a0 0b 96 d9 48 89 df e8 48 dd
4d d9 4c 89 f8 4d 8b bf 98 00 00 00 48 05 98 00 00 00 48 89 04 24 49 39
c7 <49> 8b 1f 4d 89 fd 0f 84 af 00 00 00 41 8b 47 10 85 c0 0f 85 8d 00
[   88.805179] RSP: 0018:ffffbd888186fca8 EFLAGS: 00010213
[   88.806458] RAX: ffff9af1ed657c98 RBX: ffff9af1e88a1980 RCX: 0000000000000000
[   88.808050] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9af1e88a1980
[   88.809724] RBP: ffff9af1e88a1980 R08: 0000000000000017 R09: ffff9af1ebeeb700
[   88.811294] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[   88.812917] R13: ffff9af1e88a1980 R14: ffff9af1ec13f800 R15: 0000000000000000
[   88.814506] FS:  00007fcad2240740(0000) GS:ffff9af1f7880000(0000) knlGS:0000000000000000
[   88.816337] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   88.817717] CR2: 0000000000000000 CR3: 0000000228b3e000 CR4: 00000000001406e0
[   88.819328] Call Trace:
[   88.820123]  tls_push_data+0x628/0x6a0 [tls]
[   88.821283]  ? remove_wait_queue+0x20/0x60
[   88.822383]  ? n_tty_read+0x683/0x910
[   88.823363]  tls_device_sendmsg+0x53/0xa0 [tls]
[   88.824505]  sock_sendmsg+0x36/0x50
[   88.825492]  sock_write_iter+0x87/0x100
[   88.826521]  __vfs_write+0x127/0x1b0
[   88.827499]  vfs_write+0xad/0x1b0
[   88.828454]  ksys_write+0x52/0xc0
[   88.829378]  do_syscall_64+0x5b/0x180
[   88.830369]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   88.831603] RIP: 0033:0x7fcad1451680

[ 1248.470626] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[ 1248.472564] #PF error: [normal kernel read fault]
[ 1248.473790] PGD 0 P4D 0
[ 1248.474642] Oops: 0000 [#1] SMP PTI
[ 1248.475651] CPU: 3 PID: 7197 Comm: openssl Tainted: G           OE 5.0.0-rc4+ #3
[ 1248.477426] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 1248.479310] RIP: 0010:tls_tx_records+0x110/0x1f0 [tls]
[ 1248.480644] Code: 00 02 48 89 43 08 e8 4f cb 63 d7 48 89 df e8 f7 9c
1b d7 4c 89 f8 4d 8b bf 98 00 00 00 48 05 98 00 00 00 48 89 04 24 49 39
c7 <49> 8b 1f 4d 89 fd 0f 84 af 00 00 00 41 8b 47 10 85 c0 0f 85 8d 00
[ 1248.484825] RSP: 0018:ffffaa0a41543c08 EFLAGS: 00010213
[ 1248.486154] RAX: ffff955a2755dc98 RBX: ffff955a36031980 RCX: 0000000000000006
[ 1248.487855] RDX: 0000000000000000 RSI: 000000000000002b RDI: 0000000000000286
[ 1248.489524] RBP: ffff955a36031980 R08: 0000000000000000 R09: 00000000000002b1
[ 1248.491394] R10: 0000000000000003 R11: 00000000ad55ad55 R12: 0000000000000000
[ 1248.493162] R13: 0000000000000000 R14: ffff955a2abe6c00 R15: 0000000000000000
[ 1248.494923] FS:  0000000000000000(0000) GS:ffff955a378c0000(0000) knlGS:0000000000000000
[ 1248.496847] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1248.498357] CR2: 0000000000000000 CR3: 000000020c40e000 CR4: 00000000001406e0
[ 1248.500136] Call Trace:
[ 1248.500998]  ? tcp_check_oom+0xd0/0xd0
[ 1248.502106]  tls_sk_proto_close+0x127/0x1e0 [tls]
[ 1248.503411]  inet_release+0x3c/0x60
[ 1248.504530]  __sock_release+0x3d/0xb0
[ 1248.505611]  sock_close+0x11/0x20
[ 1248.506612]  __fput+0xb4/0x220
[ 1248.507559]  task_work_run+0x88/0xa0
[ 1248.508617]  do_exit+0x2cb/0xbc0
[ 1248.509597]  ? core_sys_select+0x17a/0x280
[ 1248.510740]  do_group_exit+0x39/0xb0
[ 1248.511789]  get_signal+0x1d0/0x630
[ 1248.512823]  do_signal+0x36/0x620
[ 1248.513822]  exit_to_usermode_loop+0x5c/0xc6
[ 1248.515003]  do_syscall_64+0x157/0x180
[ 1248.516094]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1248.517456] RIP: 0033:0x7fb398bd3f53
[ 1248.518537] Code: Bad RIP value.

Fixes: a42055e8d2 ("net/tls: Add support for async encryption of records for performance")
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 22:10:16 -08:00
David S. Miller 7d827379b0 Merge branch 'net-phy-clean-up-the-old-gen10g-functions'
Heiner Kallweit says:

====================
net: phy: clean up the old gen10g functions

The old gen10g_ functions are mainly stubs and have been superseded
by genphy_c45_ equivalents. So lets remove / hide the old functions
as far as possible.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 21:48:06 -08:00
Heiner Kallweit 7be3ad848f net: phy: remove gen10g_no_soft_reset
genphy_no_soft_reset and gen10g_no_soft_reset are both the same no-ops,
one is enough.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 21:47:57 -08:00
Heiner Kallweit d81210c25e net: phy: don't export gen10g_read_status
gen10g_read_status is deprecated, therefore stop exporting it.
We don't want to encourage anybody to use it.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 21:47:42 -08:00
Heiner Kallweit c5e91d3942 net: phy: remove gen10g_config_init
ETHTOOL_LINK_MODE_10000baseT_Full_BIT is set anyway in the supported
and advertising bitmap because it's part of PHY_10GBIT_FEATURES.
And all users of gen10g_config_init use PHY_10GBIT_FEATURES.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 21:47:42 -08:00
Heiner Kallweit a6d0aa97f4 net: phy: remove gen10g_suspend and gen10g_resume
phy_suspend() and phy_resume() are no-ops anyway if no callback is
defined. Therefore we don't need these stubs.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 21:47:42 -08:00
Heiner Kallweit d7bed825ba net: phy: use genphy_c45_aneg_done in genphy_aneg_done
Now that we have it let's use genphy_c45_aneg_done() in phy_aneg_done().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 21:47:42 -08:00
Joe Perches 6bfc1128d5 fsl/fman: Use vsprintf extension %pM
Make logging of an ethernet address more consistent with
the rest of the kernel.

Miscellanea:

The %02hx use also did not quite match the u8 definition
of addr though that did not actually matter given normal
integer promotion rules.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 21:10:06 -08:00
Francesco Ruggeri 9036b2fe09 net: ipv6: add socket option IPV6_ROUTER_ALERT_ISOLATE
By default IPv6 socket with IPV6_ROUTER_ALERT socket option set will
receive all IPv6 RA packets from all namespaces.
IPV6_ROUTER_ALERT_ISOLATE socket option restricts packets received by
the socket to be only from the socket's namespace.

Signed-off-by: Maxim Martynov <maxim@arista.com>
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 21:05:10 -08:00
Ben Dooks 46d841105d net: fixup address-space warnings in compat_mc_{get,set}sockopt()
Add __user attributes in some of the casts in this function to avoid
the following sparse warnings:

net/compat.c:592:57: warning: cast removes address space of expression
net/compat.c:592:57: warning: incorrect type in initializer (different address spaces)
net/compat.c:592:57:    expected struct compat_group_req [noderef] <asn:1>*gr32
net/compat.c:592:57:    got void *<noident>
net/compat.c:613:65: warning: cast removes address space of expression
net/compat.c:613:65: warning: incorrect type in initializer (different address spaces)
net/compat.c:613:65:    expected struct compat_group_source_req [noderef] <asn:1>*gsr32
net/compat.c:613:65:    got void *<noident>
net/compat.c:634:60: warning: cast removes address space of expression
net/compat.c:634:60: warning: incorrect type in initializer (different address spaces)
net/compat.c:634:60:    expected struct compat_group_filter [noderef] <asn:1>*gf32
net/compat.c:634:60:    got void *<noident>
net/compat.c:672:52: warning: cast removes address space of expression
net/compat.c:672:52: warning: incorrect type in initializer (different address spaces)
net/compat.c:672:52:    expected struct compat_group_filter [noderef] <asn:1>*gf32
net/compat.c:672:52:    got void *<noident>

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 20:58:25 -08:00
Florian Fainelli d6af21a4fb net: dsa: Use prepare/commit phase in dsa_slave_vlan_rx_add_vid()
We were skipping the prepare phase which causes some problems with at
least a couple of drivers:

- mv88e6xxx chooses to skip programming VID = 0 with -EOPNOTSUPP in
  the prepare phase, but we would still try to force this VID since we
  would only call the commit phase and so we would get the driver to
  return -EINVAL instead

- qca8k does not currently have a port_vlan_add() callback implemented,
  yet we would try to call that unconditionally leading to a NPD

Fix both issues by conforming to the current model doing a
prepare/commit phase, this makes us consistent throughout the code and
assumptions.

Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
Reported-by: Michal Vokáč <michal.vokac@ysoft.com>
Fixes: 061f6a505a ("net: dsa: Add ndo_vlan_rx_{add, kill}_vid implementation")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 20:45:52 -08:00
David S. Miller a5f1512d0b Merge branch 'dpaa2-eth-add-XDP_REDIRECT-support'
Ioana Ciornei says:

====================
dpaa2-eth: add XDP_REDIRECT support

The first patch adds different software annotation types for Tx frames
depending on frame type while the second one actually adds support for basic
XDP_REDIRECT.

Changes in v2:
  - add missing xdp_do_flush_map() call
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 20:41:18 -08:00
Ioana Radulescu d678be1dc1 dpaa2-eth: add XDP_REDIRECT support
Implement support for the XDP_REDIRECT action.

The redirected frame is transmitted and confirmed on the regular Tx/Tx
conf queues. Frame is marked with the "XDP" type in the software
annotation, since it requires special treatment.

We don't have good hardware support for TX batching, so the
XDP_XMIT_FLUSH flag doesn't make a difference for now; ndo_xdp_xmit
performs the actual Tx operation on the spot.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 20:41:18 -08:00
Ioana Radulescu e3fdf6ba09 dpaa2-eth: Add software annotation types
We write different metadata information in the software annotation
area of Tx frames, depending on frame type. Make this more explicit
by introducing a type field and separate structures for single buffer
and scatter-gather frames.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 20:41:18 -08:00
David S. Miller 3cec12ce5a Merge branch 'sched-Patches-from-out-of-tree-version-of-sch_cake'
Toke Høiland-Jørgensen says:

====================
sched: Patches from out-of-tree version of sch_cake

This series includes a couple of patches with updates from the out-of-tree
version of sch_cake. The first one is a fix to the fairness scheduling when
dual-mode fairness is enabled. The second patch is an additional feature flag
that allows using fwmark as a tin selector, as a convenience for people who want
to customise tin selection. The third patch is just a cleanup to the tin
selection logic.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 20:14:28 -08:00
Toke Høiland-Jørgensen 4976e3c683 sch_cake: Simplify logic in cake_select_tin()
With more modes added the logic in cake_select_tin() was getting a bit
hairy, and it turns out we can actually simplify it quite a bit. This also
allows us to get rid of one of the two diffserv parsing functions, which
has the added benefit that already-zeroed DSCP fields won't get re-written.

Suggested-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 20:14:28 -08:00
Kevin Darbyshire-Bryant 0b5c7efdfc sch_cake: Permit use of connmarks as tin classifiers
Add flag 'FWMARK' to enable use of firewall connmarks as tin selector.
The connmark (skbuff->mark) needs to be in the range 1->tin_cnt ie.
for diffserv3 the mark needs to be 1->3.

Background

Typically CAKE uses DSCP as the basis for tin selection.  DSCP values
are relatively easily changed as part of the egress path, usually with
iptables & the mangle table, ingress is more challenging.  CAKE is often
used on the WAN interface of a residential gateway where passthrough of
DSCP from the ISP is either missing or set to unhelpful values thus use
of ingress DSCP values for tin selection isn't helpful in that
environment.

An approach to solving the ingress tin selection problem is to use
CAKE's understanding of tc filters.  Naive tc filters could match on
source/destination port numbers and force tin selection that way, but
multiple filters don't scale particularly well as each filter must be
traversed whether it matches or not. e.g. a simple example to map 3
firewall marks to tins:

MAJOR=$( tc qdisc show dev $DEV | head -1 | awk '{print $3}' )
tc filter add dev $DEV parent $MAJOR protocol all handle 0x01 fw action skbedit priority ${MAJOR}1
tc filter add dev $DEV parent $MAJOR protocol all handle 0x02 fw action skbedit priority ${MAJOR}2
tc filter add dev $DEV parent $MAJOR protocol all handle 0x03 fw action skbedit priority ${MAJOR}3

Another option is to use eBPF cls_act with tc filters e.g.

MAJOR=$( tc qdisc show dev $DEV | head -1 | awk '{print $3}' )
tc filter add dev $DEV parent $MAJOR bpf da obj my-bpf-fwmark-to-class.o

This has the disadvantages of a) needing someone to write & maintain
the bpf program, b) a bpf toolchain to compile it and c) needing to
hardcode the major number in the bpf program so it matches the cake
instance (or forcing the cake instance to a particular major number)
since the major number cannot be passed to the bpf program via tc
command line.

As already hinted at by the previous examples, it would be helpful
to associate tins with something that survives the Internet path and
ideally allows tin selection on both egress and ingress.  Netfilter's
conntrack permits setting an identifying mark on a connection which
can also be restored to an ingress packet with tc action connmark e.g.

tc filter add dev eth0 parent ffff: protocol all prio 10 u32 \
	match u32 0 0 flowid 1:1 action connmark action mirred egress redirect dev ifb1

Since tc's connmark action has restored any connmark into skb->mark,
any of the previous solutions are based upon it and in one form or
another copy that mark to the skb->priority field where again CAKE
picks this up.

This change cuts out at least one of the (less intuitive &
non-scalable) middlemen and permit direct access to skb->mark.

Signed-off-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 20:14:28 -08:00
George Amanakis 7126399299 sch_cake: Make the dual modes fairer
CAKE host fairness does not work well with TCP flows in dual-srchost and
dual-dsthost setup. The reason is that ACKs generated by TCP flows are
classified as sparse flows, and affect flow isolation from other hosts. Fix
this by calculating host_load based only on the bulk flows a host
generates. In a hash collision the host_bulk_flow_count values must be
decremented on the old hosts and incremented on the new ones *if* the queue
is in the bulk set.

Reported-by: Pete Heist <peteheist@gmail.com>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 20:14:28 -08:00
David S. Miller c21e18a550 Merge branch 'Macb-power-management-support-for-ZynqMP'
Harini Katakam says:

====================
Macb power management support for ZynqMP

This series adds support for macb suspend/resume with system power down.
In relation to the above, this series also updates mdio_read/write
function for PM and adds tsu clock management.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 13:51:37 -08:00
Harini Katakam de991c58b3 net: macb: Add support for suspend/resume with full power down
When macb device is suspended and system is powered down, the clocks
are removed and hence macb should be closed gracefully and restored
upon resume. This patch does the same by switching off the net device,
suspending phy and performing necessary cleanup of interrupts and BDs.
Upon resume, all these are reinitialized again.

Reset of macb device is done only when GEM is not a wake device.
Even when gem is a wake device, tx queues can be stopped and ptp device
can be closed (tsu clock will be disabled in pm_runtime_suspend) as
wake event detection has no dependency on this.

Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 13:51:37 -08:00
Harini Katakam d54f89af6c net: macb: Add pm runtime support
Add runtime pm functions and move clock handling there.
Add runtime PM calls to mdio functions to allow for active mdio bus.

Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com>
Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 13:51:37 -08:00
Harini Katakam f5473d1d44 net: macb: Support clock management for tsu_clk
TSU clock needs to be enabled/disabled as per support in devicetree
and it should also be controlled during suspend/resume (WOL has no
dependency on this clock).

Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 13:51:37 -08:00
Harini Katakam 8beb79b7ae net: macb: Check MDIO state before read/write and use timeouts
Replace the while loop in MDIO read/write functions with a timeout.
In addition, add a check for MDIO bus busy before initiating a new
operation as well to make sure there is no ongoing MDIO operation.

Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com>
Signed-off-by: Sai Pavan Boddu <sai.pavan.boddu@xilinx.com>
Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 13:51:37 -08:00
David S. Miller 41bc0ddb80 Merge branch 'net-dsa-microchip-add-KSZ9893-switch-support'
Tristram Ha says:

====================
net: dsa: microchip: add KSZ9893 switch support

This series of patches is to modify the KSZ9477 DSA driver to support
running KSZ9893 switch.

The KSZ9893 switch is similar to KSZ9477 except the ingress tail tag has
1 byte instead of 2 bytes.  The XMII register that governs the MAC
communication also has different register definitions.

v1
- Put KSZ9893 tagging in separate patch
- Remove other switch support
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 13:48:49 -08:00
Tristram Ha 8c29bebb1f net: dsa: microchip: add KSZ9893 switch support
Add KSZ9893 switch support in KSZ9477 driver.  This switch is similar to
KSZ9477 except the ingress tail tag has 1 byte instead of 2 bytes, so
KSZ9893 tagging will be used.

The XMII register that governs how the host port communicates with the
MAC also has different register definitions.

Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 13:48:49 -08:00
Tristram Ha 88b573af91 net: dsa: add KSZ9893 switch tagging support
KSZ9893 switch is similar to KSZ9477 switch except the ingress tail tag
has 1 byte instead of 2 bytes.  The size of the portmap is smaller and
so the override and lookup bits are also moved.

Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 13:48:49 -08:00
Tristram Ha a1c0ed24fe dt-bindings: net: dsa: document additional Microchip KSZ9477 family switches
Document additional Microchip KSZ9477 family switches.

Show how KSZ8565 switch should be configured as the host port is port 7
instead of port 5.

Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-03 13:48:49 -08:00
David S. Miller d5fa9c55e5 mlx5-updates-2019-03-01
This series adds multipath offload support and contains some small updates
 to mlx5 driver.
 
 Multipath offload support from Roi Dayan:
 
 We are going to track SW multipath route and related nexthops and reflect
 that as port affinity to the HW.
 
 1) Some patches are preparation.
 2) add the multipath mode and fib events handling.
 3) add support to handle offload failure for net error, i.e.
 port down.
 4) Small updates to match the behavior of multipath
 
 Two small updates from Eran Ben Elisha,
 5) Make a function static
 6) Update PCIe supported devices list.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJceZBCAAoJEEg/ir3gV/o+sC0H/RWg+QKByvv0L3gqouKRvQq6
 6IL8dacbYnGFiwhXiB/087oPn4g3rOImfvnyrH+d6/clkXGPqrgSCTLhISUc7KyP
 Ig837K0fbn9LdV7a4t6OkTMiH9XUWh/Q88LpMLn0abPHIvE+blm7plbHV1x+D6pA
 +O4QM7qHcDVUYh7lV4WqQJg/caEjNML6JHDouZ0nnqqfWM9C9Jk05sp3Nj80WdWt
 0vruJRlHNTiiKcEcYCoV6z8ZUuN7OhwmyxuPiAGkK5n3Jcpjp9EQ9LfCT+kDuNKO
 SLTk1JMtsv4ZCrB6geC2QIAGUu2lGcLPUUPohgp1nY/hEbK1Vg24sPFhEdk4/r4=
 =JM/g
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2019-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2019-03-01

This series adds multipath offload support and contains some small updates
to mlx5 driver.

Multipath offload support from Roi Dayan:

We are going to track SW multipath route and related nexthops and reflect
that as port affinity to the HW.

1) Some patches are preparation.
2) add the multipath mode and fib events handling.
3) add support to handle offload failure for net error, i.e.
port down.
4) Small updates to match the behavior of multipath

Two small updates from Eran Ben Elisha,
5) Make a function static
6) Update PCIe supported devices list.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-02 14:04:20 -08:00
David S. Miller 4e7df119d9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for net-next:

1) Add .release_ops to properly unroll .select_ops, use it from nft_compat.
   After this change, we can remove list of extensions too to simplify this
   codebase.

2) Update amanda conntrack helper to support v3.4, from Florian Tham.

3) Get rid of the obsolete BUGPRINT macro in ebtables, from
   Florian Westphal.

4) Merge IPv4 and IPv6 masquerading infrastructure into one single module.
   From Florian Westphal.

5) Patchset to remove nf_nat_l3proto structure to get rid of
   indirections, from Florian Westphal.

6) Skip unnecessary conntrack timeout updates in case the value is
   still the same, also from Florian Westphal.

7) Remove unnecessary 'fall through' comments in empty switch cases,
   from Li RongQing.

8) Fix lookup to fixed size hashtable sets on big endian with 32-bit keys.

9) Incorrect logic to deactivate path of fixed size hashtable sets,
   element was being tested to self.

10) Remove nft_hash_key(), the bitmap set is always selected for 16-bit
    keys.

11) Use boolean whenever possible in IPVS codebase, from Andrea Claudi.

12) Enter close state in conntrack if RST matches exact sequence number,
    from Florian Westphal.

13) Initialize dst_cache in tunnel extension, from wenxu.

14) Pass protocol as u16 to xt_check_match and xt_check_target, from
    Li RongQing.

15) SCTP header is granted to be in a linear area from IPVS NAT handler,
    from Xin Long.

16) Don't steal packets coming from slave VRF device from the
    ip_sabotage_in() path, from David Ahern.

17) Fix unsafe update of basechain stats, from Li RongQing.

18) Make sure CONNTRACK_LOCKS is power of 2 to let compiler optimize
    modulo operation as bitwise AND, from Li RongQing.

19) Use device_attribute instead of internal definition in the IDLETIMER
    target, from Sami Tolvanen.

20) Merge redir, masq and IPv4/IPv6 NAT chain types, from Florian Westphal.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-02 14:01:04 -08:00
David S. Miller 2369afb669 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:

====================
pull request: bluetooth-next 2019-03-02

Here's one more bluetooth-next pull request for the 5.1 kernel:

 - Added support for MediaTek MT7663U and MT7668U UART devices
 - Cleanups & fixes to the hci_qca driver
 - Fixed wakeup pin behavior for QCA6174A controller

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-02 13:55:36 -08:00
David S. Miller 9eb359140c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-03-02 12:54:35 -08:00
Alexei Starovoitov ea5bade929 Merge branch 'split-test_progs'
Stanislav Fomichev says:

====================
Recently we had linux-next bpf/bpf-next conflict when we added new
functionality to the test_progs.c at the same location. Let's split
test_progs.c the same way we recently split test_verifier.c.

I follow the same patten we did in commit 2dfb40121e ("selftests: bpf:
prepare for break up of verifier tests") for verifier: create
scaffolding to support dedicated files and slowly move the tests into
separate files.

The first patch adds scaffolding, subsequent patches move progs into
separate files.

In theory, many of the standalone tests can be migrated to this new
framework as well. They get the benefit of common CHECK macro and
bpf_find_map function which a lot of standalone tests need to redefine.

v3 changes:
* respin on top of commit ebace0e981 ("selftests/bpf: use
  __bpf_constant_htons in test_prog.c for flow dissector")
* put bpf_rlimit.h into test_progs.c instead of test_progs.h

v2 changes:
* added cover letter, added more description about file structure
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-03-02 11:10:41 -08:00
Stanislav Fomichev 886225bb08 selftests: bpf: break up test_progs - misc
Move the rest of prog tests into separate files.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-03-02 11:10:40 -08:00