Commit Graph

755095 Commits

Author SHA1 Message Date
Amit Pundir 7dc5fe0814 Bluetooth: hci_qca: Avoid missing rampatch failure with userspace fw loader
AOSP uses a userspace firmware loader to load firmware, which
returns -EAGAIN if qca/rampatch_00440302.bin is not found.
Since there is no rampatch for the dragonboard820c QCA controller
revision, just make it work as is.

CC: Loic Poulain <loic.poulain@linaro.org>
CC: Nicolas Dechesne <nicolas.dechesne@linaro.org>
CC: Marcel Holtmann <marcel@holtmann.org>
CC: Johan Hedberg <johan.hedberg@gmail.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-05-18 06:37:50 +02:00
Loic Poulain 61a1ecfc80 Bluetooth: btqcomsmd: Fix rx/tx stats
HCI RX/TX byte counters were only incremented when sending ACL packets.
To reflect the real HCI traffic, we need to increment these counters on
HCI events and HCI commands as well.

Increment error counter on rpmsg errors.

Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-05-18 06:37:50 +02:00
Hans de Goede e6ba8208a4 Bluetooth: hci_bcm: Remove irq-active-low DMI quirk for the Thinkpad 8
Interrupts specified through an "Interrupt" ACPI resource (versus through
a "GpioInt" resource) are now always assumed to be active low.

When this change was originally made the Thinkpad 8 quirk was kept around
because it was uncertain if the Thinkpad 8 uses an "Interrupt" or a
"GpioInt" resource.

Bug https://bugzilla.kernel.org/show_bug.cgi?id=196701 has a DSDT for the
Thinkpad 8 attached and it uses an "Interrupt" resource, so the quirk is
not necessary and the quirk, as well as the irq-active-low quirk handling
code can be removed.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-05-18 06:37:50 +02:00
Hans de Goede 2b05393b06 Bluetooth: hci_bcm: Add broken-irq dmi blacklist and add Meegopad T08 to it
The Meegopad T08 hdmi-stick (think Intel computestick) has a brcm43430
wifi/bt combo chip. The BCM2E90 ACPI device describing the BT part does
contain a valid ActiveLow GpioInt entry, but the GPIO it points to never
goes low, so either the IRQ pin is not connected, or the ACPI resource-
table points to the wrong GPIO.

Either way, things will not work if we try to use the specified IRQ. This
commit adds a DMI-based broken-irq blacklist and disables use of the IRQ
and thus also runtime-pm for devices on this list.

This blacklist starts with the Meegopad T08, fixing bluetooth not
working on this hdmi-stick. Since this is not a battery powered device
the loss of runtime-pm is not really an issue.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-05-18 06:37:50 +02:00
David S. Miller 6caf9fb3bd Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2018-05-18

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix two bugs in sockmap, a use after free in sockmap's error path
   from sock_map_ctx_update_elem() where we mistakenly drop a reference
   we didn't take prior to that, and in the same function fix a race
   in bpf_prog_inc_not_zero() where we didn't use the progs from prior
   READ_ONCE(), from John.

2) Reject program expansions once we figure out that their jump target
   which crosses patchlet boundaries could otherwise get truncated in
   insn->off space, from Daniel.

3) Check the return value of fopen() in BPF selftest's test_verifier
   where we determine whether unpriv BPF is disabled, and iff we do
   fail there then just assume it is disabled. This fixes a segfault
   when used with older kernels, from Jesper.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 23:33:52 -04:00
Dave Airlie 1827cad96d Merge tag 'drm-intel-fixes-2018-05-17' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Userptr IOCTL zero size check (Matt)
- Two hardware quirk fixes (Michel & Chris)

* tag 'drm-intel-fixes-2018-05-17' of git://anongit.freedesktop.org/drm/drm-intel:
  drm/i915/gen9: Add WaClearHIZ_WM_CHICKEN3 for bxt and glk
  drm/i915/execlists: Use rmb() to order CSB reads
  drm/i915/userptr: reject zero user_size
2018-05-18 12:01:49 +10:00
Or Gerlitz a228060a7c net/mlx5e: Add HW vport counters to representor ethtool stats
Currently the representor only reports the SW (slow-path) traffic
counters.

Add packet/bytes reporting of the HW counters, which account for the
total amount of traffic that was handled by the vport, both slow and
fast (offloaded) paths. The newly exposed counters are named
vport_rx/tx_packets/bytes.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Adi Nissim <adin@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 17:48:54 -07:00
Or Gerlitz 8f8ae8953f net/mlx5e: Ignore attempts to offload multiple times a TC flow
For VF->VF and uplink->VF rules, the TC core (cls_api) attempts
to offload the same flow multiple times into the driver, b/c we
registered to the egdev callback.

Use the flow cookie to ignore attempts to add such flows. We can't
reject them (return an error), b/c this will fail the offload attempt,
so we ignore them. We identify wrong stat/del calls using the flow
ingress/egress flags; here we do return an error to the core.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 17:48:54 -07:00
Or Gerlitz 655dc3d2b9 net/mlx5e: Use shared table for offloaded TC eswitch flows
Currently, each representor netdev uses its own hash table to keep
the mapping from TC flow (f->cookie) to the driver offloaded instance.
The table is the one which originally was added for offloading TC NIC
(not eswitch) rules.

This scheme breaks when the core TC code calls us to add the same flow
twice, (e.g under egdev use case) since we don't spot that and offload
a 2nd flow into the HW with the wrong source vport.

As a pre-step to solve that, we move to use a single table which keeps
all offloaded TC eswitch flows. The table is located at the eswitch
uplink representor object.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 17:48:54 -07:00
Or Gerlitz 05866c8236 net/mlx5e: Prepare for shared table to keep TC eswitch flows
This is a refactoring step to be able to store the hash table which
keeps track of offloaded TC flows in a different location for NIC
vs e-switch rules.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 17:48:54 -07:00
Or Gerlitz 60bd4af814 net/mlx5e: Add ingress/egress indication for offloaded TC flows
When an e-switch TC rule is offloaded through the egdev (egress
device) mechanism, we treat this as egress; all other cases (NIC
and e-switch) are considered ingress.

This is a preparation step that will allow us to identify "wrong"
stat/del offload calls made by the TC core on egdev based flows and
ignore them.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 17:48:54 -07:00
Rabie Loulou b1d90e6bbd net/mlx5e: Offload TC eswitch rules for VFs belonging to different PFs
When the merged eswitch capability is supported, allow offloading rules
between VFs which belong to different PFs (and hence have different
eswitch affinity).

Signed-off-by: Rabie Loulou <rabiel@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Shahar Klein <shahark@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 17:48:54 -07:00
Saeed Mahameed 260ab7042e mlx5-updates-2018-05-17
Merge tag 'mlx5-updates-2018-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

mlx5-updates-2018-05-17

mlx5 core driver updates for both net-next and rdma-next branches.

From Christophe JAILLET, first three patches to use kvfree where needed.

From: Or Gerlitz <ogerlitz@mellanox.com>

Next six patches from Roi and Co add support for merged
sriov e-switch which comes to serve cases where both PFs, VFs set
on them and both uplinks are to be used in single v-switch SW model.
When merged e-switch is supported, the per-port e-switch is logically
merged into one e-switch that spans both physical ports and all the VFs.

This model allows offloading TC eswitch rules between VFs belonging
to different PFs (and hence have different eswitch affinity); it also
sets some of the foundations needed for uplink LAG support.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 17:47:55 -07:00
Daniel Borkmann 050fad7c45 bpf: fix truncated jump targets on heavy expansions
Recently during testing, I ran into the following panic:

  [  207.892422] Internal error: Accessing user space memory outside uaccess.h routines: 96000004 [#1] SMP
  [  207.901637] Modules linked in: binfmt_misc [...]
  [  207.966530] CPU: 45 PID: 2256 Comm: test_verifier Tainted: G        W         4.17.0-rc3+ #7
  [  207.974956] Hardware name: FOXCONN R2-1221R-A4/C2U4N_MB, BIOS G31FB18A 03/31/2017
  [  207.982428] pstate: 60400005 (nZCv daif +PAN -UAO)
  [  207.987214] pc : bpf_skb_load_helper_8_no_cache+0x34/0xc0
  [  207.992603] lr : 0xffff000000bdb754
  [  207.996080] sp : ffff000013703ca0
  [  207.999384] x29: ffff000013703ca0 x28: 0000000000000001
  [  208.004688] x27: 0000000000000001 x26: 0000000000000000
  [  208.009992] x25: ffff000013703ce0 x24: ffff800fb4afcb00
  [  208.015295] x23: ffff00007d2f5038 x22: ffff00007d2f5000
  [  208.020599] x21: fffffffffeff2a6f x20: 000000000000000a
  [  208.025903] x19: ffff000009578000 x18: 0000000000000a03
  [  208.031206] x17: 0000000000000000 x16: 0000000000000000
  [  208.036510] x15: 0000ffff9de83000 x14: 0000000000000000
  [  208.041813] x13: 0000000000000000 x12: 0000000000000000
  [  208.047116] x11: 0000000000000001 x10: ffff0000089e7f18
  [  208.052419] x9 : fffffffffeff2a6f x8 : 0000000000000000
  [  208.057723] x7 : 000000000000000a x6 : 00280c6160000000
  [  208.063026] x5 : 0000000000000018 x4 : 0000000000007db6
  [  208.068329] x3 : 000000000008647a x2 : 19868179b1484500
  [  208.073632] x1 : 0000000000000000 x0 : ffff000009578c08
  [  208.078938] Process test_verifier (pid: 2256, stack limit = 0x0000000049ca7974)
  [  208.086235] Call trace:
  [  208.088672]  bpf_skb_load_helper_8_no_cache+0x34/0xc0
  [  208.093713]  0xffff000000bdb754
  [  208.096845]  bpf_test_run+0x78/0xf8
  [  208.100324]  bpf_prog_test_run_skb+0x148/0x230
  [  208.104758]  sys_bpf+0x314/0x1198
  [  208.108064]  el0_svc_naked+0x30/0x34
  [  208.111632] Code: 91302260 f9400001 f9001fa1 d2800001 (29500680)
  [  208.117717] ---[ end trace 263cb8a59b5bf29f ]---

The program itself which caused this had a long jump over the whole
instruction sequence where all of the inner instructions required
heavy expansions into multiple BPF instructions. Additionally, I also
had BPF hardening enabled which requires once more rewrites of all
constant values in order to blind them. Each time we rewrite insns,
bpf_adj_branches() would need to potentially adjust branch targets
which cross the patchlet boundary to accommodate for the additional
delta. Eventually that led to the case where the target offset could
not fit into insn->off's upper 0x7fff limit anymore where then offset
wraps around becoming negative (in s16 universe), or vice versa
depending on the jump direction.

Therefore it becomes necessary to detect and reject any such occasions
in a generic way for native eBPF and cBPF to eBPF migrations. For
the latter we can simply check bounds in the bpf_convert_filter()'s
BPF_EMIT_JMP helper macro and bail out once we surpass limits. The
bpf_patch_insn_single() for native eBPF (and cBPF to eBPF in case
of subsequent hardening) is a bit more complex in that we need to
detect such truncations before hitting the bpf_prog_realloc(). Thus
the latter is split into an extra pass to probe problematic offsets
on the original program in order to fail early. With that in place
and carefully tested I no longer hit the panic and the rewrites are
rejected properly. The above example panic I've seen on bpf-next,
though the issue itself is generic in that a guard against this issue
in bpf seems more appropriate in this case.
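
The wrap-around described above can be illustrated outside the kernel. The following is a minimal sketch, not the kernel's implementation: the helper names are hypothetical, and it only assumes that the jump offset is a signed 16-bit field, as the message states for insn->off.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not a kernel function): returns true if a jump
 * offset moved by `delta` instructions still fits a signed 16-bit field. */
bool jump_offset_fits(int16_t off, int32_t delta)
{
    int64_t adjusted = (int64_t)off + delta;   /* widen before adding */

    return adjusted >= INT16_MIN && adjusted <= INT16_MAX;
}

/* What an unchecked adjustment stores: the sum is cut back to 16 bits.
 * The final cast wraps on common compilers (implementation-defined in C). */
int16_t jump_offset_truncated(int16_t off, int32_t delta)
{
    return (int16_t)(uint16_t)((int64_t)off + delta);
}
```

With an offset already at 0x7fff, growing the program by a single instruction makes jump_offset_fits() return false, while the unchecked variant yields a negative offset, which is the kind of misdirected jump the rejection is meant to prevent.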

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-05-17 16:05:35 -07:00
Linus Torvalds 3acf4e3952 k10temp fixes
Fix race condition when accessing System Management Network registers
 Fix reading critical temperatures on F15h M60h and M70h

Merge tag 'hwmon-for-linus-v4.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:
 "Two k10temp fixes:

   - fix race condition when accessing System Management Network
     registers

   - fix reading critical temperatures on F15h M60h and M70h

  Also add PCI ID's for the AMD Raven Ridge root bridge"

* tag 'hwmon-for-linus-v4.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (k10temp) Use API function to access System Management Network
  x86/amd_nb: Add support for Raven Ridge CPUs
  hwmon: (k10temp) Fix reading critical temperature register
2018-05-17 15:58:12 -07:00
John Fastabend 9617456054 bpf: parse and verdict prog attach may race with bpf map update
In the sockmap design BPF programs (SK_SKB_STREAM_PARSER,
SK_SKB_STREAM_VERDICT and SK_MSG_VERDICT) are attached to the sockmap
map type and when a sock is added to the map the programs are used by
the socket. However, sockmap updates from both userspace and BPF
programs can happen concurrently with the attach and detach of these
programs.

To resolve this we use the bpf_prog_inc_not_zero and a READ_ONCE()
primitive to ensure the program pointer is not refetched and
possibly NULL'd before the refcnt increment. This happens inside
a RCU critical section so although the pointer reference in the map
object may be NULL (by a concurrent detach operation) the reference
from READ_ONCE will not be free'd until after grace period. This
ensures the object returned by READ_ONCE() is valid through the
RCU critical section and safe to use as long as we "know" it may
be free'd shortly.

Daniel spotted a case in the sock update API where instead of using
the READ_ONCE() program reference we used the pointer from the
original map, stab->bpf_{verdict|parse|txmsg}. The problem with this
is the logic checks the object returned from the READ_ONCE() is not
NULL and then tries to reference the object again but using the
above map pointer, which may have already been NULL'd by a parallel
detach operation. If this happened bpf_prog_inc_not_zero could
dereference a NULL pointer.

Fix this by using the variable returned by READ_ONCE() that is checked
for NULL.
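
The pattern can be sketched in userspace C. This is an analogy under stated assumptions, not the sockmap code: C11 atomics stand in for READ_ONCE(), the struct and function names are hypothetical, and object lifetime (handled by RCU in the kernel) is out of scope.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for a BPF program and the map slot holding it. */
struct prog {
    atomic_int refcnt;
};

struct map {
    _Atomic(struct prog *) verdict;   /* a concurrent detach may store NULL */
};

/* Load the slot exactly once, check that snapshot, and use only the
 * snapshot afterwards. Re-reading m->verdict after the NULL check is
 * the bug shape described above. */
struct prog *take_prog(struct map *m)
{
    struct prog *p = atomic_load(&m->verdict);

    if (!p)
        return NULL;
    atomic_fetch_add(&p->refcnt, 1);   /* operates on the checked pointer */
    return p;
}
```

The point of the sketch is that the NULL check and the later use refer to the same local variable, so a detach that clears the slot in between cannot turn the second access into a NULL dereference.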

Fixes: 2f857d0460 ("bpf: sockmap, remove STRPARSER map_flags and add multi-map support")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-18 00:27:37 +02:00
John Fastabend a593f70831 bpf: sockmap update rollback on error can incorrectly dec prog refcnt
If the user were to only attach one of the parse or verdict programs
then it is possible a subsequent sockmap update could incorrectly
decrement the refcnt on the program. This happens because in the
rollback logic, after an error, we have to decrement the program
reference count when it's been incremented. However, we only increment
the program reference count if the user has both a verdict and a
parse program. The reason for this is because, at least at the
moment, both are required for any one to be meaningful. The problem
fixed here is in the rollback path we decrement the program refcnt
even if only one exists. But we never incremented the refcnt in
the first place, creating an imbalance.

This patch fixes the error path to handle this case.
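
A balanced error path can be sketched as follows. This is an illustration only: the types, the function name and the `fail` flag are hypothetical, and the sketch simply assumes, as the message says, that references are taken only when both programs are present.

```c
#include <stdbool.h>
#include <stddef.h>

struct prog {
    int refcnt;   /* hypothetical, simplified reference count */
};

/* Take references only when both programs exist, and on failure undo
 * exactly what was taken by reusing the same condition. `fail` simulates
 * an error later in the update. Returns 0 on success, -1 on failure. */
int update_with_rollback(struct prog *parse, struct prog *verdict, bool fail)
{
    bool took_refs = parse && verdict;

    if (took_refs) {
        parse->refcnt++;
        verdict->refcnt++;
    }

    if (fail) {
        if (took_refs) {          /* never drop a reference we did not take */
            parse->refcnt--;
            verdict->refcnt--;
        }
        return -1;
    }
    return 0;
}
```

Calling it with only a verdict program and a simulated failure leaves that program's count unchanged, which is the imbalance the fix is described as removing.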

Fixes: 2f857d0460 ("bpf: sockmap, remove STRPARSER map_flags and add multi-map support")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-18 00:27:37 +02:00
Shahar Klein 10ff5359f8 net/mlx5e: Explicitly set source e-switch in offloaded TC rules
Set a specific source e-switch when setting a rule that matches on the
ingress port.

Signed-off-by: Shahar Klein <shahark@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 14:17:35 -07:00
Shahar Klein 3e99df8772 net/mlx5: Add source e-switch owner
The source e-switch owner allows a vport on one e-switch port to be associated
with a rule defined on the second port e-switch.

The role of the source eswitch owner valid bit in the flow group is to
allow the firmware to fail driver attempts to wildcard the source eswitch
match field. If this bit is not set, the firmware ignores the source
eswitch owner field totally.

Signed-off-by: Shahar Klein <shahark@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 14:17:34 -07:00
Rabie Loulou 56e858df9f net/mlx5e: Explicitly set destination e-switch in FDB rules
Set a specific destination e-switch when setting a destination vport.

Signed-off-by: Rabie Loulou <rabiel@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Shahar Klein <shahark@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 14:17:34 -07:00
Shahar Klein b17f7fc10f net/mlx5: Add destination e-switch owner
The destination e-switch owner allows a rule in namespace of one e-switch
owner to point to a vport that is natively associated with another
e-switch owner.

Signed-off-by: Shahar Klein <shahark@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 14:17:34 -07:00
Shahar Klein 65360e5451 net/mlx5: Properly handle a vport destination when setting FTE
When creating FTE, properly distinguish between destination being vport
or tir. The previous code just worked accidentally b/c of both dest being
in the same offset within a union.
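
Why such code can work by accident is easy to show with a tagged union. The sketch below is hypothetical and does not reproduce the mlx5 structures: both members share the same storage, so reading the wrong one happens to return the right number, and dispatching on the type makes the intent explicit.

```c
#include <stdint.h>

enum dest_type { DEST_VPORT, DEST_TIR };   /* hypothetical names */

struct dest {
    enum dest_type type;
    union {
        uint32_t vport_num;   /* same offset as tir_num */
        uint32_t tir_num;
    } u;
};

/* Pick the member that matches the destination type instead of relying
 * on both members overlapping in memory. */
uint32_t dest_id(const struct dest *d)
{
    switch (d->type) {
    case DEST_VPORT:
        return d->u.vport_num;
    case DEST_TIR:
        return d->u.tir_num;
    }
    return 0;
}
```

If the two members ever stopped sharing an offset, or gained type-specific handling, code that ignored the type would silently break, whereas the explicit switch would keep working.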

Signed-off-by: Shahar Klein <shahark@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 14:17:34 -07:00
Roi Dayan a6d0456912 net/mlx5: Add merged e-switch cap
When merged e-switch is supported, the per-port e-switch is logically
merged into one e-switch that spans both physical ports and all the VFs.
Under merged eswitch, both the matching on source vport and setting
destination vport can have a 2nd attribute which is the vhca id of the
eswitch owner.

For example:
esw0: {match: <src vport=1 owner=0> action: fwd to <dst vport=7, owner=1>}
is a flow set on eswitch0 matching on source vport=1 from its own eswitch
and the action being fwd to dest vport=7 of eswitch1.
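
The example rule can be written down as a small data structure. This is a sketch with hypothetical type and field names, not the driver's representation; it only encodes the two attributes the message mentions, the vport number and the vhca id of the owning e-switch.

```c
#include <stdbool.h>
#include <stdint.h>

/* A vport qualified by the e-switch that owns it (hypothetical layout). */
struct vport_ref {
    uint16_t vport;
    uint16_t owner;   /* vhca id of the e-switch owner */
};

struct esw_rule {
    struct vport_ref src;   /* match side */
    struct vport_ref dst;   /* forward destination */
};

/* True when the rule forwards between vports of different e-switches,
 * which is what the merged e-switch capability is described as enabling. */
bool rule_crosses_eswitch(const struct esw_rule *r)
{
    return r->src.owner != r->dst.owner;
}
```

The rule from the message would be `{ .src = { 1, 0 }, .dst = { 7, 1 } }`, for which rule_crosses_eswitch() returns true.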

Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Shahar Klein <shahark@mellanox.com>
Reviewed-by: Or Gerlitz Klein <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17 14:17:34 -07:00
David S. Miller 538e2de104 Merge branch 'net-Allow-more-drivers-with-COMPILE_TEST'
Florian Fainelli says:

====================
net: Allow more drivers with COMPILE_TEST

This patch series includes more drivers to be build tested with COMPILE_TEST
enabled. This helps cover some of the issues I just ran into with missing
a driver *sigh*.

Changes in v3:

- drop the TI Keystone NETCP driver from the COMPILE_TEST additions

Changes in v2:

- allow FEC to build outside of CONFIG_ARM/ARM64 by defining a layout of
  registers, this is not meant to run, so this is not a real issue if we
  are not matching the correct register layout
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 17:11:07 -04:00
Florian Fainelli 3c0596f8be net: phy: Allow MDIO_MOXART and MDIO_SUN4I with COMPILE_TEST
Those drivers build just fine with COMPILE_TEST, so make that possible.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 17:11:06 -04:00
Florian Fainelli 78cc6e7ef9 net: ethernet: freescale: Allow FEC with COMPILE_TEST
The Freescale FEC driver builds fine with COMPILE_TEST, so make that
possible.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 17:11:06 -04:00
Florian Fainelli 2652113ff0 net: ethernet: ti: Allow most drivers with COMPILE_TEST
Most of the TI drivers build just fine with COMPILE_TEST, cpmac (AR7) is
the exception because it uses a header file from
arch/mips/include/asm/mach-ar7/ar7.h and keystone netcp which requires
help from drivers/soc/ti/ for queue management helpers.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 17:11:06 -04:00
David Ahern 33fa382324 vlan: Add extack messages for link create
Add informative messages for error paths related to adding a
VLAN to a device.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 17:08:55 -04:00
Manish Chopra 8a8633978b qede: Add build_skb() support.
This patch makes use of build_skb() throughout the driver's receive
data path [HW gro flow and non HW gro flow]. With this, the driver can
build an skb directly from the page segments which are already mapped
to the hardware instead of allocating a new SKB via netdev_alloc_skb()
and memcpy'ing the data, which is quite costly.

This really improves performance (keeping same or slight gain in rx
throughput) in terms of CPU utilization which is significantly reduced
[almost half] in non HW gro flow where for every incoming MTU sized
packet driver had to allocate skb, memcpy headers etc. Additionally
in that flow, it also gets rid of bunch of additional overheads
[eth_get_headlen() etc.] to split headers and data in the skb.

Tested with:
system: 2 sockets, 4 cores per socket, hyperthreading, 2x4x2=16 cores
iperf [server]: iperf -s
iperf [client]: iperf -c <server_ip> -t 500 -i 10 -P 32

HW GRO off - w/o build_skb(), throughput: 36.8 Gbits/sec

Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
Average:     all    0.59    0.00   32.93    0.00    0.00   43.07    0.00    0.00   23.42

HW GRO off - with build_skb(), throughput: 36.9 Gbits/sec

Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
Average:     all    0.70    0.00   31.70    0.00    0.00   25.68    0.00    0.00   41.92

HW GRO on - w/o build_skb(), throughput: 36.9 Gbits/sec

Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
Average:     all    0.86    0.00   24.14    0.00    0.00    6.59    0.00    0.00   68.41

HW GRO on - with build_skb(), throughput: 37.5 Gbits/sec

Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
Average:     all    0.87    0.00   23.75    0.00    0.00    6.19    0.00    0.00   69.19

Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 17:06:53 -04:00
Willem de Bruijn 113f99c335 net: test tailroom before appending to linear skb
Device features may change during transmission. In particular with
corking, a device may toggle scatter-gather in between allocating
and writing to an skb.

Do not unconditionally assume that !NETIF_F_SG at write time implies
that the same held at alloc time and thus the skb has sufficient
tailroom.
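
The check can be sketched with a plain buffer. This is a simplified model, not the networking code: the struct and function names are hypothetical, and the sketch assumes the append decision should depend on the tailroom actually available rather than on the device's current scatter-gather flag alone.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a linear buffer: `len` bytes used of `cap`. */
struct lin_buf {
    size_t len;
    size_t cap;
};

/* Append into the linear area only if the device currently lacks
 * scatter-gather AND the buffer really has room for `n` more bytes.
 * The flag may have changed since allocation, so it proves nothing
 * about how much tailroom was reserved. */
bool can_append_linear(const struct lin_buf *b, bool dev_has_sg_now, size_t n)
{
    size_t tailroom = b->cap - b->len;

    return !dev_has_sg_now && tailroom >= n;
}
```

A buffer allocated while scatter-gather was on may have little or no tailroom; if the device then turns scatter-gather off, the explicit tailroom test is what stops an overrun.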

This issue predates git history.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 17:05:01 -04:00
David S. Miller 56a9a9e737 Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
10GbE Intel Wired LAN Driver Updates 2018-05-17

This series contains updates to ixgbe, ixgbevf and ice drivers.

Cathy Zhou resolves sparse warnings by using the force attribute.

Mauro S M Rodrigues fixes a bug where IRQs were not freed if a PCI error
recovery system opts to remove the device which causes
ixgbe_io_error_detected() to return PCI_ERS_RESULT_DISCONNECT before
calling ixgbe_close_suspend() which results in IRQs not freed and
crashing when the remove handler calls pci_disable_device().  Resolved
this by calling ixgbe_close_suspend() before evaluating the PCI channel
state.

Pavel Tatashin releases the rtnl_lock during the call to
ixgbe_close_suspend() to allow scaling if device_shutdown() is
multi-threaded.

Emil modifies ixgbe to not validate the MAC address during a reset,
unless the MAC was set on the host so that the VF will get a new MAC
address every time it reloads.  Also updates ixgbevf to set
hw->mac.perm_addr in order to retain the custom MAC on a reset.

Anirudh updates the ice NVM read/erase/update AQ commands to align with
the latest specification.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 17:02:57 -04:00
David S. Miller 374edea4aa Merge branch 'ip6_gre-Fixes-in-headroom-handling'
Petr Machata says:

====================
net: ip6_gre: Fixes in headroom handling

This series mends some problems in headroom management in ip6_gre
module. The current code base has the following three closely-related
problems:

- ip6gretap tunnels neglect to ensure there's enough writable headroom
  before pushing GRE headers.

- ip6erspan does this, but assumes that dev->needed_headroom is primed.
  But that doesn't happen until ip6_tnl_xmit() is called later. Thus for
  the first packet, ip6erspan actually behaves like ip6gretap above.

- ip6erspan shares some of the code with ip6gretap, including
  calculations of needed header length. While there is custom
  ERSPAN-specific code for calculating the headroom, the computed
  values are overwritten by the ip6gretap code.

The first two issues lead to a kernel panic in situations where a packet
is mirrored from a veth device to the device in question. They are
fixed, respectively, in patches #1 and #2, which include the full panic
trace and a reproducer.

The rest of the patchset deals with the last issue. In patches #3 to #6,
several functions are split up into reusable parts. Finally in patch #7
these blocks are used to compose ERSPAN-specific callbacks where
necessary to fix the hlen calculation.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:50:20 -04:00
Petr Machata 2d665034f2 net: ip6_gre: Fix ip6erspan hlen calculation
Even though ip6erspan_tap_init() sets up hlen and tun_hlen according to
what ERSPAN needs, it goes ahead to call ip6gre_tnl_link_config() which
overwrites these settings with GRE-specific ones.

Similarly for changelink callbacks, which are handled by
ip6gre_changelink(), which calls ip6gre_tnl_change(), which in turn calls
ip6gre_tnl_link_config() as well.

The difference ends up being 12 vs. 20 bytes, and this is generally not
a problem, because a 12-byte request likely ends up allocating more and
the extra 8 bytes are thus available. However correct it is not.

So replace the newlink and changelink callbacks with ERSPAN-specific
ones, reusing the newly-introduced _common() functions.

Fixes: 5a963eb61b ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:50:06 -04:00
Petr Machata c8632fc30b net: ip6_gre: Split up ip6gre_changelink()
Extract from ip6gre_changelink() a reusable function
ip6gre_changelink_common(). This will allow introduction of
ERSPAN-specific _changelink() function with not a lot of code
duplication.

Fixes: 5a963eb61b ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:50:06 -04:00
Petr Machata 7fa38a7c85 net: ip6_gre: Split up ip6gre_newlink()
Extract from ip6gre_newlink() a reusable function
ip6gre_newlink_common(). The ip6gre_tnl_link_config() call needs to be
made customizable for ERSPAN, thus reorder it with calls to
ip6_tnl_change_mtu() and dev_hold(), and extract the whole tail to the
caller, ip6gre_newlink(). Thus enable an ERSPAN-specific _newlink()
function without a lot of duplication.

Fixes: 5a963eb61b ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:50:06 -04:00
Petr Machata a6465350ef net: ip6_gre: Split up ip6gre_tnl_change()
Split a reusable function ip6gre_tnl_copy_tnl_parm() from
ip6gre_tnl_change(). This will allow ERSPAN-specific code to
reuse the common parts while customizing the behavior for ERSPAN.

Fixes: 5a963eb61b ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:50:06 -04:00
Petr Machata a483373ead net: ip6_gre: Split up ip6gre_tnl_link_config()
The function ip6gre_tnl_link_config() is used for setting up
configuration of both ip6gretap and ip6erspan tunnels. Split the
function into the common part and the route-lookup part. The latter then
takes the calculated header length as an argument. This split will allow
the patches down the line to sneak in a custom header length computation
for the ERSPAN tunnel.

Fixes: 5a963eb61b ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:50:06 -04:00
Petr Machata 5691484df9 net: ip6_gre: Fix headroom request in ip6erspan_tunnel_xmit()
dev->needed_headroom is not primed until ip6_tnl_xmit(), so it starts
out zero. Thus the call to skb_cow_head() fails to actually make sure
there's enough headroom to push the ERSPAN headers to. That can lead to
the panic cited below. (Reproducer below that).

Fix by requesting either needed_headroom if already primed, or just the
bare minimum needed for the header otherwise.
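
As a rough illustration of the fallback described above (a user-space
sketch, not the kernel change itself; the struct and function names are
hypothetical stand-ins):

```c
#include <stddef.h>

/* Hypothetical stand-in for the fields involved; not the kernel structures. */
struct tunnel_dev_sketch {
	unsigned int needed_headroom; /* 0 until primed by the first transmit */
	unsigned int hlen;            /* bytes of tunnel header about to be pushed */
};

/* Headroom to request before pushing the header: the primed value if
 * there is one, otherwise just enough for the header itself. */
static unsigned int headroom_to_request(const struct tunnel_dev_sketch *d)
{
	return d->needed_headroom ? d->needed_headroom : d->hlen;
}
```

With needed_headroom still zero, the request falls back to hlen rather
than asking for no headroom at all.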

[  190.703567] kernel BUG at net/core/skbuff.c:104!
[  190.708384] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
[  190.714007] Modules linked in: act_mirred cls_matchall ip6_gre ip6_tunnel tunnel6 gre sch_ingress vrf veth x86_pkg_temp_thermal mlx_platform nfsd e1000e leds_mlxcpld
[  190.728975] CPU: 1 PID: 959 Comm: kworker/1:2 Not tainted 4.17.0-rc4-net_master-custom-139 #10
[  190.737647] Hardware name: Mellanox Technologies Ltd. "MSN2410-CB2F"/"SA000874", BIOS 4.6.5 03/08/2016
[  190.747006] Workqueue: ipv6_addrconf addrconf_dad_work
[  190.752222] RIP: 0010:skb_panic+0xc3/0x100
[  190.756358] RSP: 0018:ffff8801d54072f0 EFLAGS: 00010282
[  190.761629] RAX: 0000000000000085 RBX: ffff8801c1a8ecc0 RCX: 0000000000000000
[  190.768830] RDX: 0000000000000085 RSI: dffffc0000000000 RDI: ffffed003aa80e54
[  190.776025] RBP: ffff8801bd1ec5a0 R08: ffffed003aabce19 R09: ffffed003aabce19
[  190.783226] R10: 0000000000000001 R11: ffffed003aabce18 R12: ffff8801bf695dbe
[  190.790418] R13: 0000000000000084 R14: 00000000000006c0 R15: ffff8801bf695dc8
[  190.797621] FS:  0000000000000000(0000) GS:ffff8801d5400000(0000) knlGS:0000000000000000
[  190.805786] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  190.811582] CR2: 000055fa929aced0 CR3: 0000000003228004 CR4: 00000000001606e0
[  190.818790] Call Trace:
[  190.821264]  <IRQ>
[  190.823314]  ? ip6erspan_tunnel_xmit+0x5e4/0x1982 [ip6_gre]
[  190.828940]  ? ip6erspan_tunnel_xmit+0x5e4/0x1982 [ip6_gre]
[  190.834562]  skb_push+0x78/0x90
[  190.837749]  ip6erspan_tunnel_xmit+0x5e4/0x1982 [ip6_gre]
[  190.843219]  ? ip6gre_tunnel_ioctl+0xd90/0xd90 [ip6_gre]
[  190.848577]  ? debug_check_no_locks_freed+0x210/0x210
[  190.853679]  ? debug_check_no_locks_freed+0x210/0x210
[  190.858783]  ? print_irqtrace_events+0x120/0x120
[  190.863451]  ? sched_clock_cpu+0x18/0x210
[  190.867496]  ? cyc2ns_read_end+0x10/0x10
[  190.871474]  ? skb_network_protocol+0x76/0x200
[  190.875977]  dev_hard_start_xmit+0x137/0x770
[  190.880317]  ? do_raw_spin_trylock+0x6d/0xa0
[  190.884624]  sch_direct_xmit+0x2ef/0x5d0
[  190.888589]  ? pfifo_fast_dequeue+0x3fa/0x670
[  190.892994]  ? pfifo_fast_change_tx_queue_len+0x810/0x810
[  190.898455]  ? __lock_is_held+0xa0/0x160
[  190.902422]  __qdisc_run+0x39e/0xfc0
[  190.906041]  ? _raw_spin_unlock+0x29/0x40
[  190.910090]  ? pfifo_fast_enqueue+0x24b/0x3e0
[  190.914501]  ? sch_direct_xmit+0x5d0/0x5d0
[  190.918658]  ? pfifo_fast_dequeue+0x670/0x670
[  190.923047]  ? __dev_queue_xmit+0x172/0x1770
[  190.927365]  ? preempt_count_sub+0xf/0xd0
[  190.931421]  __dev_queue_xmit+0x410/0x1770
[  190.935553]  ? ___slab_alloc+0x605/0x930
[  190.939524]  ? print_irqtrace_events+0x120/0x120
[  190.944186]  ? memcpy+0x34/0x50
[  190.947364]  ? netdev_pick_tx+0x1c0/0x1c0
[  190.951428]  ? __skb_clone+0x2fd/0x3d0
[  190.955218]  ? __copy_skb_header+0x270/0x270
[  190.959537]  ? rcu_read_lock_sched_held+0x93/0xa0
[  190.964282]  ? kmem_cache_alloc+0x344/0x4d0
[  190.968520]  ? cyc2ns_read_end+0x10/0x10
[  190.972495]  ? skb_clone+0x123/0x230
[  190.976112]  ? skb_split+0x820/0x820
[  190.979747]  ? tcf_mirred+0x554/0x930 [act_mirred]
[  190.984582]  tcf_mirred+0x554/0x930 [act_mirred]
[  190.989252]  ? tcf_mirred_act_wants_ingress.part.2+0x10/0x10 [act_mirred]
[  190.996109]  ? __lock_acquire+0x706/0x26e0
[  191.000239]  ? sched_clock_cpu+0x18/0x210
[  191.004294]  tcf_action_exec+0xcf/0x2a0
[  191.008179]  tcf_classify+0xfa/0x340
[  191.011794]  __netif_receive_skb_core+0x8e1/0x1c60
[  191.016630]  ? debug_check_no_locks_freed+0x210/0x210
[  191.021732]  ? nf_ingress+0x500/0x500
[  191.025458]  ? process_backlog+0x347/0x4b0
[  191.029619]  ? print_irqtrace_events+0x120/0x120
[  191.034302]  ? lock_acquire+0xd8/0x320
[  191.038089]  ? process_backlog+0x1b6/0x4b0
[  191.042246]  ? process_backlog+0xc2/0x4b0
[  191.046303]  process_backlog+0xc2/0x4b0
[  191.050189]  net_rx_action+0x5cc/0x980
[  191.053991]  ? napi_complete_done+0x2c0/0x2c0
[  191.058386]  ? mark_lock+0x13d/0xb40
[  191.062001]  ? clockevents_program_event+0x6b/0x1d0
[  191.066922]  ? print_irqtrace_events+0x120/0x120
[  191.071593]  ? __lock_is_held+0xa0/0x160
[  191.075566]  __do_softirq+0x1d4/0x9d2
[  191.079282]  ? ip6_finish_output2+0x524/0x1460
[  191.083771]  do_softirq_own_stack+0x2a/0x40
[  191.087994]  </IRQ>
[  191.090130]  do_softirq.part.13+0x38/0x40
[  191.094178]  __local_bh_enable_ip+0x135/0x190
[  191.098591]  ip6_finish_output2+0x54d/0x1460
[  191.102916]  ? ip6_forward_finish+0x2f0/0x2f0
[  191.107314]  ? ip6_mtu+0x3c/0x2c0
[  191.110674]  ? ip6_finish_output+0x2f8/0x650
[  191.114992]  ? ip6_output+0x12a/0x500
[  191.118696]  ip6_output+0x12a/0x500
[  191.122223]  ? ip6_route_dev_notify+0x5b0/0x5b0
[  191.126807]  ? ip6_finish_output+0x650/0x650
[  191.131120]  ? ip6_fragment+0x1a60/0x1a60
[  191.135182]  ? icmp6_dst_alloc+0x26e/0x470
[  191.139317]  mld_sendpack+0x672/0x830
[  191.143021]  ? igmp6_mcf_seq_next+0x2f0/0x2f0
[  191.147429]  ? __local_bh_enable_ip+0x77/0x190
[  191.151913]  ipv6_mc_dad_complete+0x47/0x90
[  191.156144]  addrconf_dad_completed+0x561/0x720
[  191.160731]  ? addrconf_rs_timer+0x3a0/0x3a0
[  191.165036]  ? mark_held_locks+0xc9/0x140
[  191.169095]  ? __local_bh_enable_ip+0x77/0x190
[  191.173570]  ? addrconf_dad_work+0x50d/0xa20
[  191.177886]  ? addrconf_dad_work+0x529/0xa20
[  191.182194]  addrconf_dad_work+0x529/0xa20
[  191.186342]  ? addrconf_dad_completed+0x720/0x720
[  191.191088]  ? __lock_is_held+0xa0/0x160
[  191.195059]  ? process_one_work+0x45d/0xe20
[  191.199302]  ? process_one_work+0x51e/0xe20
[  191.203531]  ? rcu_read_lock_sched_held+0x93/0xa0
[  191.208279]  process_one_work+0x51e/0xe20
[  191.212340]  ? pwq_dec_nr_in_flight+0x200/0x200
[  191.216912]  ? get_lock_stats+0x4b/0xf0
[  191.220788]  ? preempt_count_sub+0xf/0xd0
[  191.224844]  ? worker_thread+0x219/0x860
[  191.228823]  ? do_raw_spin_trylock+0x6d/0xa0
[  191.233142]  worker_thread+0xeb/0x860
[  191.236848]  ? process_one_work+0xe20/0xe20
[  191.241095]  kthread+0x206/0x300
[  191.244352]  ? process_one_work+0xe20/0xe20
[  191.248587]  ? kthread_stop+0x570/0x570
[  191.252459]  ret_from_fork+0x3a/0x50
[  191.256082] Code: 14 3e ff 8b 4b 78 55 4d 89 f9 41 56 41 55 48 c7 c7 a0 cf db 82 41 54 44 8b 44 24 2c 48 8b 54 24 30 48 8b 74 24 20 e8 16 94 13 ff <0f> 0b 48 c7 c7 60 8e 1f 85 48 83 c4 20 e8 55 ef a6 ff 89 74 24
[  191.275327] RIP: skb_panic+0xc3/0x100 RSP: ffff8801d54072f0
[  191.281024] ---[ end trace 7ea51094e099e006 ]---
[  191.285724] Kernel panic - not syncing: Fatal exception in interrupt
[  191.292168] Kernel Offset: disabled
[  191.295697] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

Reproducer:

	ip link add h1 type veth peer name swp1
	ip link add h3 type veth peer name swp3

	ip link set dev h1 up
	ip address add 192.0.2.1/28 dev h1

	ip link add dev vh3 type vrf table 20
	ip link set dev h3 master vh3
	ip link set dev vh3 up
	ip link set dev h3 up

	ip link set dev swp3 up
	ip address add dev swp3 2001:db8:2::1/64

	ip link set dev swp1 up
	tc qdisc add dev swp1 clsact

	ip link add name gt6 type ip6erspan \
		local 2001:db8:2::1 remote 2001:db8:2::2 oseq okey 123
	ip link set dev gt6 up

	sleep 1

	tc filter add dev swp1 ingress pref 1000 matchall skip_hw \
		action mirred egress mirror dev gt6
	ping -I h1 192.0.2.2

Fixes: e41c7c68ea ("ip6erspan: make sure enough headroom at xmit.")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:50:06 -04:00
Petr Machata 01b8d064d5 net: ip6_gre: Request headroom in __gre6_xmit()
__gre6_xmit() pushes GRE headers before handing over to ip6_tnl_xmit()
for generic IP-in-IP processing. However it doesn't make sure that there
is enough headroom to push the header to. That can lead to the panic
cited below. (Reproducer below that).

Fix by requesting either needed_headroom if already primed, or just the
bare minimum needed for the header otherwise.
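
To make the failure mode concrete, here is a toy user-space buffer (all
names hypothetical, unrelated to the real sk_buff layout) in which a push
is only valid once enough headroom has been ensured:

```c
#include <stdlib.h>
#include <string.h>

/* Toy packet buffer: [head, data) is headroom, [data, data + len) is payload. */
struct toy_buf {
	unsigned char *head;
	unsigned char *data;
	size_t len;
};

static size_t toy_headroom(const struct toy_buf *b)
{
	return (size_t)(b->data - b->head);
}

/* Grow headroom to at least `need` bytes; returns 0 on success, -1 on failure. */
static int toy_ensure_headroom(struct toy_buf *b, size_t need)
{
	unsigned char *nhead;

	if (toy_headroom(b) >= need)
		return 0;
	nhead = malloc(need + b->len);
	if (!nhead)
		return -1;
	memcpy(nhead + need, b->data, b->len);
	free(b->head);
	b->head = nhead;
	b->data = nhead + need;
	return 0;
}

/* Prepend hdr_len bytes; returns NULL when the headroom is too short
 * (the toy equivalent of the panic in the trace). */
static unsigned char *toy_push(struct toy_buf *b, size_t hdr_len)
{
	if (toy_headroom(b) < hdr_len)
		return NULL;
	b->data -= hdr_len;
	b->len += hdr_len;
	return b->data;
}
```

Calling toy_ensure_headroom() before toy_push() mirrors the order of
operations the fix establishes: request headroom first, then push.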

[  158.576725] kernel BUG at net/core/skbuff.c:104!
[  158.581510] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
[  158.587174] Modules linked in: act_mirred cls_matchall ip6_gre ip6_tunnel tunnel6 gre sch_ingress vrf veth x86_pkg_temp_thermal mlx_platform nfsd e1000e leds_mlxcpld
[  158.602268] CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 4.17.0-rc4-net_master-custom-139 #10
[  158.610938] Hardware name: Mellanox Technologies Ltd. "MSN2410-CB2F"/"SA000874", BIOS 4.6.5 03/08/2016
[  158.620426] RIP: 0010:skb_panic+0xc3/0x100
[  158.624586] RSP: 0018:ffff8801d3f27110 EFLAGS: 00010286
[  158.629882] RAX: 0000000000000082 RBX: ffff8801c02cc040 RCX: 0000000000000000
[  158.637127] RDX: 0000000000000082 RSI: dffffc0000000000 RDI: ffffed003a7e4e18
[  158.644366] RBP: ffff8801bfec8020 R08: ffffed003aabce19 R09: ffffed003aabce19
[  158.651574] R10: 000000000000000b R11: ffffed003aabce18 R12: ffff8801c364de66
[  158.658786] R13: 000000000000002c R14: 00000000000000c0 R15: ffff8801c364de68
[  158.666007] FS:  0000000000000000(0000) GS:ffff8801d5400000(0000) knlGS:0000000000000000
[  158.674212] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  158.680036] CR2: 00007f4b3702dcd0 CR3: 0000000003228002 CR4: 00000000001606e0
[  158.687228] Call Trace:
[  158.689752]  ? __gre6_xmit+0x246/0xd80 [ip6_gre]
[  158.694475]  ? __gre6_xmit+0x246/0xd80 [ip6_gre]
[  158.699141]  skb_push+0x78/0x90
[  158.702344]  __gre6_xmit+0x246/0xd80 [ip6_gre]
[  158.706872]  ip6gre_tunnel_xmit+0x3bc/0x610 [ip6_gre]
[  158.711992]  ? __gre6_xmit+0xd80/0xd80 [ip6_gre]
[  158.716668]  ? debug_check_no_locks_freed+0x210/0x210
[  158.721761]  ? print_irqtrace_events+0x120/0x120
[  158.726461]  ? sched_clock_cpu+0x18/0x210
[  158.730572]  ? sched_clock_cpu+0x18/0x210
[  158.734692]  ? cyc2ns_read_end+0x10/0x10
[  158.738705]  ? skb_network_protocol+0x76/0x200
[  158.743216]  ? netif_skb_features+0x1b2/0x550
[  158.747648]  dev_hard_start_xmit+0x137/0x770
[  158.752010]  sch_direct_xmit+0x2ef/0x5d0
[  158.755992]  ? pfifo_fast_dequeue+0x3fa/0x670
[  158.760460]  ? pfifo_fast_change_tx_queue_len+0x810/0x810
[  158.765975]  ? __lock_is_held+0xa0/0x160
[  158.770002]  __qdisc_run+0x39e/0xfc0
[  158.773673]  ? _raw_spin_unlock+0x29/0x40
[  158.777781]  ? pfifo_fast_enqueue+0x24b/0x3e0
[  158.782191]  ? sch_direct_xmit+0x5d0/0x5d0
[  158.786372]  ? pfifo_fast_dequeue+0x670/0x670
[  158.790818]  ? __dev_queue_xmit+0x172/0x1770
[  158.795195]  ? preempt_count_sub+0xf/0xd0
[  158.799313]  __dev_queue_xmit+0x410/0x1770
[  158.803512]  ? ___slab_alloc+0x605/0x930
[  158.807525]  ? ___slab_alloc+0x605/0x930
[  158.811540]  ? memcpy+0x34/0x50
[  158.814768]  ? netdev_pick_tx+0x1c0/0x1c0
[  158.818895]  ? __skb_clone+0x2fd/0x3d0
[  158.822712]  ? __copy_skb_header+0x270/0x270
[  158.827079]  ? rcu_read_lock_sched_held+0x93/0xa0
[  158.831903]  ? kmem_cache_alloc+0x344/0x4d0
[  158.836199]  ? skb_clone+0x123/0x230
[  158.839869]  ? skb_split+0x820/0x820
[  158.843521]  ? tcf_mirred+0x554/0x930 [act_mirred]
[  158.848407]  tcf_mirred+0x554/0x930 [act_mirred]
[  158.853104]  ? tcf_mirred_act_wants_ingress.part.2+0x10/0x10 [act_mirred]
[  158.860005]  ? __lock_acquire+0x706/0x26e0
[  158.864162]  ? mark_lock+0x13d/0xb40
[  158.867832]  tcf_action_exec+0xcf/0x2a0
[  158.871736]  tcf_classify+0xfa/0x340
[  158.875402]  __netif_receive_skb_core+0x8e1/0x1c60
[  158.880334]  ? nf_ingress+0x500/0x500
[  158.884059]  ? process_backlog+0x347/0x4b0
[  158.888241]  ? lock_acquire+0xd8/0x320
[  158.892050]  ? process_backlog+0x1b6/0x4b0
[  158.896228]  ? process_backlog+0xc2/0x4b0
[  158.900291]  process_backlog+0xc2/0x4b0
[  158.904210]  net_rx_action+0x5cc/0x980
[  158.908047]  ? napi_complete_done+0x2c0/0x2c0
[  158.912525]  ? rcu_read_unlock+0x80/0x80
[  158.916534]  ? __lock_is_held+0x34/0x160
[  158.920541]  __do_softirq+0x1d4/0x9d2
[  158.924308]  ? trace_event_raw_event_irq_handler_exit+0x140/0x140
[  158.930515]  run_ksoftirqd+0x1d/0x40
[  158.934152]  smpboot_thread_fn+0x32b/0x690
[  158.938299]  ? sort_range+0x20/0x20
[  158.941842]  ? preempt_count_sub+0xf/0xd0
[  158.945940]  ? schedule+0x5b/0x140
[  158.949412]  kthread+0x206/0x300
[  158.952689]  ? sort_range+0x20/0x20
[  158.956249]  ? kthread_stop+0x570/0x570
[  158.960164]  ret_from_fork+0x3a/0x50
[  158.963823] Code: 14 3e ff 8b 4b 78 55 4d 89 f9 41 56 41 55 48 c7 c7 a0 cf db 82 41 54 44 8b 44 24 2c 48 8b 54 24 30 48 8b 74 24 20 e8 16 94 13 ff <0f> 0b 48 c7 c7 60 8e 1f 85 48 83 c4 20 e8 55 ef a6 ff 89 74 24
[  158.983235] RIP: skb_panic+0xc3/0x100 RSP: ffff8801d3f27110
[  158.988935] ---[ end trace 5af56ee845aa6cc8 ]---
[  158.993641] Kernel panic - not syncing: Fatal exception in interrupt
[  159.000176] Kernel Offset: disabled
[  159.003767] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

Reproducer:

	ip link add h1 type veth peer name swp1
	ip link add h3 type veth peer name swp3

	ip link set dev h1 up
	ip address add 192.0.2.1/28 dev h1

	ip link add dev vh3 type vrf table 20
	ip link set dev h3 master vh3
	ip link set dev vh3 up
	ip link set dev h3 up

	ip link set dev swp3 up
	ip address add dev swp3 2001:db8:2::1/64

	ip link set dev swp1 up
	tc qdisc add dev swp1 clsact

	ip link add name gt6 type ip6gretap \
		local 2001:db8:2::1 remote 2001:db8:2::2
	ip link set dev gt6 up

	sleep 1

	tc filter add dev swp1 ingress pref 1000 matchall skip_hw \
		action mirred egress mirror dev gt6
	ping -I h1 192.0.2.2

Fixes: c12b395a46 ("gre: Support GRE over IPv6")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:50:06 -04:00
Roman Mashak 7c5995b33d tc-testing: fixed copy-pasting error in ife tests
Reported-by: Vlad Buslov <vladbu@mellanox.com>
Reported-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:31:43 -04:00
Dan Carpenter 990a9d4975 net/ncsi: prevent a couple array underflows
We recently refactored this code and introduced a static checker
warning.  Smatch complains that if cmd->index is zero then we would
underflow the arrays.  That's obviously true.

The question is whether we prevent cmd->index from being zero at a
different level.  I've looked at the code and I don't immediately see
a check for that.
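
A minimal sketch of the kind of guard this calls for, assuming the index
is 1-based and converted to an array slot by subtracting one (the function
name and signature are hypothetical, not the NCSI code):

```c
#include <errno.h>
#include <stddef.h>

/* Map a 1-based index to a 0-based slot, rejecting values that would
 * underflow (index == 0) or run past the end of the table. */
static int filter_slot(unsigned int index, size_t table_len, size_t *slot)
{
	if (index == 0 || index > table_len)
		return -ERANGE;
	*slot = index - 1;
	return 0;
}
```

Rejecting zero up front means the subtraction can never wrap.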

Fixes: 062b3e1b6d ("net/ncsi: Refactor MAC, VLAN filters")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:27:39 -04:00
Eric Dumazet be7f3e5999 net/smc: init conn.tx_work & conn.send_lock sooner
syzkaller found that the following program crashes the host:

{
  int fd = socket(AF_SMC, SOCK_STREAM, 0);
  int val = 1;

  listen(fd, 0);
  shutdown(fd, SHUT_RDWR);
  setsockopt(fd, 6, TCP_NODELAY, &val, 4);
}

Simply initialize conn.tx_work & conn.send_lock at socket creation,
rather than deeper in the stack.

ODEBUG: assert_init not available (active state 0) object type: timer_list hint:           (null)
WARNING: CPU: 1 PID: 13988 at lib/debugobjects.c:329 debug_print_object+0x16a/0x210 lib/debugobjects.c:326
Kernel panic - not syncing: panic_on_warn set ...

CPU: 1 PID: 13988 Comm: syz-executor0 Not tainted 4.17.0-rc4+ #46
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1b9/0x294 lib/dump_stack.c:113
 panic+0x22f/0x4de kernel/panic.c:184
 __warn.cold.8+0x163/0x1b3 kernel/panic.c:536
 report_bug+0x252/0x2d0 lib/bug.c:186
 fixup_bug arch/x86/kernel/traps.c:178 [inline]
 do_error_trap+0x1de/0x490 arch/x86/kernel/traps.c:296
 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:992
RIP: 0010:debug_print_object+0x16a/0x210 lib/debugobjects.c:326
RSP: 0018:ffff880197a37880 EFLAGS: 00010086
RAX: 0000000000000061 RBX: 0000000000000005 RCX: ffffc90001ed0000
RDX: 0000000000004aaf RSI: ffffffff8160f6f1 RDI: 0000000000000001
RBP: ffff880197a378c0 R08: ffff8801aa7a0080 R09: ffffed003b5e3eb2
R10: ffffed003b5e3eb2 R11: ffff8801daf1f597 R12: 0000000000000001
R13: ffffffff88d96980 R14: ffffffff87fa19a0 R15: ffffffff81666ec0
 debug_object_assert_init+0x309/0x500 lib/debugobjects.c:692
 debug_timer_assert_init kernel/time/timer.c:724 [inline]
 debug_assert_init kernel/time/timer.c:776 [inline]
 del_timer+0x74/0x140 kernel/time/timer.c:1198
 try_to_grab_pending+0x439/0x9a0 kernel/workqueue.c:1223
 mod_delayed_work_on+0x91/0x250 kernel/workqueue.c:1592
 mod_delayed_work include/linux/workqueue.h:541 [inline]
 smc_setsockopt+0x387/0x6d0 net/smc/af_smc.c:1367
 __sys_setsockopt+0x1bd/0x390 net/socket.c:1903
 __do_sys_setsockopt net/socket.c:1914 [inline]
 __se_sys_setsockopt net/socket.c:1911 [inline]
 __x64_sys_setsockopt+0xbe/0x150 net/socket.c:1911
 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: 01d2f7e2cd ("net/smc: sockopts TCP_NODELAY and TCP_CORK")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ursula Braun <ubraun@linux.ibm.com>
Cc: linux-s390@vger.kernel.org
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:25:35 -04:00
Jiri Pirko 3b734ff604 nfp: flower: fix error path during representor creation
Don't store repr pointer to reprs array until the representor is
successfully created. This avoids message about "representor
destruction" even when it was never created. Also it cleans up the flow.
Also, check return value after port alloc.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:23:29 -04:00
David S. Miller 6d9f868fc7 Merge branch 'mvpp2-small-improvements'
Antoine Tenart says:

====================
net: mvpp2: small improvements

Those 3 patches are small improvements to the Marvell PPv2 driver. The
series does not conflict with the one sent about phylink and
1000/2500baseX support, so the two series can live in parallel.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:18:55 -04:00
Yan Markman 934e0f8330 net: mvpp2: print rx error with rate-limit
Prevent flood of RX error prints during heavy traffic with weak signal
in link by checking net_ratelimit() before using netdev_err().
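
The idea of gating a log call behind a rate limiter can be sketched in
user space as follows. This is a toy limiter with hypothetical names, not
how net_ratelimit() is implemented; the caller passes the current time in
ticks so the behaviour is deterministic:

```c
#include <stdbool.h>

/* Toy limiter: at most `burst` messages per window of `interval` ticks. */
struct toy_ratelimit {
	unsigned long interval;
	unsigned int burst;
	unsigned long window_start;
	unsigned int printed;
};

/* Returns true when the caller may print, false when it should stay quiet. */
static bool toy_ratelimit_ok(struct toy_ratelimit *rl, unsigned long now)
{
	if (now - rl->window_start >= rl->interval) {
		/* A new window begins: reset the message count. */
		rl->window_start = now;
		rl->printed = 0;
	}
	if (rl->printed >= rl->burst)
		return false;
	rl->printed++;
	return true;
}
```

A caller would wrap its error print in a check of toy_ratelimit_ok(), so
that a burst of RX errors produces only a bounded number of messages.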

Signed-off-by: Yan Markman <ymarkman@marvell.com>
[Antoine: small rework, commit message]
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:18:55 -04:00
Yan Markman 5b0ab2f41d net: mvpp2: set mac address does not require the stop/start sequence
Remove special stop/start handling from the set_mac_address callback.
All this special care is not needed, and can be removed. It also
simplifies the up/down status in the driver and helps avoid possible
link status mismatch issues.

Signed-off-by: Yan Markman <ymarkman@marvell.com>
[Antoine: commit message]
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:18:54 -04:00
Yan Markman 914365f1c9 net: mvpp2: avoid checking for free aggregated descriptors twice
Avoid repeating the check for free aggregated descriptors when it
already failed at the beginning of the function.

Signed-off-by: Yan Markman <ymarkman@marvell.com>
[Antoine: commit message]
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:18:54 -04:00
Jesper Dangaard Brouer deea81228b selftests/bpf: check return value of fopen in test_verifier.c
Commit 0a67487403 ("selftests/bpf: Only run tests if !bpf_disabled")
forgot to check return value of fopen.

This caused some confusion when running test_verifier (from
tools/testing/selftests/bpf/) on an older kernel (< v4.4), as it will
simply segfault.

This fix avoids the segfault and prints an error, but allows the program to
continue.  Given the sysctl was introduced in 1be7f75d16 ("bpf:
enable non-root eBPF programs"), we know that the running kernel
cannot support unpriv, thus continue with unpriv_disabled = true.
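
The pattern described here can be sketched roughly as below. This is an
illustration only: the function name is hypothetical, the path is supplied
by the caller, and the parsing is simplified to checking the first
character of the file:

```c
#include <stdbool.h>
#include <stdio.h>

/* Returns true when unprivileged mode should be treated as disabled.
 * A file that cannot be opened is taken to mean the running kernel
 * lacks the sysctl, so the answer is "disabled" rather than a crash. */
static bool unpriv_disabled_from(const char *path)
{
	char buf[2];
	bool disabled = false;
	FILE *f = fopen(path, "r");

	if (!f) {
		perror("fopen");
		return true;
	}
	/* Any leading character other than '0' counts as disabled. */
	if (fgets(buf, sizeof(buf), f) == buf && buf[0] != '0')
		disabled = true;
	fclose(f);
	return disabled;
}
```

The key point is that the FILE pointer is checked before fgets() is ever
called on it.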

Fixes: 0a67487403 ("selftests/bpf: Only run tests if !bpf_disabled")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-17 22:18:46 +02:00
David S. Miller 808e2fc3b0 Merge branch 'mvpp2-phylink-conversion'
Antoine Tenart says:

====================
net: mvpp2: phylink conversion

This series converts the Marvell PPv2 driver to phylink (models the MAC
to PHY link).

One important point is the PPv2 driver supports two probe modes: device
tree and ACPI. This series only brings phylink support for the device
tree mode, as the ACPI one will need further work. Still, the driver
should be working as before when using ACPI. This split should be
temporary, and was discussed with Marcin (in Cc.) who added ACPI support
to the driver.

Also, the SFP cages on both DB boards can be considered as non-wired.
We thus chose not to describe those SFP cages and we use fixed-link.

The rest of the series uses phylink to add support for 1000BaseX and
2500BaseX modes in the PPv2 driver. To do this, two patches are needed
in the common PHY framework (patches 3 and 4). The last 4 patches modify
the device tree to use the new PPv2 functionalities.

The series has been tested for the device tree mode on the 7040-db,
8040-db and 8040-mcbin boards, to ensure all the interfaces were working
as expected.

@Dave: patches 7 to 10 should go through the mvebu tree (Gregory in
Cc.) to avoid any conflict with the other mvebu dt patches taken during
this cycle.

The series is based on today's net-next.

Since v2:
  - Removed the SFP description from the DB boards, as their SFP cages
    are not wired properly. We now use fixed-link.
  - Because of this rework, split the series in two, so that the SFP
    part is reviewed separately.
  - Small fixes in the phylink patch.
  - Rebased on the latest net-next branch.

Since v1:
  - Chose a different approach to the SFP changes, as the previous ones
    weren't valid, and reworked both DB boards' device trees.
  - Misc fixes.
  - Added Kishon's acked-by on one patch.
  - Rebased on latest net-next branch.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:11:40 -04:00
Antoine Tenart a6fe31de86 net: mvpp2: 2500baseX support
This patch adds the 2500Base-X PHY mode support in the Marvell PPv2
driver. 2500Base-X is quite close to 1000Base-X and SGMII modes and uses
nearly the same code path.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17 16:11:40 -04:00