Commit Graph

546238 Commits

Author SHA1 Message Date
Michael S. Tsirkin 3ea79249e8 macvtap: fix TUNSETSNDBUF values > 64k
Upon TUNSETSNDBUF,  macvtap reads the requested sndbuf size into
a local variable u.
commit 39ec7de709 ("macvtap: fix uninitialized access on
TUNSETIFF") changed its type to u16 (which is the right thing to
do for all other macvtap ioctls), breaking all values > 64k.

The value of TUNSETSNDBUF is actually a signed 32 bit integer, so
the right thing to do is to read it into an int.

Cc: David S. Miller <davem@davemloft.net>
Fixes: 39ec7de709 ("macvtap: fix uninitialized access on TUNSETIFF")
Reported-by: Mark A. Peloquin
Bisected-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by:  Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-20 22:44:39 -07:00
Nicolas Dichtel 83cf9a2521 ip6tunnel: make rx/tx bytes counters consistent
Like the previous patch, which fixes ipv4 tunnels, here is the ipv6 part.

Before the patch, the external ipv6 header + gre header were included on
tx.

After the patch:
$ ping -c1 192.168.6.121 ; ip -s l ls dev ip6gre1
PING 192.168.6.121 (192.168.6.121) 56(84) bytes of data.
64 bytes from 192.168.6.121: icmp_req=1 ttl=64 time=1.92 ms

--- 192.168.6.121 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.923/1.923/1.923/0.000 ms
7: ip6gre1@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1440 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/gre6 20:01:06:60:30:08:c1:c3:00:00:00:00:00:00:01:23 peer 20:01:06:60:30:08:c1:c3:00:00:00:00:00:00:01:21
    RX: bytes  packets  errors  dropped overrun mcast
    84         1        0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    84         1        0       0       0       0
$ ping -c1 192.168.1.121 ; ip -s l ls dev ip6tnl1
PING 192.168.1.121 (192.168.1.121) 56(84) bytes of data.
64 bytes from 192.168.1.121: icmp_req=1 ttl=64 time=2.28 ms

--- 192.168.1.121 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 2.288/2.288/2.288/0.000 ms
8: ip6tnl1@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1452 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/tunnel6 2001:660:3008:c1c3::123 peer 2001:660:3008:c1c3::121
    RX: bytes  packets  errors  dropped overrun mcast
    84         1        0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    84         1        0       0       0       0

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-20 22:36:42 -07:00
Nicolas Dichtel bc22a0e2ea iptunnel: make rx/tx bytes counters consistent
This was already done a long time ago in
commit 64194c31a0 ("inet: Make tunnel RX/TX byte counters more consistent")
but tx path was broken (at least since 3.10).

Before the patch the gre header was included on tx.

After the patch:
$ ping -c1 192.168.0.121 ; ip -s l ls dev gre1
PING 192.168.0.121 (192.168.0.121) 56(84) bytes of data.
64 bytes from 192.168.0.121: icmp_req=1 ttl=64 time=2.95 ms

--- 192.168.0.121 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 2.955/2.955/2.955/0.000 ms
7: gre1@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1468 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/gre 10.16.0.249 peer 10.16.0.121
    RX: bytes  packets  errors  dropped overrun mcast
    84         1        0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    84         1        0       0       0       0

Reported-by: Julien Meunier <julien.meunier@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-20 22:36:22 -07:00
David S. Miller ac81374493 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patch contains Netfilter fixes for your net tree, they are:

1) nf_log_unregister() should only set to NULL the logger that is being
   unregistered, instead of everything else. Patch from Florian Westphal.

2) Fix a crash when accessing physoutdev from PREROUTING in br_netfilter.
   This is partially reverting the patch to shrink nf_bridge_info to 32 bytes.
   Also from Florian.

3) Use existing match/target extensions in the internal nft_compat extension
   lists when the extension is family unspecific (ie. NFPROTO_UNSPEC).

4) Wait for rcu grace period before leaving nf_log_unregister().
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-20 22:32:20 -07:00
Erik Hugne 4e3ae00100 tipc: reinitialize pointer after skb linearize
The msg pointer into header may change after skb linearization.
We must reinitialize it after calling skb_linearize to prevent
operating on a freed or invalid pointer.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reported-by: Tamás Végh <tamas.vegh@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-20 22:31:20 -07:00
Kevin Hao aab0c0e62e Revert "net/phy: Add Vitesse 8641 phy ID"
This reverts commit 1298267b54.

That commit claim that the Vitesse VSC8641 is compatible with Vitesse
82xx. But this is not true. It seems that all the registers used
in Vitesse phy driver are not compatible between 8641 and 82xx.
It does cause malfunction of the Ethernet on p1010rdb-pa board.
So we definitely need a rework in order to support the 8641 phy
in this driver.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-20 22:29:34 -07:00
David Woodhouse 7a8a8e75d5 8139cp: Call __cp_set_rx_mode() from cp_tx_timeout()
Unless we reset the RX config, on real hardware I don't seem to receive
any packets after a TX timeout.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-20 22:23:40 -07:00
David Woodhouse fc27bd115b 8139cp: Use dev_kfree_skb_any() instead of dev_kfree_skb() in cp_clean_rings()
This can be called from cp_tx_timeout() with interrupts disabled.
Spotted by Francois Romieu <romieu@fr.zoreil.com>

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-20 22:23:40 -07:00
Nikola Forró 0315e38270 net: Fix behaviour of unreachable, blackhole and prohibit routes
Man page of ip-route(8) says following about route types:

  unreachable - these destinations are unreachable.  Packets are dis‐
  carded and the ICMP message host unreachable is generated.  The local
  senders get an EHOSTUNREACH error.

  blackhole - these destinations are unreachable.  Packets are dis‐
  carded silently.  The local senders get an EINVAL error.

  prohibit - these destinations are unreachable.  Packets are discarded
  and the ICMP message communication administratively prohibited is
  generated.  The local senders get an EACCES error.

In the inet6 address family, this was correct, except the local senders
got ENETUNREACH error instead of EHOSTUNREACH in case of unreachable route.
In the inet address family, all three route types generated ICMP message
net unreachable, and the local senders got ENETUNREACH error.

In both address families all three route types now behave consistently
with documentation.

Signed-off-by: Nikola Forró <nforro@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-20 21:45:08 -07:00
Ivan Vecera ba5ca7848b bna: check for dma mapping errors
Check for DMA mapping errors, recover from them and register them in
ethtool stats like other errors.

Cc: Rasesh Mody <rasesh.mody@qlogic.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Rasesh Mody <rasesh.mody@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-20 21:36:57 -07:00
Eric Dumazet c2e7204d18 tcp_cubic: do not set epoch_start in the future
Tracking idle time in bictcp_cwnd_event() is imprecise, as epoch_start
is normally set at ACK processing time, not at send time.

Doing a proper fix would need to add an additional state variable,
and does not seem worth the trouble, given CUBIC bug has been there
forever before Jana noticed it.

Let's simply not set epoch_start in the future, otherwise
bictcp_update() could overflow and CUBIC would again
grow cwnd too fast.

This was detected thanks to a packetdrill test Neal wrote that was flaky
before applying this fix.

Fixes: 30927520db ("tcp_cubic: better follow cubic curve after idle period")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Cc: Jana Iyengar <jri@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 22:35:07 -07:00
Taku Izumi adb094e5e7 fjes: fix off-by-one error at fjes_hw_update_zone_task()
Dan Carpenter reported off-by-one error of fjes at
http://www.mail-archive.com/netdev@vger.kernel.org/msg77520.html

Actually this is a bug.
ep_shm_info[epidx].{es_status, zone} should be update
inside for loop.

This patch fixes this bug.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 22:34:09 -07:00
Jiri Benc 6cf3564214 MAINTAINERS: remove bouncing email address for qlcnic
I got this automated message from <shahed.shaikh@qlogic.com> when submitting
a qlcnic patch:

> Shahed Shaikh is no longer with QLogic. If you require assistance please
> contact Ariel Elior Ariel.Elior@qlogic.com

There's no point in having a bouncing address in MAINTAINERS.

CC: Dept-GELinuxNICDev@qlogic.com
CC: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 22:33:24 -07:00
David S. Miller 1bdc0b109b Merge branch 'vxlan-fixes'
Jiri Benc says:

====================
vxlan fixes

This fixes various issues with vxlan related to IPv6.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 22:32:16 -07:00
Jiri Benc ac7eccd4d4 bnx2x: track vxlan port count
The callback for adding vxlan port can be called with the same port for
both IPv4 and IPv6. Do not disable the offloading when the same port for
both protocols is added and later one of them removed.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 22:32:16 -07:00
Jiri Benc 1e5b311ab2 be2net: allow offloading with the same port for IPv4 and IPv6
The callback for adding vxlan port can be called with the same port for both
IPv4 and IPv6. Do not disable the offloading if this occurs.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Sathya Perla <sathya.perla@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 22:32:16 -07:00
Jiri Benc 378fddc281 qlcnic: track vxlan port count
The callback for adding vxlan port can be called with the same port for
both IPv4 and IPv6. Do not disable the offloading when the same port for
both protocols is added and later one of them removed.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 22:32:16 -07:00
Jiri Benc 057ba29bbe vxlan: reject IPv6 addresses if IPv6 is not configured
When IPv6 address is set without IPv6 configured, the vxlan socket is mostly
treated as an IPv4 one but various lookus in fdb etc. still take the
AF_INET6 into account. This creates incosistencies with weird consequences.

Just reject IPv6 addresses in such case.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 22:32:15 -07:00
Jiri Benc 9dc2ad1008 vxlan: set needed headroom correctly
vxlan_setup is called when allocating the net_device, i.e. way before
vxlan_newlink (or vxlan_dev_configure) is called. This means
vxlan->default_dst is actually unset in vxlan_setup and the condition that
sets needed_headroom always takes the else branch.

Set the needed_headrom at the point when we have the information about
the address family available.

Fixes: e4c7ed4153 ("vxlan: add ipv6 support")
Fixes: 2853af6a2e ("vxlan: use dev->needed_headroom instead of dev->hard_header_len")
CC: Cong Wang <cwang@twopensource.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 22:32:15 -07:00
Michael Grzeschik c38f6ac74c MAINTAINERS: add arcnet and take maintainership
Add entry for arcnet to MAINTAINERS file and add myself as the
maintainer of the subsystem.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Cc: davem@davemloft.net
Cc: joe@perches.com
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 22:26:55 -07:00
Michael Grzeschik 980137a203 ARCNET: fix hard_header_len limit
For arcnet the bare minimum header only contains the 4 bytes to
specify source, dest and offset (1, 1 and 2 bytes respectively).
The corresponding struct is struct arc_hardware.

The struct archdr contains additionally a union of possible soft
headers. When doing $insertusecasehere packets might well
include short (or even no?) soft headers.

For this reason only use arc_hardware instead of archdr to
determine the hard_header_len for an arcnet device.

Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 22:26:55 -07:00
David S. Miller 1dbb2413cb Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Johan Hedberg says:

====================
pull request: bluetooth 2015-09-17

Here's one important patch for the 4.3-rc series that fixes an issue
with Bluetooth LE encryption failing because of a too early check for
the SMP context.

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 22:25:51 -07:00
Sasha Levin 34f5b00664 atm: deal with setting entry before mkip was called
If we didn't call ATMARP_MKIP before ATMARP_ENCAP the VCC descriptor is
non-existant and we'll end up dereferencing a NULL ptr:

[1033173.491930] kasan: GPF could be caused by NULL-ptr deref or user memory accessirq event stamp: 123386
[1033173.493678] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
[1033173.493689] Modules linked in:
[1033173.493697] CPU: 9 PID: 23815 Comm: trinity-c64 Not tainted 4.2.0-next-20150911-sasha-00043-g353d875-dirty #2545
[1033173.493706] task: ffff8800630c4000 ti: ffff880063110000 task.ti: ffff880063110000
[1033173.493823] RIP: clip_ioctl (net/atm/clip.c:320 net/atm/clip.c:689)
[1033173.493826] RSP: 0018:ffff880063117a88  EFLAGS: 00010203
[1033173.493828] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 000000000000000c
[1033173.493830] RDX: 0000000000000002 RSI: ffffffffb3f10720 RDI: 0000000000000014
[1033173.493832] RBP: ffff880063117b80 R08: ffff88047574d9a4 R09: 0000000000000000
[1033173.493834] R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff1000c622f53
[1033173.493836] R13: ffff8800cb905500 R14: ffff8808d6da2000 R15: 00000000fffffdfd
[1033173.493840] FS:  00007fa56b92d700(0000) GS:ffff880478000000(0000) knlGS:0000000000000000
[1033173.493843] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[1033173.493845] CR2: 0000000000000000 CR3: 00000000630e8000 CR4: 00000000000006a0
[1033173.493855] Stack:
[1033173.493862]  ffffffffb0b60444 000000000000eaea 0000000041b58ab3 ffffffffb3c3ce32
[1033173.493867]  ffffffffb0b6f3e0 ffffffffb0b60444 ffffffffb5ea2e50 1ffff1000c622f5e
[1033173.493873]  ffff8800630c4cd8 00000000000ee09a ffffffffb3ec4888 ffffffffb5ea2de8
[1033173.493874] Call Trace:
[1033173.494108] do_vcc_ioctl (net/atm/ioctl.c:170)
[1033173.494113] vcc_ioctl (net/atm/ioctl.c:189)
[1033173.494116] svc_ioctl (net/atm/svc.c:605)
[1033173.494200] sock_do_ioctl (net/socket.c:874)
[1033173.494204] sock_ioctl (net/socket.c:958)
[1033173.494244] do_vfs_ioctl (fs/ioctl.c:43 fs/ioctl.c:607)
[1033173.494290] SyS_ioctl (fs/ioctl.c:622 fs/ioctl.c:613)
[1033173.494295] entry_SYSCALL_64_fastpath (arch/x86/entry/entry_64.S:186)
[1033173.494362] Code: fa 48 c1 ea 03 80 3c 02 00 0f 85 50 09 00 00 49 8b 9e 60 06 00 00 48 b8 00 00 00 00 00 fc ff df 48 8d 7b 14 48 89 fa 48 c1 ea 03 <0f> b6 04 02 48 89 fa 83 e2 07 38 d0 7f 08 84 c0 0f 85 14 09 00
All code

========
   0:   fa                      cli
   1:   48 c1 ea 03             shr    $0x3,%rdx
   5:   80 3c 02 00             cmpb   $0x0,(%rdx,%rax,1)
   9:   0f 85 50 09 00 00       jne    0x95f
   f:   49 8b 9e 60 06 00 00    mov    0x660(%r14),%rbx
  16:   48 b8 00 00 00 00 00    movabs $0xdffffc0000000000,%rax
  1d:   fc ff df
  20:   48 8d 7b 14             lea    0x14(%rbx),%rdi
  24:   48 89 fa                mov    %rdi,%rdx
  27:   48 c1 ea 03             shr    $0x3,%rdx
  2b:*  0f b6 04 02             movzbl (%rdx,%rax,1),%eax               <-- trapping instruction
  2f:   48 89 fa                mov    %rdi,%rdx
  32:   83 e2 07                and    $0x7,%edx
  35:   38 d0                   cmp    %dl,%al
  37:   7f 08                   jg     0x41
  39:   84 c0                   test   %al,%al
  3b:   0f 85 14 09 00 00       jne    0x955

Code starting with the faulting instruction
===========================================
   0:   0f b6 04 02             movzbl (%rdx,%rax,1),%eax
   4:   48 89 fa                mov    %rdi,%rdx
   7:   83 e2 07                and    $0x7,%edx
   a:   38 d0                   cmp    %dl,%al
   c:   7f 08                   jg     0x16
   e:   84 c0                   test   %al,%al
  10:   0f 85 14 09 00 00       jne    0x92a
[1033173.494366] RIP clip_ioctl (net/atm/clip.c:320 net/atm/clip.c:689)
[1033173.494368]  RSP <ffff880063117a88>

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 22:13:32 -07:00
Florian Westphal 1d325d217c ipv6: ip6_fragment: fix headroom tests and skb leak
David Woodhouse reports skb_under_panic when we try to push ethernet
header to fragmented ipv6 skbs:

 skbuff: skb_under_panic: text:c1277f1e len:1294 put:14 head:dec98000
 data:dec97ffc tail:0xdec9850a end:0xdec98f40 dev:br-lan
[..]
ip6_finish_output2+0x196/0x4da

David further debugged this:
  [..] offending fragments were arriving here with skb_headroom(skb)==10.
  Which is reasonable, being the Solos ADSL card's header of 8 bytes
  followed by 2 bytes of PPP frame type.

The problem is that if netfilter ipv6 defragmentation is used, skb_cow()
in ip6_forward will only see reassembled skb.

Therefore, headroom is overestimated by 8 bytes (we pulled fragment
header) and we don't check the skbs in the frag_list either.

We can't do these checks in netfilter defrag since outdev isn't known yet.

Furthermore, existing tests in ip6_fragment did not consider the fragment
or ipv6 header size when checking headroom of the fraglist skbs.

While at it, also fix a skb leak on memory allocation -- ip6_fragment
must consume the skb.

I tested this e1000 driver hacked to not allocate additional headroom
(we end up in slowpath, since LL_RESERVED_SPACE is 16).

If 2 bytes of headroom are allocated, fastpath is taken (14 byte
ethernet header was pulled, so 16 byte headroom available in all
fragments).

Reported-by: David Woodhouse <dwmw2@infradead.org>
Diagnosed-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Tested-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 21:36:15 -07:00
David Woodhouse ce816eb064 solos-pci: Increase headroom on received packets
A comment in include/linux/skbuff.h says that:

 * Various parts of the networking layer expect at least 32 bytes of
 * headroom, you should not reduce this.

This was demonstrated by a panic when handling fragmented IPv6 packets:
http://marc.info/?l=linux-netdev&m=144236093519172&w=2

It's not entirely clear if that comment is still valid — and if it is,
perhaps netif_rx() ought to be enforcing it with a warning.

But either way, it is rather stupid from a performance point of view
for us to be receiving packets into a buffer which doesn't have enough
room to prepend an Ethernet header — it means that *every* incoming
packet is going to be need to be reallocated. So let's fix that.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 21:29:07 -07:00
Javier Martinez Canillas 88c796640e net: ks8851: Export OF module alias information
Drivers needs to export the OF id table and this be built into
the module or udev won't have the necessary information to autoload
the driver module when the device is registered via OF.

Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 21:24:02 -07:00
Eric Dumazet 4671fc6d47 net/mlx4_en: really allow to change RSS key
When changing rss key, we do not want to overwrite user provided key
by the one provided by netdev_rss_key_fill(), which is the host random
key generated at boot time.

Fixes: 947cbb0ac2 ("net/mlx4_en: Support for configurable RSS hash function")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Eyal Perry <eyalpe@mellanox.com>
CC: Amir Vadai <amirv@mellanox.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 21:03:59 -07:00
David Ahern 58189ca7b2 net: Fix vti use case with oif in dst lookups
Steffen reported that the recent change to add oif to dst lookups breaks
the VTI use case. The problem is that with the oif set in the flow struct
the comparison to the nh_oif is triggered. Fix by splitting the
FLOWI_FLAG_VRFSRC into 2 flags -- one that triggers the vrf device cache
bypass (FLOWI_FLAG_VRFSRC) and another telling the lookup to not compare
nh oif (FLOWI_FLAG_SKIP_NH_OIF).

Fixes: 42a7b32b73 ("xfrm: Add oif to dst lookups")

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 16:36:34 -07:00
Hariprasad Shenai d828755eae cxgb4: add device ID for few T5 adapters
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 16:24:51 -07:00
Phil Sutter 2e64126bb0 net: qdisc: enhance default_qdisc documentation
Aside from some lingual cleanup, point out which interfaces are not or
partly covered by this setting.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 16:09:22 -07:00
David Ahern 562d897d15 net: Add documentation for VRF device
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 16:06:43 -07:00
Joe Stringer cc5706056b openvswitch: Fix IPv6 exthdr handling with ct helpers.
Static code analysis reveals the following bug:

        net/openvswitch/conntrack.c:281 ovs_ct_helper()
        warn: unsigned 'protoff' is never less than zero.

This signedness bug breaks error handling for IPv6 extension headers when
using conntrack helpers. Fix the error by using a local signed variable.

Fixes:  cae3a2627520: "openvswitch: Allow attaching helpers to ct
action"
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 15:31:49 -07:00
Roopa Prabhu 37a1d3611c ipv6: include NLM_F_REPLACE in route replace notifications
This patch adds NLM_F_REPLACE flag to ipv6 route replace notifications.
This makes nlm_flags in ipv6 replace notifications consistent
with ipv4.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17 15:00:27 -07:00
Pablo Neira Ayuso ad5001cc7c netfilter: nf_log: wait for rcu grace after logger unregistration
The nf_log_unregister() function needs to call synchronize_rcu() to make sure
that the objects are not dereferenced anymore on module removal.

Fixes: 5962815a6a ("netfilter: nf_log: use an array of loggers instead of list")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-17 13:37:31 +02:00
Johan Hedberg d8949aad3e Bluetooth: Delay check for conn->smp in smp_conn_security()
There are several actions that smp_conn_security() might make that do
not require a valid SMP context (conn->smp pointer). One of these
actions is to encrypt the link with an existing LTK. If the SMP
context wasn't initialized properly we should still allow the
independent actions to be done, i.e. the check for the context should
only be done at the last possible moment.

Reported-by: Chuck Ebbert <cebbert.lkml@gmail.com>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Cc: stable@vger.kernel.org # 4.0+
2015-09-17 12:28:27 +02:00
Julia Lawall 20471ed4d4 dccp: drop null test before destroy functions
Remove unneeded NULL test.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression x;
@@

-if (x != NULL)
  \(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);

@@
expression x;
@@

-if (x != NULL) {
  \(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
  x = NULL;
-}
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 16:49:43 -07:00
Julia Lawall adf78edac0 net: core: drop null test before destroy functions
Remove unneeded NULL test.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@ expression x; @@
-if (x != NULL) {
  \(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
  x = NULL;
-}
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 16:49:43 -07:00
Julia Lawall 58d29e3ce9 atm: he: drop null test before destroy functions
Remove unneeded NULL test.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@ expression x; @@
-if (x != NULL)
  \(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 16:49:43 -07:00
Jesse Gross 982b527004 openvswitch: Fix mask generation for nested attributes.
Masks were added to OVS flows in a way that was backwards compatible
with userspace programs that did not generate masks. As a result, it is
possible that we may receive flows that do not have a mask and we need
to synthesize one.

Generating a mask requires iterating over attributes and descending into
nested attributes. For each level we need to know the size to generate the
correct mask. We do this with a linked table of attribute types.

Although the logic to handle these nested attributes was there in concept,
there are a number of bugs in practice. Examples include incomplete links
between tables, variable length attributes being treated as nested and
missing sanity checks.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 16:25:41 -07:00
Sjoerd Simons 892aa01df2 net: stmmac: Use msleep rather then udelay for reset delay
The reset delays used for stmmac are in the order of 10ms to 1 second,
which is far too long for udelay usage, so switch to using msleep.

Practically this fixes the PHY not being reliably detected in some cases
as udelay wouldn't actually delay for long enough to let the phy
reliably be reset.

Signed-off-by: Sjoerd Simons <sjoerd.simons@collabora.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 15:05:29 -07:00
Roopa Prabhu d64f69b037 rtnetlink: catch -EOPNOTSUPP errors from ndo_bridge_getlink
problem reported:
	kernel 4.1.3
	------------
	# bridge vlan
	port	vlan ids
	eth0	 1 PVID Egress Untagged
	 	90
	 	91
	 	92
	 	93
	 	94
	 	95
	 	96
	 	97
	 	98
	 	99
	 	100

	vmbr0	 1 PVID Egress Untagged
	 	94

	kernel 4.2
	-----------
	# bridge vlan
	port	vlan ids

ndo_bridge_getlink can return -EOPNOTSUPP when an interfaces
ndo_bridge_getlink op is set to switchdev_port_bridge_getlink
and CONFIG_SWITCHDEV is not defined. This today can happen to
bond, rocker and team devices. This patch adds -EOPNOTSUPP
checks after calls to ndo_bridge_getlink.

Fixes: 85fdb95672 ("switchdev: cut over to new switchdev_port_bridge_getlink")
Reported-by: Alexandre DERUMIER <aderumier@odiso.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 15:03:57 -07:00
Simon Guinot daf158d0d5 net: mvneta: fix DMA buffer unmapping in mvneta_rx()
This patch fixes a regression introduced by the commit a84e328941
("net: mvneta: fix refilling for Rx DMA buffers"). Due to this commit
the newly allocated Rx buffers are DMA-unmapped in place of those passed
to the networking stack. Obviously, this causes data corruptions.

This patch fixes the issue by ensuring that the right Rx buffers are
DMA-unmapped.

Reported-by: Oren Laskin <oren@igneous.io>
Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
Fixes: a84e328941 ("net: mvneta: fix refilling for Rx DMA buffers")
Cc: <stable@vger.kernel.org> # v3.8+
Tested-by: Oren Laskin <oren@igneous.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 14:54:50 -07:00
David S. Miller 244b7f4324 Merge branch 'ip6tunnel_dst'
Martin KaFai Lau says:

====================
ipv6: Fix dst_entry refcnt bugs in ip6_tunnel

v4:
- Fix a compilation error in patch 5 when CONFIG_LOCKDEP is turned on and
  re-test it

v3:
- Merge a 'if else if' test in patch 4
- Use rcu_dereference_protected in patch 5 to fix a sparse check when
  CONFIG_SPARSE_RCU_POINTER is enabled

v2:
- Add patch 4 and 5 to remove the spinlock

v1:
This patch series is to fix the dst refcnt bugs in ip6_tunnel.

Patch 1 and 2 are the prep works.  Patch 3 is the fix.

I can reproduce the bug by adding and removing the ip6gre tunnel
while running a super_netperf TCP_CRR test.  I get the following
trace by adding WARN_ON_ONCE(newrefcnt < 0) to dst_release():

[  312.760432] ------------[ cut here ]------------
[  312.774664] WARNING: CPU: 2 PID: 10263 at net/core/dst.c:288 dst_release+0xf3/0x100()
[  312.776041] Modules linked in: k10temp coretemp hwmon ip6_gre ip6_tunnel tunnel6 ipmi_devintf ipmi_ms\
ghandler ip6table_filter ip6_tables xt_NFLOG nfnetlink_log nfnetlink xt_comment xt_statistic iptable_fil\
ter ip_tables x_tables nfsv3 nfs_acl nfs fscache lockd grace mptctl netconsole autofs4 rpcsec_gss_krb5 a\
uth_rpcgss oid_registry sunrpc ipv6 dm_mod loop iTCO_wdt iTCO_vendor_support serio_raw rtc_cmos pcspkr i\
2c_i801 i2c_core lpc_ich mfd_core ehci_pci ehci_hcd e1000e mlx4_en ptp pps_core vxlan udp_tunnel ip6_udp\
_tunnel mlx4_core sg button ext3 jbd mpt2sas raid_class
[  312.785302] CPU: 2 PID: 10263 Comm: netperf Not tainted 4.2.0-rc8-00046-g4db9b63-dirty #15
[  312.791695] Hardware name: Quanta Freedom /Windmill-EP, BIOS F03_3B04 09/12/2013
[  312.792965]  ffffffff819dca2c ffff8811dfbdf6f8 ffffffff816537de ffff88123788fdb8
[  312.794263]  0000000000000000 ffff8811dfbdf738 ffffffff81052646 ffff8811dfbdf768
[  312.795593]  ffff881203a98180 00000000ffffffff ffff88242927a000 ffff88120a2532e0
[  312.796946] Call Trace:
[  312.797380]  [<ffffffff816537de>] dump_stack+0x45/0x57
[  312.798288]  [<ffffffff81052646>] warn_slowpath_common+0x86/0xc0
[  312.799699]  [<ffffffff8105273a>] warn_slowpath_null+0x1a/0x20
[  312.800852]  [<ffffffff8159f9b3>] dst_release+0xf3/0x100
[  312.801834]  [<ffffffffa03f1308>] ip6_tnl_dst_store+0x48/0x70 [ip6_tunnel]
[  312.803738]  [<ffffffffa03fd0b6>] ip6gre_xmit2+0x536/0x720 [ip6_gre]
[  312.804774]  [<ffffffffa03fd40a>] ip6gre_tunnel_xmit+0x16a/0x410 [ip6_gre]
[  312.805986]  [<ffffffff8159934b>] dev_hard_start_xmit+0x23b/0x390
[  312.808810]  [<ffffffff815a2f5f>] ? neigh_destroy+0xef/0x140
[  312.809843]  [<ffffffff81599a6c>] __dev_queue_xmit+0x48c/0x4f0
[  312.813931]  [<ffffffff81599ae3>] dev_queue_xmit_sk+0x13/0x20
[  312.814993]  [<ffffffff815a0832>] neigh_direct_output+0x12/0x20
[  312.817448]  [<ffffffffa021d633>] ip6_finish_output2+0x183/0x460 [ipv6]
[  312.818762]  [<ffffffff81306fc5>] ? find_next_bit+0x15/0x20
[  312.819671]  [<ffffffffa021fd79>] ip6_finish_output+0x89/0xe0 [ipv6]
[  312.820720]  [<ffffffffa021fe14>] ip6_output+0x44/0xe0 [ipv6]
[  312.821762]  [<ffffffff815c8809>] ? nf_hook_slow+0x69/0xc0
[  312.823123]  [<ffffffffa021d232>] ip6_xmit+0x242/0x4c0 [ipv6]
[  312.824073]  [<ffffffffa021c9f0>] ? ac6_proc_exit+0x20/0x20 [ipv6]
[  312.825116]  [<ffffffffa024c751>] inet6_csk_xmit+0x61/0xa0 [ipv6]
[  312.826127]  [<ffffffff815eb590>] tcp_transmit_skb+0x4f0/0x9b0
[  312.827441]  [<ffffffff815ed267>] tcp_connect+0x637/0x7a0
[  312.828327]  [<ffffffffa0245906>] tcp_v6_connect+0x2d6/0x550 [ipv6]
[  312.829581]  [<ffffffff81606f05>] __inet_stream_connect+0x95/0x2f0
[  312.830600]  [<ffffffff810ae13a>] ? hrtimer_try_to_cancel+0x1a/0xf0
[  312.833456]  [<ffffffff812fba19>] ? timerqueue_add+0x59/0xb0
[  312.834407]  [<ffffffff81607198>] inet_stream_connect+0x38/0x50
[  312.835886]  [<ffffffff8157cb17>] SYSC_connect+0xb7/0xf0
[  312.840035]  [<ffffffff810af6d3>] ? do_setitimer+0x1b3/0x200
[  312.840983]  [<ffffffff810af75a>] ? alarm_setitimer+0x3a/0x70
[  312.841941]  [<ffffffff8157d7ae>] SyS_connect+0xe/0x10
[  312.842818]  [<ffffffff81659297>] entry_SYSCALL_64_fastpath+0x12/0x6a
[  312.844206] ---[ end trace 43f3ecd86c3b1313 ]---
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 14:53:46 -07:00
Martin KaFai Lau 70da5b5c53 ipv6: Replace spinlock with seqlock and rcu in ip6_tunnel
This patch uses a seqlock to ensure consistency between idst->dst and
idst->cookie.  It also makes dst freeing from fib tree to undergo a
rcu grace period.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 14:53:05 -07:00
Martin KaFai Lau 8e3d5be736 ipv6: Avoid double dst_free
It is a prep work to get dst freeing from fib tree undergo
a rcu grace period.

The following is a common paradigm:
if (ip6_del_rt(rt))
	dst_free(rt)

which means, if rt cannot be deleted from the fib tree, dst_free(rt) now.
1. We don't know the ip6_del_rt(rt) failure is because it
   was not managed by fib tree (e.g. DST_NOCACHE) or it had already been
   removed from the fib tree.
2. If rt had been managed by the fib tree, ip6_del_rt(rt) failure means
   dst_free(rt) has been called already.  A second
   dst_free(rt) is not always obviously safe.  The rt may have
   been destroyed already.
3. If rt is a DST_NOCACHE, dst_free(rt) should not be called.
4. It is a stopper to make dst freeing from fib tree undergo a
   rcu grace period.

This patch is to use a DST_NOCACHE flag to indicate a rt is
not managed by the fib tree.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 14:53:05 -07:00
Martin KaFai Lau cdf3464e6c ipv6: Fix dst_entry refcnt bugs in ip6_tunnel
Problems in the current dst_entry cache in the ip6_tunnel:

1. ip6_tnl_dst_set is racy.  There is no lock to protect it:
   - One major problem is that the dst refcnt gets messed up. F.e.
     the same dst_cache can be released multiple times and then
     triggering the infamous dst refcnt < 0 warning message.
   - Another issue is the inconsistency between dst_cache and
     dst_cookie.

   It can be reproduced by adding and removing the ip6gre tunnel
   while running a super_netperf TCP_CRR test.

2. ip6_tnl_dst_get does not take the dst refcnt before returning
   the dst.

This patch:
1. Create a percpu dst_entry cache in ip6_tnl
2. Use a spinlock to protect the dst_cache operations
3. ip6_tnl_dst_get always takes the dst refcnt before returning

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 14:53:05 -07:00
Martin KaFai Lau f230d1e891 ipv6: Rename the dst_cache helper functions in ip6_tunnel
It is a prep work to fix the dst_entry refcnt bugs in
ip6_tunnel.

This patch rename:
1. ip6_tnl_dst_check() to ip6_tnl_dst_get() to better
   reflect that it will take a dst refcnt in the next patch.
2. ip6_tnl_dst_store() to ip6_tnl_dst_set() to have a more
   conventional name matching with ip6_tnl_dst_get().

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 14:53:04 -07:00
Martin KaFai Lau a3c119d392 ipv6: Refactor common ip6gre_tunnel_init codes
It is a prep work to fix the dst_entry refcnt bugs in ip6_tunnel.

This patch refactors some common init codes used by both
ip6gre_tunnel_init and ip6gre_tap_init.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15 14:53:04 -07:00
Pablo Neira Ayuso ba378ca9c0 netfilter: nft_compat: skip family comparison in case of NFPROTO_UNSPEC
Fix lookup of existing match/target structures in the corresponding list
by skipping the family check if NFPROTO_UNSPEC is used.

This is resulting in the allocation and insertion of one match/target
structure for each use of them. So this not only bloats memory
consumption but also severely affects the time to reload the ruleset
from the iptables-compat utility.

After this patch, iptables-compat-restore and iptables-compat take
almost the same time to reload large rulesets.

Fixes: 0ca743a559 ("netfilter: nf_tables: add compatibility layer for x_tables")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-14 18:10:57 +02:00
Florian Westphal 63cdbc06b3 netfilter: bridge: fix routing of bridge frames with call-iptables=1
We can't re-use the physoutdev storage area.

1.  When using NFQUEUE in PREROUTING, we attempt to bump a bogus
refcnt since nf_bridge->physoutdev is garbage (ipv4/ipv6 address)

2. for same reason, we crash in physdev match in FORWARD or later if
skb is routed instead of bridged.

This increases nf_bridge_info to 40 bytes, but we have no other choice.

Fixes: 72b1e5e4ca ("netfilter: bridge: reduce nf_bridge_info to 32 bytes again")
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-14 18:08:33 +02:00