Commit Graph

591236 Commits

Author SHA1 Message Date
Tom Herbert 8eb30be035 ipv6: Create ip6_tnl_xmit
This patch renames ip6_tnl_xmit2 to ip6_tnl_xmit and exports it. Other
users like GRE will be able to call this. The original ip6_tnl_xmit
function is renamed to ip6_tnl_start_xmit (this is an ndo_start_xmit
function).

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 19:23:31 -04:00
Tom Herbert 308edfdf15 gre6: Cleanup GREv6 receive path, call common GRE functions
- Create gre_rcv function. This calls gre_parse_header and ip6gre_rcv.
  - Call ip6_tnl_rcv. Doing this and using gre_parse_header eliminates
    most of the code in ip6gre_rcv.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 19:23:31 -04:00
Tom Herbert 95f5c64c3c gre: Move utility functions to common headers
Several of the GRE functions defined in net/ipv4/ip_gre.c are usable
for IPv6 GRE implementation (that is they are protocol agnostic).

These include:
  - GRE flag handling functions are move to gre.h
  - GRE build_header is moved to gre.h and renamed gre_build_header
  - parse_gre_header is moved to gre_demux.c and renamed gre_parse_header
  - iptunnel_pull_header is taken out of gre_parse_header. This is now
    done by caller. The header length is returned from gre_parse_header
    in an int* argument.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 19:23:31 -04:00
Tom Herbert 0d3c703a9d ipv6: Cleanup IPv6 tunnel receive path
Some basic changes to make IPv6 tunnel receive path look more like
IPv4 path:
  - Make ip6_tnl_rcv non-static so that GREv6 and others can call it
  - Make ip6_tnl_rcv look like ip_tunnel_rcv
  - Switch to gro_cells_receive
  - Make ip6_tnl_rcv non-static and export it

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 19:23:31 -04:00
David S. Miller 570d632008 Merge branch 'tcp-preempt'
Eric Dumazet says:

====================
net: make TCP preemptible

Most of TCP stack assumed it was running from BH handler.

This is great for most things, as TCP behavior is very sensitive
to scheduling artifacts.

However, the prequeue and backlog processing are problematic,
as they need to be flushed with BH being blocked.

To cope with modern needs, TCP sockets have big sk_rcvbuf values,
in the order of 16 MB, and soon 32 MB.
This means that backlog can hold thousands of packets, and things
like TCP coalescing or collapsing on this amount of packets can
lead to insane latency spikes, since BH are blocked for too long.

It is time to make UDP/TCP stacks preemptible.

Note that fast path still runs from BH handler.

v2: Added "tcp: make tcp_sendmsg() aware of socket backlog"
    to reduce latency problems of large sends.

v3: Fixed a typo in tcp_cdg.c
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 17:02:26 -04:00
Eric Dumazet d41a69f1d3 tcp: make tcp_sendmsg() aware of socket backlog
Large sendmsg()/write() hold socket lock for the duration of the call,
unless sk->sk_sndbuf limit is hit. This is bad because incoming packets
are parked into socket backlog for a long time.
Critical decisions like fast retransmit might be delayed.
Receivers have to maintain a big out of order queue with additional cpu
overhead, and also possible stalls in TX once windows are full.

Bidirectional flows are particularly hurt since the backlog can become
quite big if the copy from user space triggers IO (page faults)

Some applications learnt to use sendmsg() (or sendmmsg()) with small
chunks to avoid this issue.

Kernel should know better, right ?

Add a generic sk_flush_backlog() helper and use it right
before a new skb is allocated. Typically we put 64KB of payload
per skb (unless MSG_EOR is requested) and checking socket backlog
every 64KB gives good results.

As a matter of fact, tests with TSO/GSO disabled give very nice
results, as we manage to keep a small write queue and smaller
perceived rtt.

Note that sk_flush_backlog() maintains socket ownership,
so is not equivalent to a {release_sock(sk); lock_sock(sk);},
to ensure implicit atomicity rules that sendmsg() was
giving to (possibly buggy) applications.

In this simple implementation, I chose to not call tcp_release_cb(),
but we might consider this later.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 17:02:26 -04:00
Eric Dumazet 5413d1babe net: do not block BH while processing socket backlog
Socket backlog processing is a major latency source.

With current TCP socket sk_rcvbuf limits, I have sampled __release_sock()
holding cpu for more than 5 ms, and packets being dropped by the NIC
once ring buffer is filled.

All users are now ready to be called from process context,
we can unblock BH and let interrupts be serviced faster.

cond_resched_softirq() could be removed, as it has no more user.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 17:02:26 -04:00
Eric Dumazet 860fbbc343 sctp: prepare for socket backlog behavior change
sctp_inq_push() will soon be called without BH being blocked
when generic socket code flushes the socket backlog.

It is very possible SCTP can be converted to not rely on BH,
but this needs to be done by SCTP experts.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 17:02:26 -04:00
Eric Dumazet e61da9e259 udp: prepare for non BH masking at backlog processing
UDP uses the generic socket backlog code, and this will soon
be changed to not disable BH when protocol is called back.

We need to use appropriate SNMP accessors.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 17:02:25 -04:00
Eric Dumazet 7309f8821f dccp: do not assume DCCP code is non preemptible
DCCP uses the generic backlog code, and this will soon
be changed to not disable BH when protocol is called back.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 17:02:25 -04:00
Eric Dumazet fb3477c0f4 tcp: do not block bh during prequeue processing
AFAIK, nothing in current TCP stack absolutely wants BH
being disabled once socket is owned by a thread running in
process context.

As mentioned in my prior patch ("tcp: give prequeue mode some care"),
processing a batch of packets might take time, better not block BH
at all.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 17:02:25 -04:00
Eric Dumazet c10d9310ed tcp: do not assume TCP code is non preemptible
We want to to make TCP stack preemptible, as draining prequeue
and backlog queues can take lot of time.

Many SNMP updates were assuming that BH (and preemption) was disabled.

Need to convert some __NET_INC_STATS() calls to NET_INC_STATS()
and some __TCP_INC_STATS() to TCP_INC_STATS()

Before using this_cpu_ptr(net->ipv4.tcp_sk) in tcp_v4_send_reset()
and tcp_v4_send_ack(), we add an explicit preempt disabled section.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 17:02:25 -04:00
David S. Miller 5e59c83f23 Merge branch 'xgene-channel-number'
Iyappan Subramanian says:

====================
drivers: net: xgene: fix: Get channel number from device binding

This patch set adds 'channel' property to get ethernet to CPU channel number,
thus decoupling the Linux driver from static resource selection.

v2: Address review comments from v1
- removed irq reference from Linux driver
- added 'channel' property to get ethernet to CPU channel number

v1:
- Initial version
====================

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 16:47:55 -04:00
Iyappan Subramanian 6619ac5a44 dtb: xgene: Add channel property
Added 'channel' property, describing ethernet to CPU channel number.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 16:47:55 -04:00
Iyappan Subramanian 4cac949f59 Documentation: dtb: xgene: Add channel property
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 16:47:54 -04:00
Iyappan Subramanian 2a37daa634 drivers: net: xgene: Get channel number from device binding
This patch gets ethernet to CPU channel (prefetch buffer number) from
the newly added 'channel' property, thus decoupling Linux driver from
resource management.

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 16:47:54 -04:00
Linus Torvalds 689de1d6ca Minimal fix-up of bad hashing behavior of hash_64()
This is a fairly minimal fixup to the horribly bad behavior of hash_64()
with certain input patterns.

In particular, because the multiplicative value used for the 64-bit hash
was intentionally bit-sparse (so that the multiply could be done with
shifts and adds on architectures without hardware multipliers), some
bits did not get spread out very much.  In particular, certain fairly
common bit ranges in the input (roughly bits 12-20: commonly with the
most information in them when you hash things like byte offsets in files
or memory that have block factors that mean that the low bits are often
zero) would not necessarily show up much in the result.

There's a bigger patch-series brewing to fix up things more completely,
but this is the fairly minimal fix for the 64-bit hashing problem.  It
simply picks a much better constant multiplier, spreading the bits out a
lot better.

NOTE! For 32-bit architectures, the bad old hash_64() remains the same
for now, since 64-bit multiplies are expensive.  The bigger hashing
cleanup will replace the 32-bit case with something better.

The new constants were picked by George Spelvin who wrote that bigger
cleanup series.  I just picked out the constants and part of the comment
from that series.

Cc: stable@vger.kernel.org
Cc: George Spelvin <linux@horizon.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-02 13:01:51 -07:00
Linus Torvalds 98bcf28636 Merge tag 'md/4.6-rc6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md
Pull MD fixes from Shaohua Li:
 "This update includes several trival fixes.  The only important one is
  to fix MD bio merge, which has big performance impact"

* tag 'md/4.6-rc6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
  raid5: delete unnecessary warnning
  MD: make bio mergeable
  md/raid0: remove empty line printk from dump_zones
  md/raid0: fix uninitialized variable bug
2016-05-02 12:22:51 -07:00
Linus Torvalds 33656a1f2e Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull UDF fix from Jan Kara:
 "A fix of a regression in UDF that got introduced in 4.6-rc1 by one of
  the charset encoding fixes"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: Fix conversion of 'dstring' fields to UTF8
2016-05-02 09:59:57 -07:00
Linus Torvalds 5f40adbc3e Late GPIO fixes for the v4.6 series:
- A serious ACPI fix targeted for stable: lookup strings
   were being free:ed.
 - Revert two patches from the RCAR driver.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXJwiIAAoJEEEQszewGV1zaAUP/jJMysIIi2lmM1A7vxmA2ofp
 pkOaQQsvykVAAkgyIhtCGcUfaenDeWBdAB3DPIZGJgGK6Azsa+4LiqwHrJgKxXCE
 nkEoMZhFn6p4i+EURIE1g5k/8BVIdYllzz9n6k7grjV7YBgL5bN2aNvKC6WeOkMh
 1ATh4kBIgrRuZPXekb2xosEAgynJMO2fMLWrLNsbX2m7yTCfm/BSsm7/KyOBylPr
 0YTxvhJztd530/K1CEWwqKLZ1wFjMPldvgRnoDObo7PeOhZ1hQxMOGCmDR5tKFGM
 bkxA+r9XV/5suAjrPrMa+KS8Dz/jsMZ0dsgf1vPe1llOQKxlqJo4OYmjVj5jeOI9
 R/qOwBhjmwb9MFDcfLPXJgr79VdI+wg7jdv3UlusKR0MaVugwhS+RugYnw/P0GrG
 cuN3INWEwPYbBN3I81BMnCk6BGcRLyH+v9Nb7w99satZ0Cvee4k6qYrfla/iC2Yo
 ZSFTqlZeql971Ae6MNW8juVc4WWESGKavNsMVyQqnUOLE8Fh3dRbsiYIFzBdF4dV
 ASgMt9nCuicnjmX7U+Wi4XJVIWWWgBnkLfHdJxeqpoBvdjXhOy4f/sYUALR61gWK
 /PcTdzDvaiv1sDkue6mC2Lv8GL9gluxznew1zUndtyY+OCADXvgHgZpM7rBrIKCw
 UJlUF9lmCtOMmub32rVS
 =x3jf
 -----END PGP SIGNATURE-----

Merge tag 'gpio-v4.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
 "Here are some late but important fixes for the v4.6 kernel series.
  ACPI and RCAR, so two driver fixes (PM related) and a self-evident
  string lookup fix for ACPI GPIOs:

   - A serious ACPI fix targeted for stable: lookup strings were being
     free'd.

   - Revert two patches from the RCAR driver"

* tag 'gpio-v4.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpiolib-acpi: Duplicate con_id string when adding it to the crs lookup list
  Revert "gpio: rcar: Fine-grained Runtime PM support"
  Revert "gpio: rcar: Add Runtime PM handling for interrupts"
2016-05-02 09:54:22 -07:00
Linus Torvalds 9c5d1bc2b7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) MODULE_FIRMWARE firmware string not correct for iwlwifi 8000 chips,
    from Sara Sharon.

 2) Fix SKB size checks in batman-adv stack on receive, from Sven
    Eckelmann.

 3) Leak fix on mac80211 interface add error paths, from Johannes Berg.

 4) Cannot invoke napi_disable() with BH disabled in myri10ge driver,
    fix from Stanislaw Gruszka.

 5) Fix sign extension problem when computing feature masks in
    net_gso_ok(), from Marcelo Ricardo Leitner.

 6) lan78xx driver doesn't count packets and packet lengths in its
    statistics properly, fix from Woojung Huh.

 7) Fix the buffer allocation sizes in pegasus USB driver, from Petko
    Manolov.

 8) Fix refcount overflows in bpf, from Alexei Starovoitov.

 9) Unified dst cache handling introduced a preempt warning in
    ip_tunnel, fix by resetting rather then setting the cached route.
    From Paolo Abeni.

10) Listener hash collision test fix in soreuseport, from Craig Gallak

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits)
  gre: do not pull header in ICMP error processing
  net: Implement net_dbg_ratelimited() for CONFIG_DYNAMIC_DEBUG case
  tipc: only process unicast on intended node
  cxgb3: fix out of bounds read
  net/smscx5xx: use the device tree for mac address
  soreuseport: Fix TCP listener hash collision
  net: l2tp: fix reversed udp6 checksum flags
  ip_tunnel: fix preempt warning in ip tunnel creation/updating
  samples/bpf: fix trace_output example
  bpf: fix check_map_func_compatibility logic
  bpf: fix refcnt overflow
  drivers: net: cpsw: use of_phy_connect() in fixed-link case
  dt: cpsw: phy-handle, phy_id, and fixed-link are mutually exclusive
  drivers: net: cpsw: don't ignore phy-mode if phy-handle is used
  drivers: net: cpsw: fix segfault in case of bad phy-handle
  drivers: net: cpsw: fix parsing of phy-handle DT property in dual_emac config
  MAINTAINERS: net: Change maintainer for GRETH 10/100/1G Ethernet MAC device driver
  gre: reject GUE and FOU in collect metadata mode
  pegasus: fixes reported packet length
  pegasus: fixes URB buffer allocation size;
  ...
2016-05-02 09:40:42 -07:00
Linus Torvalds ba22906a9f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:

 1) Fix panics with SR-IOV, from Babu Moger.

 2) Wire up preadv2/pwritev2.

 3) Allow proper auto-loading of VIO devices, from John Paul Adrian
    Glaubitz.

 4) Recognize Sonoma cpus, from Khalid Aziz.

 5) Fix bootup regressions caused by syscall trace fixes made recently.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc64: Fix bootup regressions on some Kconfig combinations.
  sparc64: recognize and support Sonoma CPU type
  sparc: Implement and wire up vio_hotplug for vio.
  sparc: Implement and wire up modalias_show for vio.
  sparc/pci: Refactor dev_archdata initialization into pci_init_dev_archdata
  sparc/defconfigs: Remove CONFIG_IPV6_PRIVACY
  sparc: Write up preadv2/pwritev2 syscalls.
  sparc/PCI: Fix for panic while enabling SR-IOV
2016-05-02 09:32:50 -07:00
Jiri Benc b7f8fe251e gre: do not pull header in ICMP error processing
iptunnel_pull_header expects that IP header was already pulled; with this
expectation, it pulls the tunnel header. This is not true in gre_err.
Furthermore, ipv4_update_pmtu and ipv4_redirect expect that skb->data points
to the IP header.

We cannot pull the tunnel header in this path. It's just a matter of not
calling iptunnel_pull_header - we don't need any of its effects.

Fixes: bda7bb4634 ("gre: Allow multiple protocol listener for gre protocol.")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 00:19:58 -04:00
David S. Miller 582a1db988 Merge branch 'qed-selftests'
Sudarsana Reddy Kalluru says:

====================
qed/qede: ethtool selftests support.

This series adds the driver support for following selftests:
1. Register test
2. Memory test
3. Clock test
4. Interrupt test
5. Internal loopback test
Patch (1) adds the qed driver infrastructure for selftests. Patches (2) and
(3) add qede driver support for ethtool selftests.

Please consider applying this series to "net-next".
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 00:16:45 -04:00
Sudarsana Reddy Kalluru 16f46bf054 qede: add implementation for internal loopback test.
This patch adds the qede implementation for internal loopback test.

Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 00:16:39 -04:00
Sudarsana Reddy Kalluru 3044a02eeb qede: add support for selftests.
This patch adds the qede ethtool support for the following tests:
- interrupt test
- memory test
- register test
- clock test

Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 00:16:39 -04:00
Sudarsana Reddy Kalluru 03dc76ca1e qed: add infrastructure for device self tests.
This patch adds the functionality and APIs needed for selftests.
It adds the ability to configure the link-mode which is required for the
implementation of loopback tests. It adds the APIs for clock test,
register test, interrupt test and memory test.

Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 00:16:39 -04:00
Andrew Lunn 158bc065f2 net: dsa: mv88e6xxx: replace ds with ps where possible
The dsa_switch structure ds is actually needed in very few places,
mostly during setup of the switch. The private structure ps is however
needed nearly everywhere. Pass ps, not ds internally.

[vd: rebased Andrew's patch.]

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-02 00:16:23 -04:00
David S. Miller 8cd14ccbfd Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2016-05-01

This series contains updates to i40e and i40evf.

The theme of this series is code reduction, with several code cleanups in
this series.  Starting with Neerav's removal of the code that implemented
the HMC AQ APIs and calls, since they are now obsolete and not supported
by firmware.

Anjali changes the default of VFs to make sure they are not trusted or
privileged until its explicitly set for trust through the new NDO op
interface.  Also limited the number of MAC and VLAN addresses a VF can
add if it is untrusted/privileged.

Carolyn syncs the VF code for the changes made to the PF for the RSS
hash tuple settings, which ends up cleaning up much of the existing code.

Jesse cleans up compiler warnings which were found with gcc's W=2 option.
Then removed duplicate code, especially since only one copy was actually
being used.

Jacob addresses an issue which was found when testing GCC 6's which
happens to produce new warnings when you left shift a signed value
beyond the storage sizeof the type.  The converts i40e & i40evf to use
the BIT() macro more consistently.

Alex actually bucks the trend of code removal by adding support for
both drivers to use GSO_PARTIAL so that segmentation of frames with
checksums enabled in outer headers is supported.  Fortunately it does
not take much to add this support!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-01 23:38:49 -04:00
Tim Bingham 2c94b53738 net: Implement net_dbg_ratelimited() for CONFIG_DYNAMIC_DEBUG case
Prior to commit d92cff89a0 ("net_dbg_ratelimited: turn into no-op
when !DEBUG") the implementation of net_dbg_ratelimited() was buggy
for both the DEBUG and CONFIG_DYNAMIC_DEBUG cases.

The bug was that net_ratelimit() was being called and, despite
returning true, nothing was being printed to the console. This
resulted in messages like the following -

"net_ratelimit: %d callbacks suppressed"

with no other output nearby.

After commit d92cff89a0 ("net_dbg_ratelimited: turn into no-op when
!DEBUG") the bug is fixed for the DEBUG case. However, there's no
output at all for CONFIG_DYNAMIC_DEBUG case.

This patch restores debug output (if enabled) for the
CONFIG_DYNAMIC_DEBUG case.

Add a definition of net_dbg_ratelimited() for the CONFIG_DYNAMIC_DEBUG
case. The implementation takes care to check that dynamic debugging is
enabled before calling net_ratelimit().

Fixes: d92cff89a0 ("net_dbg_ratelimited: turn into no-op when !DEBUG")
Signed-off-by: Tim Bingham <tbingham@akamai.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-01 21:34:01 -04:00
Marcelo Ricardo Leitner 0970f5b366 sctp: signal sk_data_ready earlier on data chunks reception
Dave Miller pointed out that fb586f2530 ("sctp: delay calls to
sk_data_ready() as much as possible") may insert latency specially if
the receiving application is running on another CPU and that it would be
better if we signalled as early as possible.

This patch thus basically inverts the logic on fb586f2530 and signals
it as early as possible, similar to what we had before.

Fixes: fb586f2530 ("sctp: delay calls to sk_data_ready() as much as possible")
Reported-by: Dave Miller <davem@davemloft.net>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-01 21:06:10 -04:00
Hamish Martin efe790502b tipc: only process unicast on intended node
We have observed complete lock up of broadcast-link transmission due to
unacknowledged packets never being removed from the 'transmq' queue. This
is traced to nodes having their ack field set beyond the sequence number
of packets that have actually been transmitted to them.
Consider an example where node 1 has sent 10 packets to node 2 on a
link and node 3 has sent 20 packets to node 2 on another link. We
see examples of an ack from node 2 destined for node 3 being treated as
an ack from node 2 at node 1. This leads to the ack on the node 1 to node
2 link being increased to 20 even though we have only sent 10 packets.
When node 1 does get around to sending further packets, none of the
packets with sequence numbers less than 21 are actually removed from the
transmq.
To resolve this we reinstate some code lost in commit d999297c3d ("tipc:
reduce locking scope during packet reception") which ensures that only
messages destined for the receiving node are processed by that node. This
prevents the sequence numbers from getting out of sync and resolves the
packet leakage, thereby resolving the broadcast-link transmission
lock-ups we observed.

While we are aware that this change only patches over a root problem that
we still haven't identified, this is a sanity test that it is always
legitimate to do. It will remain in the code even after we identify and
fix the real problem.

Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: John Thompson <john.thompson@alliedtelesis.co.nz>
Signed-off-by: Hamish Martin <hamish.martin@alliedtelesis.co.nz>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-01 21:03:30 -04:00
Michal Schmidt 0b86a2a1e5 cxgb3: fix out of bounds read
An out of bounds read of 2 bytes was discovered in cxgb3 with KASAN.

t3_config_rss() expects both arrays it gets as parameters to have
terminators. setup_rss(), the caller, forgets to add a terminator to
one of the arrays. Thankfully the iteration in t3_config_rss() stops
anyway, but in the last iteration the check for the terminator
is an out of bounds read.

Add the missing terminator to rspq_map[].

Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-01 20:59:43 -04:00
Arnd Bergmann c489565b53 net/smscx5xx: use the device tree for mac address
This takes the MAC address for smsc75xx/smsc95xx USB network devices
from a the device tree. This is required to get a usable persistent
address on the popular beagleboard, whose hardware designers
accidentally forgot that an ethernet device really requires an a
MAC address to be functional.

The Raspberry Pi also ships smsc9514 without a serial EEPROM, stores
the MAC address in ROM accessible via VC4 firmware.

The smsc75xx and smsc95xx drivers are just two copies of the
same code, so better fix both.

[lkundrak@v3.sk: updated to use of_get_property() as per suggestion from
Arnd, reworded the message and comments a bit]

Tested-by: Lubomir Rintel <lkundrak@v3.sk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-01 20:57:45 -04:00
Marek Vasut 70e927b98b mdio_bus: Fix MDIO bus scanning in __mdiobus_register()
Since commit b74766a0a0 ("phylib: don't return NULL
from get_phy_device()") in linux-next, phy_get_device() will return
ERR_PTR(-ENODEV) instead of NULL if the PHY device ID is all ones.

This causes problem with stmmac driver and likely some other drivers
which call mdiobus_register(). I triggered this bug on SoCFPGA MCVEVK
board with linux-next 20160427 and 20160428. In case of the stmmac, if
there is no PHY node specified in the DT for the stmmac block, the stmmac
driver ( drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c function
stmmac_mdio_register() ) will call mdiobus_register() , which will
register the MDIO bus and probe for the PHY.

The mdiobus_register() resp. __mdiobus_register() iterates over all of
the addresses on the MDIO bus and calls mdiobus_scan() for each of them,
which invokes get_phy_device(). Before the aforementioned patch, the
mdiobus_scan() would return NULL if no PHY was found on a given address
and mdiobus_register() would continue and try the next PHY address. Now,
mdiobus_scan() returns ERR_PTR(-ENODEV), which is caught by the
'if (IS_ERR(phydev))' condition and the loop exits immediately if the
PHY address does not contain PHY.

Repair this by explicitly checking for the ERR_PTR(-ENODEV) and if this
error comes around, continue with the next PHY address.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dinh Nguyen <dinguyen@opensource.altera.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-01 20:50:08 -04:00
Alexander Duyck 1c7b4a23d1 i40e/i40evf: Add support for GSO partial with UDP_TUNNEL_CSUM and GRE_CSUM
This patch makes it so that i40e and i40evf can use GSO_PARTIAL to support
segmentation for frames with checksums enabled in outer headers.  As a
result we can now send data over these types of tunnels at over 20Gb/s
versus the 12Gb/s that was previously possible on my system.

The advantage with the i40e parts is that this offload is mostly
transparent as the hardware still deals with the inner and/or outer IPv4
headers so the IP ID is still incrementing for both when this offload is
performed.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-01 17:05:16 -07:00
Jacob Keller ae63bff0d7 i40evf: make use of BIT() macro to avoid signed left shift
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-01 17:05:08 -07:00
Jacob Keller 2101bac2d4 i40e: make use of BIT() macro to prevent left shift of signed values
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-01 17:05:08 -07:00
Jacob Keller dcb57456e7 i40e/i40evf: fix I40E_MASK signed shift overflow warnings
GCC 6 has a new warning which will display when you attempt to left
shift a signed value beyond the storage size of the type. I40E_MASK
generates a mask value for 32bit registers. Properly typecast the mask
value and place the values in parenthesis to prevent macro expansion
issues.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-01 17:05:08 -07:00
Harshitha Ramamurthy 5a6fc256e7 i40e/i40evf : Bump driver version from 1.5.5 to 1.5.10
Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-01 17:05:07 -07:00
Catherine Sullivan a3aa5036cf i40e: Update device ids for X722
Add a device ID for X722.

Change-Id: I574f2345ab341de98a6a1c212d0603af853e48b0
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-01 17:05:07 -07:00
Jesse Brandeburg de38fef610 i40e: Drop extra copy of function
i40e_release_rx_desc was in two files, but was only used
and needed in txrx.c.  Get rid of the extra copy.

Change-Id: I86e18239aa03531fc198b6c052847475084a9200
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-01 17:05:06 -07:00
Jesse Brandeburg a1b5a24fcc i40e: Use consistent type for vf_id
The driver was all over the place using signed or unsigned types
for vf_id, when it should always be signed.

This fixes warnings of type unsafe comparisons from gcc with W=2.

Change-Id: I2cb681f83d0f68ca124d2e4131e4ac0d9f8a6b22
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-01 17:05:06 -07:00
Jesse Brandeburg cdc3d93257 i40e: PTP - avoid aggregate return warnings
Aggregate return warnings are when struct types are returned
and must be copied to the lvalue with a struct copy by the compiler.

This fixes warnings of type aggregate-return from gcc with W=2.

Change-Id: I896b1bf514544bf0faeb458869d79914b9f1b168
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-01 17:05:06 -07:00
Catherine Sullivan 3ed439c56e i40e: Fix uninitialized variable
We have an uninitialized variable warning for valid_len for one case in
validate_vf_mesg. To fix this, just initialize it to 0 at the top of the
function and remove all of the now redundant assignments to 0 in the
individual cases.

Change-Id: Iacbd97f4c521ed8d662eef803a598d8707708cfd
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-01 17:05:05 -07:00
Carolyn Wyborny b29699b399 i40evf: RSS Hash Option parameters
This patch syncs the VF code for the changes made to the PF for the RSS
hash tuple settings.  Since the VF still cannot change the RSS hash
settings, change the code to make this clear to the user.  Previously,
the default settings were returned in this function. However, the
default can be changed by the PF so this does not make sense anymore.

Change-Id: I085eaf005fc7978b440d2a1bf2b2dd7cadaff39b
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-01 17:05:05 -07:00
Neerav Parikh 2b79c58d80 i40e: Remove HMC AQ API implementation
Remove the code that implements the HMC AQ APIs and call these APIs.
This is done because these are obsolete APIs and are not supported
by firmware.

Change-ID: I5d771d8f37c3e16e7b0a972ff9b27e75aa2d05d4
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-01 17:03:55 -07:00
Anjali Singhai Jain a856b5cb83 i40e: Prevent falling to promiscuous if the VF is not trusted
With this change a non trusted VF can never fall to promiscuous
mode when there is no room for a MAC/VLAN filter.

Change-Id: I8a155aa25c0bcdc6093414920c9ade4ee0bd20e8
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-01 17:03:37 -07:00
Anjali Singhai Jain 5f527ba962 i40e: Limit the number of MAC and VLAN addresses that can be added for VFs
If the VF is privileged/trusted it can do as it may please including
but not limited to hogging resources and playing unfair.
But if the VF is not privileged/trusted it still can add some number
(8) of MAC and VLAN addresses.
Other restrictions with respect to Port VLAN and normal VLAN still apply
to not privileged/trusted VF.

Change-Id: I3a9529201b184c8873e1ad2e300aff468c9e6296
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-05-01 17:03:03 -07:00
Jon Paul Maloy def22c47d7 tipc: set 'active' state correctly for first established link
When we are displaying statistics for the first link established between
two peers, it will always be presented as STANDBY although it in reality
is ACTIVE.

This happens because we forget to set the 'active' flag in the link
instance at the moment it is established. Although this is a bug, it only
has impact on the presentation view of the link, not on its actual
functionality.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-01 19:40:22 -04:00