Commit Graph

349616 Commits

Author SHA1 Message Date
Claudiu Manoil ee873fda3b gianfar: Pack struct gfar_priv_grp into three cachelines
* remove unused members(!): imask, ievent
* move space consuming interrupt name strings (int_name_* members) to
external structures, unessential for the driver's hot path
* keep high priority hot path data within the first 2 cache lines

This reduces struct gfar_priv_grp from 6 to 3 cache lines.
(Also fixed checkpatch warnings for the old code, in the process.)

Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29 15:22:02 -05:00
Claudiu Manoil 5fedcc14d4 gianfar: Cleanup gfar_parse_group() code
Factor out redundant code (improve readability, source code size).

Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29 15:22:02 -05:00
Claudiu Manoil 0cd3fdea07 gianfar: Optimize struct gfar_priv_tx_q for two cache lines
Resize and regroup structure members to eliminate memory holes and
to pack the structure into 2 cache lines (from 3).
tx_ring_size was resized from 4 to 2 bytes and few members were re-grouped
in order to eliminate byte holes and achieve compactness.
Where possible, few members were grouped according to their usage and access
order (i.e. start_xmit vs. clean_tx_ring members), less important members
were pushed at the end.

Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29 15:22:02 -05:00
Eric W. Biederman 243bb4c6d7 ipv6: Fix inet6_csk_bind_conflict so it builds with user namespaces enabled
When attempting to build linux-next with user namespaces enabled I ran
into this fun build error.

  CC      net/ipv6/inet6_connection_sock.o
.../net/ipv6/inet6_connection_sock.c: In function ‘inet6_csk_bind_conflict’:
.../net/ipv6/inet6_connection_sock.c:37:12: error: incompatible types when initializing type ‘int’ using
 type ‘kuid_t’
.../net/ipv6/inet6_connection_sock.c:54:30: error: incompatible type for argument 1 of ‘uid_eq’
.../include/linux/uidgid.h:48:20: note: expected ‘kuid_t’ but argument is of type ‘int’
make[3]: *** [net/ipv6/inet6_connection_sock.o] Error 1
make[2]: *** [net/ipv6] Error 2
make[2]: *** Waiting for unfinished jobs....

Using kuid_t instead of int to hold the uid fixes this.

Cc: Tom Herbert <therbert@google.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29 15:20:12 -05:00
Cong Wang 4e58a02750 pktgen: support net namespace
v3: make pktgen_threads list per-namespace
v2: remove a useless check

This patch add net namespace to pktgen, so that
we can use pktgen in different namespaces.

Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29 15:17:46 -05:00
Frank Li dc975382d2 net: fec: add napi support to improve proformance
Add napi support

Before this patch

 iperf -s -i 1
 ------------------------------------------------------------
 Server listening on TCP port 5001
 TCP window size: 85.3 KByte (default)
 ------------------------------------------------------------
 [  4] local 10.192.242.153 port 5001 connected with 10.192.242.138 port 50004
 [ ID] Interval       Transfer     Bandwidth
 [  4]  0.0- 1.0 sec  41.2 MBytes   345 Mbits/sec
 [  4]  1.0- 2.0 sec  43.7 MBytes   367 Mbits/sec
 [  4]  2.0- 3.0 sec  42.8 MBytes   359 Mbits/sec
 [  4]  3.0- 4.0 sec  43.7 MBytes   367 Mbits/sec
 [  4]  4.0- 5.0 sec  42.7 MBytes   359 Mbits/sec
 [  4]  5.0- 6.0 sec  43.8 MBytes   367 Mbits/sec
 [  4]  6.0- 7.0 sec  43.0 MBytes   361 Mbits/sec

After this patch
 [  4]  2.0- 3.0 sec  51.6 MBytes   433 Mbits/sec
 [  4]  3.0- 4.0 sec  51.8 MBytes   435 Mbits/sec
 [  4]  4.0- 5.0 sec  52.2 MBytes   438 Mbits/sec
 [  4]  5.0- 6.0 sec  52.1 MBytes   437 Mbits/sec
 [  4]  6.0- 7.0 sec  52.1 MBytes   437 Mbits/sec
 [  4]  7.0- 8.0 sec  52.3 MBytes   439 Mbits/sec

Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29 14:16:17 -05:00
Barry Grussling 72aa8e1b29 ethoc: Cleanup driver format
Cleanup the format of ethoc.c to meet network driver style as
per checkpatch.pl.

Signed-off-by: Barry Grussling <barry@grussling.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29 14:07:05 -05:00
David Ward 040468a0a7 ip_gre: When TOS is inherited, use configured TOS value for non-IP packets
A GRE tunnel can be configured so that outgoing tunnel packets inherit
the value of the TOS field from the inner IP header. In doing so, when
a non-IP packet is transmitted through the tunnel, the TOS field will
always be set to 0.

Instead, the user should be able to configure a different TOS value as
the fallback to use for non-IP packets. This is helpful when the non-IP
packets are all control packets and should be handled by routers outside
the tunnel as having Internet Control precedence. One example of this is
the NHRP packets that control a DMVPN-compatible mGRE tunnel; they are
encapsulated directly by GRE and do not contain an inner IP header.

Under the existing behavior, the IFLA_GRE_TOS parameter must be set to
'1' for the TOS value to be inherited. Now, only the least significant
bit of this parameter must be set to '1', and when a non-IP packet is
sent through the tunnel, the upper 6 bits of this same parameter will be
copied into the TOS field. (The ECN bits get masked off as before.)

This behavior is backwards-compatible with existing configurations and
iproute2 versions.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29 14:05:28 -05:00
Jiri Pirko 5c766d642b ipv4: introduce address lifetime
There are some usecase when lifetime of ipv4 addresses might be helpful.
For example:
1) initramfs networkmanager uses a DHCP daemon to learn network
configuration parameters
2) initramfs networkmanager addresses, routes and DNS configuration
3) initramfs networkmanager is requested to stop
4) initramfs networkmanager stops all daemons including dhclient
5) there are addresses and routes configured but no daemon running. If
the system doesn't start networkmanager for some reason, addresses and
routes will be used forever, which violates RFC 2131.

This patch is essentially a backport of ivp6 address lifetime mechanism
for ipv4 addresses.

Current "ip" tool supports this without any patch (since it does not
distinguish between ipv4 and ipv6 addresses in this perspective.

Also, this should be back-compatible with all current netlink users.

Reported-by: Pavel Šimerda <psimerda@redhat.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29 13:59:57 -05:00
David S. Miller 5a1dc31708 Merge branch 'ipfrags'
Jesper Dangaard Brouer says:

====================
This patchset is V2, with some trivial code fixes, which were noticed
by DaveM. It is still a partly respin of my fragmentation optimization
patches: http://thread.gmane.org/gmane.linux.network/250914

This is not the complete patchset, from the gmane link above. In this
patchset, I primarily focus on adjusting cacheline for better SMP/NUMA
performance.

Once this patchset have been agreed upon, I will continue and respin
the rest of my patches.

This time around, I have created a frag DoS generator, via the tool
trafgen (http://netsniff-ng.org/).  To create a stable DoS scenario
(no longer relying on frame dropping due to disabled flow-control).

Two 10G interfaces are under-test, and uses Ethernet flow-control.  A
third interface is used for generating the DoS attack (this interface
is also 10G, but it does not need to be, as 500Kpps DoS is enough).

Test types summary (netperf):
 Test-20G64K     == 2x10G with 65K fragments
 Test-20G3F      == 2x10G with 3x fragments (3*1472 bytes)
 Test-20G64K+DoS == Same as 20G64K with frag DoS
 Test-20G3F+DoS  == Same as 20G3F  with frag DoS

Patch list:
 Patch-01 - net: cacheline adjust struct netns_frags for better frag performance
 Patch-02 - net: cacheline adjust struct inet_frags for better frag performance
 Patch-03 - net: cacheline adjust struct inet_frag_queue
 Patch-04 - net: frag helper functions for mem limit tracking
 Patch-05 - net: use lib/percpu_counter API for fragmentation mem accounting
 Patch-06 - net: frag, move LRU list maintenance outside of rwlock

Performance table summary:

 Test-type:  Test-20G64K    Test-20G3F  20G64K+DoS   20G3F+DoS
 ----------  -----------    ----------  ----------   ---------
  net-next:  15114.5 Mbit/s   8954.21     2444.28     3918.01 Mbit/s
  Patch-01:  16075.8 Mbit/s   8976.18     2621.49     4072.79 Mbit/s
  Patch-02:  17806.9 Mbit/s   9280.32     2478.62     4274.59 Mbit/s
  Patch-03:  17317.4 Mbit/s   9308.62     2546.05     4336.59 Mbit/s
  Patch-04:  17635.9 Mbit/s   9256.16     2535.25     4327.63 Mbit/s
  Patch-05:  18027.0 Mbit/s   9918.99     2492.62     3621.68 Mbit/s
  Patch-06:  18486.7 Mbit/s  10723.20     3657.85     4560.64 Mbit/s

 I cannot explain the under-DoS regression that patch-05/percpu_counter
 introduces.  But patch-06/LRU-lock corrects the situation again.

Below is a testlab setup description, with links to the trafgen DoS
packet config used.

Testlab
=======

Server setup
------------
The machine acting as a server:
 - 2x CPU (E5-2630)
 - Thus a NUMA arch/machine
 - 4x 10Gbit/s ports
 - NICs 2x Intel Dual port 82599 based (driver ixgbe)

Setup:
 - Interfaces uses Ethernet flow control
 - Flush all iptables
 - Remove all iptables related module.
 - Kill irqbalance
 - Pin each 10G NIC port to a *single* CPU each

Pinning can easily be done by command hacks::

 for x in /proc/irq/*/eth8*/../smp_affinity_list ; do echo 1 > $x; done
 for x in /proc/irq/*/eth9*/../smp_affinity_list ; do echo 3 > $x; done
 for x in /proc/irq/*/eth31*/../smp_affinity_list; do echo 6 > $x; done
 for x in /proc/irq/*/eth32*/../smp_affinity_list; do echo 8 > $x; done

Notice NUMA setting: The CPU to NIC tying is carefully choosen
according to the NUMA node setup.  Thus, NICs connected to a PCI-e
slot that is connected to a physical CPU socket are tied together.

Choosing only a single CPU per NIC (port) is just to ease provoking
and debugging this performance issue. (In real setups, you can choose
more CPU, just remember the NUMA node in the equation).

Tools
-----

Netperf is used, with option -T to ensure CPU binding.
The netserver processes, are NAPI pinned::

 numactl -m0 -c0 netserver
 numactl -m1 -c 1 netserver -p 1337

I now have a frag DoS generator, created via the tool:
  trafgen (see: http://netsniff-ng.org/)

Trafgen packet config file:
 http://people.netfilter.org/hawk/frag_work/trafgen/frag_packet03_small_frag.txf

Notice, I'm using features of trafgen, recently developed by Daniel
Borkmann, thus you need the latest git tree to use my trafgen packet
config.

 git://github.com/borkmann/netsniff-ng.git

Command line:
 trafgen --dev eth51 --conf frag_packet03_small_frag.txf -V -k 100 --cpus 2

Tests types
-----------

Test(20G64K) UDP-64K 2x 10Gbit/s with no DoS traffic:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 export SIZE=$((65507)); export TIME=$((20)); export LOG=/tmp/netperf.log ;\
 netperf -p 1337 -H 192.168.31.2 -T7,7 -t UDP_STREAM -l $TIME -- -m $SIZE >> ${LOG}.31 &\
 netperf         -H 192.168.81.2 -T2,2 -t UDP_STREAM -l $TIME -- -m $SIZE >> ${LOG}.81 && \
 wait $! && tail -n3 ${LOG}.* && \
 tail -n3 ${LOG}.{31,81} | awk 'BEGIN{sum=0;} /212992        / {sum+=$4; print " +"$4} /==/ {print " file:"$2} END{print "sum:"sum" Mbit/s"}'

Test(20G3F) UDP-3xfrags 2x 10Gbit/s with no DoS traffic:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 export SIZE=$((3*1472)); export TIME=$((20)); export LOG=/tmp/netperf.log ;\
 netperf -p 1337 -H 192.168.31.2 -T7,7 -t UDP_STREAM -l $TIME -- -m $SIZE >> ${LOG}.31 &\
 netperf         -H 192.168.81.2 -T2,2 -t UDP_STREAM -l $TIME -- -m $SIZE >> ${LOG}.81 && \
 wait $! && tail -n3 ${LOG}.* && \
tail -n3 ${LOG}.{31,81} | awk 'BEGIN{sum=0;} /212992        / {sum+=$4; print " +"$4} /==/ {print " file:"$2} END{print "sum:"sum" Mbit/s"}'

Awk script for summming results:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

tail -n3 ${LOG}.{31,81} | awk 'BEGIN{sum=0;} /212992        / {sum+=$4; print " +"$4} /==/ {print " file:"$2} END{print "sum:"sum" Mbit/s"}'
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29 13:37:29 -05:00
Jesper Dangaard Brouer 3ef0eb0db4 net: frag, move LRU list maintenance outside of rwlock
Updating the fragmentation queues LRU (Least-Recently-Used) list,
required taking the hash writer lock.  However, the LRU list isn't
tied to the hash at all, so we can use a separate lock for it.

Original-idea-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29 13:36:24 -05:00
Jesper Dangaard Brouer 6d7b857d54 net: use lib/percpu_counter API for fragmentation mem accounting
Replace the per network namespace shared atomic "mem" accounting
variable, in the fragmentation code, with a lib/percpu_counter.

Getting percpu_counter to scale to the fragmentation code usage
requires some tweaks.

At first view, percpu_counter looks superfast, but it does not
scale on multi-CPU/NUMA machines, because the default batch size
is too small, for frag code usage.  Thus, I have adjusted the
batch size by using __percpu_counter_add() directly, instead of
percpu_counter_sub() and percpu_counter_add().

The batch size is increased to 130.000, based on the largest 64K
fragment memory usage.  This does introduce some imprecise
memory accounting, but its does not need to be strict for this
use-case.

It is also essential, that the percpu_counter, does not
share cacheline with other writers, to make this scale.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29 13:36:24 -05:00
Jesper Dangaard Brouer d433673e5f net: frag helper functions for mem limit tracking
This change is primarily a preparation to ease the extension of memory
limit tracking.

The change does reduce the number atomic operation, during freeing of
a frag queue.  This does introduce a some performance improvement, as
these atomic operations are at the core of the performance problems
seen on NUMA systems.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29 13:36:24 -05:00
Jesper Dangaard Brouer 6e34a8b37a net: cacheline adjust struct inet_frag_queue
Fragmentation code cacheline adjusting of struct inet_frag_queue.

Take advantage of the size of struct timer_list, and move all but
spinlock_t lock, below the timer struct.  On 64-bit 'lru_list',
'list' and 'refcnt', fits exactly into the next cacheline, and a
new cacheline starts at 'fragments'.

The netns_frags *net pointer is moved to the end of the struct,
because its used in a compare, with "next/close-by" elements of
which this struct is embedded into.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29 13:36:23 -05:00
Jesper Dangaard Brouer 5f8e1e8b76 net: cacheline adjust struct inet_frags for better frag performance
The globally shared rwlock, of struct inet_frags, shares
cacheline with the 'rnd' number, which is used by the hash
calculations.  Fix this, as this obviously is a bad idea, as
unnecessary cache-misses will occur when accessing the 'rnd'
number.

Also small note that, moving function ptr (*match) up in struct,
is to avoid it lands on the next cacheline (on 64-bit).

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29 13:36:23 -05:00
Jesper Dangaard Brouer cd39a7890a net: cacheline adjust struct netns_frags for better frag performance
This small cacheline adjustment of struct netns_frags improves
performance significantly for the fragmentation code.

Struct members 'lru_list' and 'mem' are both hot elements, and it
hurts performance, due to cacheline bouncing at every call point,
when they share a cacheline.  Also notice, how mem is placed
together with 'high_thresh' and 'low_thresh', as they are used in
the compare operations together.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29 13:36:23 -05:00
Felipe Balbi 656a05c899 net: ks8851: convert to threaded IRQ
just as it should have been. It also helps
removing the, now unnecessary, workqueue.

Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-29 13:32:00 -05:00
YOSHIFUJI Hideaki / 吉藤英明 08433eff2d net neigh: Optimize neighbor entry size calculation.
When allocating memory for neighbour cache entry, if
tbl->entry_size is not set, we always calculate
sizeof(struct neighbour) + tbl->key_len, which is common
in the same table.

With this change, set tbl->entry_size during the table
initialization phase, if it was not set, and use it in
neigh_alloc() and neighbour_priv().

This change also allow us to have both of protocol private
data and device priate data at tha same time.

Note that the only user of prototcol private is DECnet
and the only user of device private is ATM CLIP.
Since those are exclusive, we have not been facing issues
here.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28 23:17:51 -05:00
bingtian.ly@taobao.com cdda88912d net: avoid to hang up on sending due to sysctl configuration overflow.
I found if we write a larger than 4GB value to some sysctl
variables, the sending syscall will hang up forever, because these
variables are 32 bits, such large values make them overflow to 0 or
negative.

    This patch try to fix overflow or prevent from zero value setup
of below sysctl variables:

net.core.wmem_default
net.core.rmem_default

net.core.rmem_max
net.core.wmem_max

net.ipv4.udp_rmem_min
net.ipv4.udp_wmem_min

net.ipv4.tcp_wmem
net.ipv4.tcp_rmem

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Li Yu <raise.sail@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28 23:15:27 -05:00
Jamie Gloudon f7b5d1b9bd via-rhine: add 64bit statistics.
Switch to use ndo_get_stats64 to get 64bit statistics.

Signed-off-by: Jamie Gloudon <jamie.gloudon@gmail.com>
Tested-by: Jamie Gloudon <jamie.gloudon@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28 18:43:02 -05:00
David J. Choi 7ab59dc15e drivers/net/phy/micrel_phy: Add support for new PHYs
Summary of changes:
.Newly added phys
	-KSZ8081/KSZ8091, which has some phy ids.
	-KSZ8061
	-KSZ9031, which is Gigabit phy.
	-KSZ886X, which has a switch function.
	-KSZ8031, which has a same phy ids with KSZ8021.

Signed-off-by: David J. Choi <david.choi@micrel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28 18:42:10 -05:00
Giuseppe CAVALLARO ef3d90491a net: phy: realtek: add rtl8211e driver
This patch adds the minimal driver to manage the
Realtek RTL8211E 10/100/1000 Transceivers.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28 18:34:53 -05:00
Cong Wang 556e6256c8 netpoll: use the net namespace of current process instead of init_net
This will allow us to setup netconsole in a different namespace
rather than where init_net is.

Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28 18:32:55 -05:00
Cong Wang faeed828f9 netpoll: use ipv6_addr_equal() to compare ipv6 addr
ipv6_addr_equal() is faster.

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28 18:32:55 -05:00
Cong Wang 5fbee843c3 netpoll: add RCU annotation to npinfo field
dev->npinfo is protected by RCU.

This fixes the following sparse warnings:

net/core/netpoll.c:177:48: error: incompatible types in comparison expression (different address spaces)
net/core/netpoll.c:200:35: error: incompatible types in comparison expression (different address spaces)
net/core/netpoll.c:221:35: error: incompatible types in comparison expression (different address spaces)
net/core/netpoll.c:327:18: error: incompatible types in comparison expression (different address spaces)

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28 18:32:54 -05:00
David S. Miller ce4a600e47 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
John W. Linville says:

====================
Included is an NFC pull.  Samuel says:

"It brings the following goodies:

- LLCP socket timestamping (To be used e.g with the recently released nfctool
  application for a more efficient skb timestamping when sniffing).
- A pretty big pn533 rework from Waldemar, preparing the driver to support
  more flavours of pn533 based devices.
- HCI changes from Eric in preparation for the microread driver support.
- Some LLCP memory leak fixes, cleanups and slight improvements.
- pn544 and nfcwilink move to the devm_kzalloc API.
- An initial Secure Element (SE) API.
- An nfc.h license change from the original author, allowing non GPL
  application code to safely include it."

Also included are a pair of mac80211 pulls.  Johannes says:

"We found two bugs in the previous code, so I'm sending you a pull
request again this soon.

This contains two regulatory bug fixes, some of Thomas's hwsim beacon
timer work and a documentation fix from Bob."

"Another pull request for mac80211-next. This time, I have a number of
things, the patches are mostly self-explanatory. There are a few fixes
from Felix and myself, and random cleanups & improvements. The biggest
thing is the partial patchset from Marco preparing for mesh powersave."

Additionally, there are a pair of iwlwifi pulls.  Johannes says:

"For iwlwifi-next, I have a few cleanups/improvements as well as a few
not very important fixes and more preparations for new devices."

"Please pull a few updates for iwlwifi. These are just some cleanups and
a debug improvement."

On top of that, there is a slew of driver updates.  This includes
brcmfmac, mwifiex, ath9k, carl9170, and mwl8k as well as a handful
of others.  The bcma and ssb busses get some attention as well.
Still, I don't see any big headliners here.

Also included is a pull of the wireless tree, in order to resolve
some merge conflicts.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28 18:21:38 -05:00
David S. Miller 8a67b05db9 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
Jeff Kirsher says:

====================
This series contains updates to e1000e, ixgbevf, igb and igbvf.
Majority of the patches are code cleanups of e1000e where code
is removed (Yeah!).  The other two e1000e patches are fixes.  The
first is to fix the maximum frame size for 82579 devices.  The second
fix is to resolve an issue with devices other than 82579 that suffer
from dropped transactions on platforms with deep C-states when
jumbo frames are enabled.

The ixgbevf patch is to ensure that the driver fetches the correct,
refreshed value for link status and speed when the values have changed.

The igb and igbvf patches are a solution to an issue Stefan Assmann
reported, where when the PF is up and igbvf is loaded, the MAC address
is not generated using eth_hw_addr_random().
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28 18:18:17 -05:00
Oliver Hartkopp 2bf3440d7b can: rework skb reserved data handling
Added accessor and skb_reserve helpers for struct can_skb_priv.
Removed pointless skb_headroom() check.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
CC: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28 18:17:25 -05:00
John W. Linville 4205e6ef4e Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2013-01-28 14:43:00 -05:00
John W. Linville 9ebea3829f Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
Conflicts:
	drivers/net/wireless/ath/ath9k/main.c
	drivers/net/wireless/iwlwifi/dvm/tx.c
2013-01-28 13:54:03 -05:00
Mitch A Williams 8d56b6d507 igbvf: be sane about random MAC addresses
Tighten up some of the code surrounding MAC addresses. Since the PF is
now giving all zeros instead of a random address, check for this case
and generate a random address. This ensures that we always know when we
have a random address and udev won't get upset about it.

Additionally, tighten up some of the log messages and clean up the
formatting.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Tested-by: Stefan Assmann <sassmann@redhat.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-28 03:18:13 -08:00
Mitch A Williams 5ac6f91d39 igb: Don't give VFs random MAC addresses
If the user has not assigned a MAC address to a VM, then don't give it a
random one. Instead, just give it zeros and let it figure out what to do
with them.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Tested-by: Stefan Assmann <sassmann@redhat.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-28 03:17:57 -08:00
Greg Rose aa19c2957b ixgbevf: Make sure link status and speed are fetched
A recent change makes it necessary to set get_link_status to ensure that
the driver fetches the correct, refreshed value for link status and speed
when it has changed in the physical function device.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-28 00:19:52 -08:00
Bruce Allan 39149d4da6 e1000e: cleanup: remove comments which are no longer applicable
Code was removed but the applicable comments were not.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-28 00:12:03 -08:00
Bruce Allan a9bb629039 e1000e: cleanup hw.h
Remove unnecessary #include, forward prototype of struct e1000_adapter and
an empty comment; fix a comment which mentions "static data for the MAC"
which is not applicable to the following struct; and cleanup some
whitespace issues.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-27 23:50:50 -08:00
Bruce Allan 41c7d9c963 e1000e: cleanup: remove unused #define
All references to E1000_ERT_2048 have been removed.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-27 23:28:29 -08:00
Bruce Allan 3e35d9918c e1000e: adjust PM QoS request
It has been found that devices other than 82579 (a.k.a. e1000_pch2lan)
suffer from dropped transactions on platforms with deep C-states when
jumbo frames are enabled.  For example, LOMs on ICH9- and ICH10-based
platforms which recently had early-receive de-featured (for stability
reasons) suffer from this.  To resolve this for all devices, when jumbo
frames are enabled set the PM QoS DMA latency request based on the size
of the receive packet buffer less one full frame.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-27 23:27:41 -08:00
Bruce Allan c3d2dbf403 e1000e: correct maximum frame size on 82579
The largest jumbo frame supported by the 82579 hardware is 9018.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-27 22:42:58 -08:00
Bruce Allan 6b598e1eac e1000e: cleanup: remove e1000e_commit_phy()
Remove the function e1000e_commit_phy() and replace the few calls to it
with the same function pointer that it would call.  The function pointer is
almost always set for the devices that access these code paths so there is
no risk of a NULL pointer dereference; for the few instances where the
function pointer might not be set (i.e. can be called for the few devices
which do not have this function pointer set), check for a valid function
pointer.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-27 22:35:46 -08:00
Bruce Allan dde3a5745c e1000e: cleanup: remove e1000_get_cable_length()
Remove the function e1000_get_cable_length() and replace the two calls
to it with the same function pointer that it would call.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-27 22:28:55 -08:00
Bruce Allan 84c1befe34 e1000e: cleanup: remove e1000_get_phy_cfg_done()
Remove the function e1000_get_phy_cfg_done() and replace the single call
to it with the same function pointer that it would call.  The function
pointer is always set so there is no risk of a NULL pointer dereference.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-27 22:22:11 -08:00
Bruce Allan fe90849f76 e1000e: cleanup: rename e1000_get_cfg_done()
In keeping with the e1000e driver function naming convention, the subject
function is renamed to indicate it is generic, i.e. it is applicable to
more than just a single MAC family (e.g. 80003es2lan, 82571, ich8lan).

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-27 21:48:28 -08:00
Bruce Allan c2c6629ba3 e1000e: cleanup: remove e1000_force_speed_duplex()
Remove the function e1000_force_speed_duplex() and replace the single call
to it with the same function pointer that it would call.  The function
pointer is always set so there is no risk of a NULL pointer dereference.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-27 21:41:41 -08:00
Bruce Allan 7de89f058e e1000e: cleanup: remove e1000_set_d0_lplu_state()
Replace the function e1000_set_d0_lplu_state() with the contents of it
coded in place of the single call to the function.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-01-27 21:32:01 -08:00
Eric Dumazet cef401de7b net: fix possible wrong checksum generation
Pravin Shelar mentioned that GSO could potentially generate
wrong TX checksum if skb has fragments that are overwritten
by the user between the checksum computation and transmit.

He suggested to linearize skbs but this extra copy can be
avoided for normal tcp skbs cooked by tcp_sendmsg().

This patch introduces a new SKB_GSO_SHARED_FRAG flag, set
in skb_shinfo(skb)->gso_type if at least one frag can be
modified by the user.

Typical sources of such possible overwrites are {vm}splice(),
sendfile(), and macvtap/tun/virtio_net drivers.

Tested:

$ netperf -H 7.7.8.84
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
7.7.8.84 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    3959.52

$ netperf -H 7.7.8.84 -t TCP_SENDFILE
TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.8.84 ()
port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    3216.80

Performance of the SENDFILE is impacted by the extra allocation and
copy, and because we use order-0 pages, while the TCP_STREAM uses
bigger pages.

Reported-by: Pravin Shelar <pshelar@nicira.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28 00:27:15 -05:00
David S. Miller 61550022b9 Merge branch 'for-davem' of git://gitorious.org/linux-can/linux-can-next
Marc Kleine-Budde says:

====================
this is a pull-request for net-next/master. There is are 9 patches by
Fabio Baltieri and Kurt Van Dijck which add LED infrastructure and
support for CAN devices. Bernd Krumboeck adds a driver for the USB CAN
adapter from 8 devices. Oliver Hartkopp improves the CAN gateway
functionality. There are 4 patches by me, which clean up the CAN's
Kconfig.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28 00:19:34 -05:00
Cong Wang 0e36cbb344 net: add RCU annotation to sk_dst_cache field
sock->sk_dst_cache is protected by RCU.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28 00:15:28 -05:00
Cong Wang cec771d646 decnet: use correct RCU API to deref sk_dst_cache field
sock->sk_dst_cache is protected by RCU, therefore we should
use __sk_dst_get() to deref it once we lock the sock.

This fixes several sparse warnings.

Cc: linux-decnet-user@lists.sourceforge.net
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28 00:15:27 -05:00
Amir Vadai 78fb2de711 net/mlx4_en: Initialize RFS filters lock and list in init_netdev
filters_lock might have been used while it was re-initialized.
Moved filters_lock and filters_list initialization to init_netdev instead of
alloc_resources which is called every time the device is configured.

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28 00:14:28 -05:00
Amir Vadai 7225922558 net/mlx4_en: Fix a race when closing TX queue
There is a possible race where the TX completion handler can clean the
entire TX queue between the decision that the queue is full and actually
closing it. To avoid this situation, check again if the queue is really
full, if not, reopen the transmit and continue with sending the packet.

CC: Eric Dumazet <edumazet@google.com>

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-28 00:14:24 -05:00