Profiles show false sharing in addr_compare() because refcnt/dtime
changes dirty the first inet_peer cache line, where are lying the keys
used at lookup time. If many cpus are calling inet_getpeer() and
inet_putpeer(), or need frag ids, addr_compare() is in 2nd position in
"perf top".
Before patch, my udpflood bench (16 threads) on my 2x4x2 machine :
5784.00 9.7% csum_partial_copy_generic [kernel]
3356.00 5.6% addr_compare [kernel]
2638.00 4.4% fib_table_lookup [kernel]
2625.00 4.4% ip_fragment [kernel]
1934.00 3.2% neigh_lookup [kernel]
1617.00 2.7% udp_sendmsg [kernel]
1608.00 2.7% __ip_route_output_key [kernel]
1480.00 2.5% __ip_append_data [kernel]
1396.00 2.3% kfree [kernel]
1195.00 2.0% kmem_cache_free [kernel]
1157.00 1.9% inet_getpeer [kernel]
1121.00 1.9% neigh_resolve_output [kernel]
1012.00 1.7% dev_queue_xmit [kernel]
# time ./udpflood.sh
real 0m44.511s
user 0m20.020s
sys 11m22.780s
# time ./udpflood.sh
real 0m44.099s
user 0m20.140s
sys 11m15.870s
After patch, no more addr_compare() in profiles :
4171.00 10.7% csum_partial_copy_generic [kernel]
1787.00 4.6% fib_table_lookup [kernel]
1756.00 4.5% ip_fragment [kernel]
1234.00 3.2% udp_sendmsg [kernel]
1191.00 3.0% neigh_lookup [kernel]
1118.00 2.9% __ip_append_data [kernel]
1022.00 2.6% kfree [kernel]
993.00 2.5% __ip_route_output_key [kernel]
841.00 2.2% neigh_resolve_output [kernel]
816.00 2.1% kmem_cache_free [kernel]
658.00 1.7% ia32_sysenter_target [kernel]
632.00 1.6% kmem_cache_alloc_node [kernel]
# time ./udpflood.sh
real 0m41.587s
user 0m19.190s
sys 10m36.370s
# time ./udpflood.sh
real 0m41.486s
user 0m19.290s
sys 10m33.650s
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Driver was already keeping 64 bit counters, just not using the new interface.
Ps: IMHO drivers should not be duplicating network device
stats into ethtool stats. It is useless duplication.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The device driver already uses 64 bit statistics, it just
doesn't use the 64 bit interface.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change to 64 bit statistics interface, driver was already maintaining 64 bit
value.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Not much change, device was already keeping per cpu statistics.
Use recent 64 statistics interface.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert vmxnet3 driver to 64 bit statistics interface.
This driver was already counting packet per queue in a 64 bit value so not
a huge change. Eliminate unused old net_device_stats structure.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Scott J. Goldman <scottjg@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andi Kleen and Tim Chen reported huge contention on inetpeer
unused_peers.lock, on memcached workload on a 40 core machine, with
disabled route cache.
It appears we constantly flip peers refcnt between 0 and 1 values, and
we must insert/remove peers from unused_peers.list, holding a contended
spinlock.
Remove this list completely and perform a garbage collection on-the-fly,
at lookup time, using the expired nodes we met during the tree
traversal.
This removes a lot of code, makes locking more standard, and obsoletes
two sysctls (inet_peer_gc_mintime and inet_peer_gc_maxtime). This also
removes two pointers in inet_peer structure.
There is still a false sharing effect because refcnt is in first cache
line of object [were the links and keys used by lookups are located], we
might move it at the end of inet_peer structure to let this first cache
line mostly read by cpus.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Andi Kleen <andi@firstfloor.org>
CC: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch lowers the default initRTO from 3secs to 1sec per
RFC2988bis. It falls back to 3secs if the SYN or SYN-ACK packet
has been retransmitted, AND the TCP timestamp option is not on.
It also adds support to take RTT sample during 3WHS on the passive
open side, just like its active open counterpart, and uses it, if
valid, to seed the initRTO for the data transmission phase.
The patch also resets ssthresh to its initial default at the
beginning of the data transmission phase, and reduces cwnd to 1 if
there has been MORE THAN ONE retransmission during 3WHS per RFC5681.
Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use same logic as SIT tunnel to handle link local address
for GRE tunnel. OSPFv3 requires link-local address to function.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This driver keeps stats in net_device stats therefore it
does not need to define it's own get_stats hook.
Also, use standard format for net_device_ops (without &).
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change is meant to remove all support for displaying an ntuple as
strings via ETHTOOL_GRXNTUPLE. The reason for this change is due to the
fact that multiple issues have been found including:
- Multiple buffer overruns for strings being displayed.
- Incorrect filters displayed, cleared filters with ring of -2 are displayed
- Setting get_rx_ntuple displays no rules if defined.
- Endianess wrong on displayed values.
- Hard limit of 1024 filters makes display functionality extremely limited
The only driver that had supported this interface was ixgbe. Since it no
longer uses the interface and due to the issues mentioned above I am
submitting this patch to remove it.
v2:
Updated based on comments from Ben Hutchings
- Left ETH_SS_NTUPLE_FILTERS in code but commented on it being deprecated
- Removed ethtool_rx_ntuple_list and ethtool_rx_ntuple_flow_spec_container
- Left ETHTOOL_GRXNTUPLE but commented it as deprecated
Also cleaned up set_rx_ntuple since there is no flow spec container to
maintain we can drop all the code for the alloc and free of it and just
return ops->set_rx_ntuple().
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes these errors after the removal of interrupt.h from netdevice.h:
drivers/net/ll_temac_main.c: In function 'temac_open':
drivers/net/ll_temac_main.c:859:2: error: implicit declaration of function 'request_irq'
drivers/net/ll_temac_main.c:870:2: error: implicit declaration of function 'free_irq'
drivers/net/ll_temac_main.c: In function 'temac_poll_controller':
drivers/net/ll_temac_main.c:903:2: error: implicit declaration of function 'disable_irq'
drivers/net/ll_temac_main.c:909:2: error: implicit declaration of function 'enable_irq'
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padding per MSDU will affect the length of next packet and hence
the exact length of next packet is uncertain here.
Also, aggregation of transmission buffer, while downloading the
data to the card, wont gain much on the AMSDU packets as the AMSDU
packets utilizes the transmission buffer space to the maximum
(adapter->tx_buf_size).
Signed-off-by: Yogesh Ashok Powar <yogeshp@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
All MSDUs, except the last one in an AMSDU, should end up at 4
bytes boundary. There is need to check if enough skb_tailroom
space exists before padding the skb.
Also re-arranging code for better readablity.
Signed-off-by: Yogesh Ashok Powar <yogeshp@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The check of skb list empty before calling skb_peek and skb_dequeue is
redundant. These functions returns NULL if the list is empty.
Signed-off-by: Yogesh Ashok Powar <yogeshp@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Instead of counting the number of packets in txq
for particular RA list before AMSDU creation,
maintain a counter which will keep track of the
same.
This will reduce some MIPS while generating AMSDU
traffic as we only have to check the counter instead
of traversing through skb list.
Signed-off-by: Yogesh Ashok Powar <yogeshp@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
the code always returns ret regardless, so if(ret) check is unecessary.
Signed-off-by: Greg Dietsche <Gregory.Dietsche@cuw.edu>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Use less indentions and remove uneeded irq-save flags.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Adding new event that close RX BA session in case of periodic BT activity
limiting WLAN activity.
Signed-off-by: Shahar Levi <shahar_levi@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Some devices support BT/WLAN co-existence algorigthms.
In order not to harm the system performance and user experience, the device
requests not to allow any RX BA session and tear down existing RX BA sessions
based on system constraints such as periodic BT activity that needs to limit
WLAN activity (eg.SCO or A2DP).
In such cases, the intention is to limit the duration of the RX PPDU and
therefore prevent the peer device to use A-MPDU aggregation.
Adding ieee80211_stop_rx_ba_session() callback
that can be used by the driver to stop existing BA sessions.
Signed-off-by: Shahar Levi <shahar_levi@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
* remove interrupt.g inclusion from netdevice.h -- not needed
* fixup fallout, add interrupt.h and hardirq.h back where needed.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixed Rx pause counter for Lancer. Swapping hi and lo words.
Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com>
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
the code always returns 'status' regardless, so if(status) check is unecessary.
Signed-off-by: Greg Dietsche <Gregory.Dietsche@cuw.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
enic driver gets MTU change notifications for MTU changes in the
port profile associated to a dynamic vnic. This patch adds support
in enic driver to set new MTU on the dynamic vnic and dynamically
adjust its buffers with new MTU size in response to such notifications.
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
Signed-off-by: Christian Benvenuti <benve@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stacking VLANs on top of the macvlan device does not
work if the lowerdev device is using vlan filters set
by NETIF_F_HW_VLAN_FILTER. Add ndo ops to pass vlan
calls to lowerdev.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Outside of net/sctp/ipv6.c, IPV6 specific code needs to
be ifdef guarded.
This fixes build failures with IPV6 disabled.
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Setting tx power can be deferred during scan or changing channel.
If after that correct tx power settings will not be sent to device,
we can observe transmission problems and timeouts. Force to send
tx power settings also after partial rxon change, to assure device
always be configured with up-to-date settings.
Resolves:
https://bugzilla.kernel.org/show_bug.cgi?id=36492
Cc: stable@kernel.org # 2.6.39+
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Avoid queue and run autowakeup_work when device is not present anymore.
That prevent rmmod and device remove crash introduced by:
commit 1c0bcf89d8
Author: Ivo van Doorn <ivdoorn@gmail.com>
Date: Sat Apr 30 17:18:18 2011 +0200
rt2x00: Add autowake support for USB hardware
Signed-off-by: Stanislaw Gruszka <stf_xl@wp.pl>
Acked-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch fixes 802.11n stability and performance regression we have
since 2.6.35. It boost performance on my 5GHz N-only network from about
5MB/s to 8MB/s. Similar percentage boost can be observed on 2.4 GHz.
These are test results of 5x downloading of approximately 700MB iso
image:
vanilla: 5.27 5.22 4.94 4.47 5.31 ; avr 5.0420 std 0.35110
patched: 8.07 7.95 8.06 7.99 7.96 ; avr 8.0060 std 0.055946
This was achieved with NetworkManager configured to do not perform
periodical scans, by configuring constant BSSID. With periodical scans,
after some time, performance downgrade to unpatched driver level, like
in example below:
patched: 7.40 7.61 4.28 4.37 4.80 avr 5.6920 std 1.6683
However patch still make better here, since similar test on unpatched
driver make link disconnects with below messages after some time:
wlan1: authenticate with 00:23:69:35:d1:3f (try 1)
wlan1: authenticate with 00:23:69:35:d1:3f (try 2)
wlan1: authenticate with 00:23:69:35:d1:3f (try 3)
wlan1: authentication with 00:23:69:35:d1:3f timed out
On 2.6.35 kernel patch helps against connection hangs with messages:
iwlagn 0000:20:00.0: queue 10 stuck 3 time. Fw reload.
iwlagn 0000:20:00.0: On demand firmware reload
iwlagn 0000:20:00.0: Stopping AGG while state not ON or starting
Cc: stable@kernel.org # 2.6.35+
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The AT91SAM9X5 SOCs have a similar CAN core, but they only have 8 compared
to 16 mailboxes on the AT91SAM9263 SOC. Another difference is that the bits
defining the state of the CAN core are cleared on read, thus the driver
has to derive the state by looking at the error counters.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch prepares the driver for the at91sam9X5 processors,
which don't have the mb0 bug.
(See commit 3a5655a5b5 for more details.)
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This is the second of two patches converting the at91_can driver from a
compile time mailbox setup to a dynamic one.
This patch first adds a id_table to the platform driver. Depending on the
driver_data the constants for the mailbox setup is selected. Then all
remaining prime mailbox constants are converted to functions, using the
run time selected mailbox constants.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This is the first of two patches converting the at91_can driver from a
compile time mailbox setup to a dynamic one.
This patch converts all derived mailbox constants to functions.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
...instead of deriving it from AT91_MB_RX_FIRST and AT91_MB_RX_NUM.
This removes a level of computation, when switching the driver from
compile time constants to runtime values.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Semicolons are not necessary after switch/while/for/if braces
so remove them.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Semicolons are not necessary after switch/while/for/if braces
so remove them.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Semicolons are not necessary after switch/while/for/if braces
so remove them.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This saves a network device lookup on each packet transmitted,
for sockets that are bound to a network device.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>