Himanshu Madhani says:
====================
qlcnic: Bug fixes.
This series contains bug fixes for mailbox handling and multi Tx queue support
for all supported adapters.
changes from v1 -> v2
o updated patch to fix usage of netif_tx_{wake,stop} api during link change
as per David Miller's suggestion.
o Dropped patch to use spinklock per tx queue for more work.
o Added reworked patch for memory allocation failures.
o Added patch to allow capturing of dump, when auto recovery is disabled in firmware.
o Added patches for mailbox interrupt handling and debugging data for mailbox failure.
Please apply to net.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o Do not enable mailbox polling in case of legacy interrupt.
Process mailbox AEN/response from the interrupt.
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o Allow driver to collect firmware dump, during a forced firmware dump
operation, when auto firmware recovery is disabled. Also, during this
operation, driver should not allow reset recovery to be performed.
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o Use vzalloc() instead of kzalloc() for allocation of
bootloader size memory. kzalloc() may fail to allocate
the size of bootloader
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o Current code was not allowing the user to configure more
than one Tx ring using ethtool for 83xx/84xx adapter.
This regression was introduced by commit id
18afc102fd ("qlcnic: Enable
multiple Tx queue support for 83xx/84xx Series adapter.")
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o TSS/RSS ring validation does not take into account that either
of these ring values can be 0. This patch fixes this validation
and would fail set_channel operation if any of these ring value
is 0. This regression was added as part of commit id
34e8c406fd ("qlcnic: refactor Tx/SDS
ring calculation and validation in driver.")
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o Driver should re-allocate all Tx queues after completing
diagnostic tests. This regression was added by commit id
c2c5e3a068 ("qlcnic: Enable
diagnostic test for multiple Tx queues.")
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o Driver was using netif_tx_{stop,wake}_all_queues() api
during link change event. Remove these api calls to
manage queue start/stop event, as core networking stack
will manage this based on netif_carrier_{on,off} call.
These API's were modified as part of commit id
012ec81223 ("qlcnic: Multi Tx
queue support for 82xx Series adapter.")
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix to return -EPROTO error if fragments detected in checksum_setup_ip().
Fixes: 1431fb31ec ('xen-netback: fix fragment detection in checksum setup')
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The neighbour code sends up an RTM_NEWNEIGH netlink notification if
the NUD state of a neighbour cache entry is changed by a timer (e.g.
from REACHABLE to STALE), even if the lladdr of the entry has not
changed.
But an administrative change to the the NUD state of a neighbour cache
entry that does not change the lladdr (e.g. via "ip -4 neigh change
... nud ...") does not trigger a netlink notification. This means
that netlink listeners will not hear about administrative NUD state
changes such as from a resolved state to PERMANENT.
This patch changes the neighbor code to generate an RTM_NEWNEIGH
message when the NUD state of an entry is changed administratively.
Signed-off-by: Bob Gilligan <gilligan@aristanetworks.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Scott Feldman says:
====================
bonding: add some more netlink attributes
The following series implements five more bonding netlink attributes:
primary
primary_reselect
fail_over_mac
xmit_hash_policy
resend_igmp
Tested with modified iproute2 to verify attributes can be set at bond creation
time or set later. Verified sysfs interface to attributes continues to work.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add IFLA_BOND_RESEND_IGMP to allow get/set of bonding parameter
resend_igmp via netlink.
Signed-off-by: Scott Feldman <sfeldma@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add IFLA_BOND_XMIT_HASH_POLICY to allow get/set of bonding parameter
xmit_hash_policy via netlink.
Signed-off-by: Scott Feldman <sfeldma@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add IFLA_BOND_FAIL_OVER_MAC to allow get/set of bonding parameter
fail_over_mac via netlink.
Signed-off-by: Scott Feldman <sfeldma@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add IFLA_BOND_PRIMARY_SELECT to allow get/set of bonding parameter
primary_select via netlink.
Signed-off-by: Scott Feldman <sfeldma@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add IFLA_BOND_PRIMARY to allow get/set of bonding parameter
primary via netlink.
Signed-off-by: Scott Feldman <sfeldma@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a build failure that appeared in v3.13-rc4 due to an
RTC/MFD update merged via -mm.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.15 (GNU/Linux)
iQIcBAABAgAGBQJSr0jGAAoJELSic+t+oim9bWsP/26SiYoI/ulG76GyJQuPVuWj
21fRM0TEZaSVxZf4lzZahdak9kDg5wDFX15ii9PnIPJ8m5Nz5lHKLaKMXUEY8saA
9vUZBsZNVbyQ9i7uH1/3mPC1fA+3XqwXKPrp4roIv5kFYCRFD+NDyBoCwl/FNvTE
u72A9nmojJv6C0UuExFtyXRiJnAn2goXYQ0JvlFKrezv7yStNGNXakIH1za1gEOq
hdMIkJ9r5/jZFIGKr7SHZjsGYqwngKfVfaSdBeWZqp6Z0FHf7MRI6aJ/TcwpkLFH
HNezPznw1/CzAHmHAyszbcIMIh/GHxJzDldUr4g+kwForjJGaYIcrimxFPtcmW9P
MhJ5DdO765aqp24cImo1aNJXMnJYTFDguMWSCWLwxNSmlpQYg2iEiKF2astcCb+p
3v4cj9EYnEiPdKwSmA/kCb7fiEnefHm8vLMgRZwXIXiDfDSub4JPya+L4SxHFMAZ
OxZyk/ZGPNkDAsyKUx3KQMw4Gmvbro/TJMLn73CgdrW++39/CbSzl5hM8oRMpeMn
LTvBQOViAODlODQDI9B1gOqd4c/9zE1wGgY64rE7MtzlhlcPNhXYjL/Cb7+JNZ49
Mp6nxt/yH7rlFMdpIC0GgsgBP4ilh3GeLDZDi+AhnbD1773Yn/mCA4cohfpUug6j
Yme3Jd3/zWHrUkUPjkfO
=t/v5
-----END PGP SIGNATURE-----
Merge tag 's2mps11-build' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator/clk fix from Mark Brown:
"Fix s2mps11 build
This patch fixes a build failure that appeared in v3.13-rc4 due to an
RTC/MFD update merged via -mm"
* tag 's2mps11-build' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
mfd: s2mps11: Fix build after regmap field rename in sec-core.c
Pull scheduler fixes from Ingo Molnar:
"Three fixes for scheduler crashes, each triggers in relatively rare,
hardware environment dependent situations"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Rework sched_fair time accounting
math64: Add mul_u64_u32_shr()
sched: Remove PREEMPT_NEED_RESCHED from generic code
sched: Initialize power_orig for overlapping groups
This patch brings NUMA support and automatic fallback to vmalloc()
in case kmalloc() failed to allocate FQ hash table.
NUMA support depends on XPS being setup for the device before
qdisc allocation. After a XPS change, it might be worth creating
qdisc hierarchy again.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Compile tested only.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Rasesh Mody <rmody@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While investigating performance problems on small RPC workloads,
I noticed linux TCP stack was always splitting the last TSO skb
into two parts (skbs). One being a multiple of MSS, and a small one
with the Push flag. This split is done even if TCP_NODELAY is set,
or if no small packet is in flight.
Example with request/response of 4K/4K
IP A > B: . ack 68432 win 2783 <nop,nop,timestamp 6524593 6525001>
IP A > B: . 65537:68433(2896) ack 69632 win 2783 <nop,nop,timestamp 6524593 6525001>
IP A > B: P 68433:69633(1200) ack 69632 win 2783 <nop,nop,timestamp 6524593 6525001>
IP B > A: . ack 68433 win 2768 <nop,nop,timestamp 6525001 6524593>
IP B > A: . 69632:72528(2896) ack 69633 win 2768 <nop,nop,timestamp 6525001 6524593>
IP B > A: P 72528:73728(1200) ack 69633 win 2768 <nop,nop,timestamp 6525001 6524593>
IP A > B: . ack 72528 win 2783 <nop,nop,timestamp 6524593 6525001>
IP A > B: . 69633:72529(2896) ack 73728 win 2783 <nop,nop,timestamp 6524593 6525001>
IP A > B: P 72529:73729(1200) ack 73728 win 2783 <nop,nop,timestamp 6524593 6525001>
We can avoid this split by including the Nagle tests at the right place.
Note : If some NIC had trouble sending TSO packets with a partial
last segment, we would have hit the problem in GRO/forwarding workload already.
tcp_minshall_update() is moved to tcp_output.c and is updated as we might
feed a TSO packet with a partial last segment.
This patch tremendously improves performance, as the traffic now looks
like :
IP A > B: . ack 98304 win 2783 <nop,nop,timestamp 6834277 6834685>
IP A > B: P 94209:98305(4096) ack 98304 win 2783 <nop,nop,timestamp 6834277 6834685>
IP B > A: . ack 98305 win 2768 <nop,nop,timestamp 6834686 6834277>
IP B > A: P 98304:102400(4096) ack 98305 win 2768 <nop,nop,timestamp 6834686 6834277>
IP A > B: . ack 102400 win 2783 <nop,nop,timestamp 6834279 6834686>
IP A > B: P 98305:102401(4096) ack 102400 win 2783 <nop,nop,timestamp 6834279 6834686>
IP B > A: . ack 102401 win 2768 <nop,nop,timestamp 6834687 6834279>
IP B > A: P 102400:106496(4096) ack 102401 win 2768 <nop,nop,timestamp 6834687 6834279>
IP A > B: . ack 106496 win 2783 <nop,nop,timestamp 6834280 6834687>
IP A > B: P 102401:106497(4096) ack 106496 win 2783 <nop,nop,timestamp 6834280 6834687>
IP B > A: . ack 106497 win 2768 <nop,nop,timestamp 6834688 6834280>
IP B > A: P 106496:110592(4096) ack 106497 win 2768 <nop,nop,timestamp 6834688 6834280>
Before :
lpq83:~# nstat >/dev/null;perf stat ./super_netperf 200 -t TCP_RR -H lpq84 -l 20 -- -r 4K,4K
280774
Performance counter stats for './super_netperf 200 -t TCP_RR -H lpq84 -l 20 -- -r 4K,4K':
205719.049006 task-clock # 9.278 CPUs utilized
8,449,968 context-switches # 0.041 M/sec
1,935,997 CPU-migrations # 0.009 M/sec
160,541 page-faults # 0.780 K/sec
548,478,722,290 cycles # 2.666 GHz [83.20%]
455,240,670,857 stalled-cycles-frontend # 83.00% frontend cycles idle [83.48%]
272,881,454,275 stalled-cycles-backend # 49.75% backend cycles idle [66.73%]
166,091,460,030 instructions # 0.30 insns per cycle
# 2.74 stalled cycles per insn [83.39%]
29,150,229,399 branches # 141.699 M/sec [83.30%]
1,943,814,026 branch-misses # 6.67% of all branches [83.32%]
22.173517844 seconds time elapsed
lpq83:~# nstat | egrep "IpOutRequests|IpExtOutOctets"
IpOutRequests 16851063 0.0
IpExtOutOctets 23878580777 0.0
After patch :
lpq83:~# nstat >/dev/null;perf stat ./super_netperf 200 -t TCP_RR -H lpq84 -l 20 -- -r 4K,4K
280877
Performance counter stats for './super_netperf 200 -t TCP_RR -H lpq84 -l 20 -- -r 4K,4K':
107496.071918 task-clock # 4.847 CPUs utilized
5,635,458 context-switches # 0.052 M/sec
1,374,707 CPU-migrations # 0.013 M/sec
160,920 page-faults # 0.001 M/sec
281,500,010,924 cycles # 2.619 GHz [83.28%]
228,865,069,307 stalled-cycles-frontend # 81.30% frontend cycles idle [83.38%]
142,462,742,658 stalled-cycles-backend # 50.61% backend cycles idle [66.81%]
95,227,712,566 instructions # 0.34 insns per cycle
# 2.40 stalled cycles per insn [83.43%]
16,209,868,171 branches # 150.795 M/sec [83.20%]
874,252,952 branch-misses # 5.39% of all branches [83.37%]
22.175821286 seconds time elapsed
lpq83:~# nstat | egrep "IpOutRequests|IpExtOutOctets"
IpOutRequests 11239428 0.0
IpExtOutOctets 23595191035 0.0
Indeed, the occupancy of tx skbs (IpExtOutOctets/IpOutRequests) is higher :
2099 instead of 1417, thus helping GRO to be more efficient when using FQ packet
scheduler.
Many thanks to Neal for review and ideas.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Van Jacobson <vanj@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Tested-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These function to manipulate multiple addresses are not used anywhere
in current net-next tree. Some out of tree code maybe using these but
too bad; they should submit their code upstream..
Also, make __hw_addr_flush local since only used by dev_addr_lists.c
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville says:
====================
Please pull this batch of updates for the 3.14 stream...
For the Bluetooth bits, Gustavo says:
"This is the first batch of patches intended for 3.14. There is
nothing big here. Most of the code are refactors, clean up, small
fixes, plus some new device id support."
And...
"More patches to 3.14. Here we have the support for Low Energy
Connection Oriented Channels (LE CoC). Basically, as the name says,
this adds supports for connection oriented channels in the same way
we already have them for BR/EDR connections so profiles/protocols
that work on top of BR/EDR can now work on LE plus a plenty of new
possibilities for LE."
For the ath10k bits, Kalle says:
"Janusz and Marek implemented DFS support to ath10k, but the code is
not enabled yet due to missing cfg80211/mac80211 patches (it will be
enabled in the next pull request). Michal did some device reset fixes
and made it possible for ath10k to share an interrupt with another
device. And lots of smaller fixes from different people."
For the iwlwifi bits, Emmanuel says:
"I have here a big rework of the rate control by Eyal. This is obviously
the biggest part of this batch.
I also have enhancement of protection flags by Avri and a few bits for
WoWLAN by Eliad and Luca. Johannes cleans up the debugfs plus a few
fixes. I provided a few things for Bluetooth coexistence.
Besides this we have an implementation for low priority scan."
Along with all that, there are big batches of updates to mwifiex and
ath9k, Jeff Kirsher's FSF address fix patches, and a handful of other
bits here and there.
Please let me know if there are problems!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
The following patchset contains two Netfilter fixes for your net
tree, they are:
* Fix endianness in nft_reject, the NFTA_REJECT_TYPE netlink attributes
was not converted to network byte order as needed by all nfnetlink
subsystems, from Eric Leblond.
* Restrict SYNPROXY target to INPUT and FORWARD chains, this avoid a
possible crash due to misconfigurations, from Patrick McHardy.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This is similar to the set_peek_off patch where calling bind while the
socket is stuck in unix_dgram_recvmsg() will block and cause a hung task
spew after a while.
This is also the last place that did a straightforward mutex_lock(), so
there shouldn't be any more of these patches.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the IPsec git trees and some pure IPsec modules
to the IPsec section in the MAINTAINERS file.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using sk_dst_lock from softirq context is not supported right now.
Instead of adding BH protection everywhere,
udp_sk_rx_dst_set() can instead use xchg(), as suggested
by David.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Fixes: 9750223102 ("udp: ipv4: must add synchronization in udp_sk_rx_dst_set()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Driver bug fixes for SH PFC, TWL4030, MSM and RCAR.
- Update the MAINTAINERS
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.15 (GNU/Linux)
iQIcBAABAgAGBQJSrsP1AAoJEEEQszewGV1zyn4P/j/tepOI0iPnEeUqc/Ah9lxd
lhZJytVH1mHpaf/4HJgmRayZrwria4fX6NEUwoIcZyuETLstvk2RnI6VdmuCRqBL
Oys/xxS3B55CIo36RHV1qADojPoNdK4lAbKKfklyJBnsDK7ozh+vPiZPANs0tzRS
ELaceaa1jX3O+ht3LvCbl8lguxu+6FpzzpQuFt/FCJTB4HuyTreg+wB7VWyoqsC7
Ry6BCKBjN8lewT8fvVkAdAeHaERkkCNxjBPDj4xU14hynWtgWikR0SOaFF7f5n2T
8dJKAqPIiVQP1Lb1WSJ6zEY1mCA7WjsOgudaDnF3vqPIMBaZKBM96cQHpI/JMdPU
VzrxC+k5BWxJC/XxE/AsES+Z949ctXFjiyj7b6SmcIU1rV46KpUuDLXwkbXSWs3G
FHWKIunHnHEmwqIDi+nDSCb7kdgxnboUZyiZJFr7s5XIJTqi7wv+wzSoIRP8N9w3
jNjWmrezJvP6YLdNBKh5jVDhY8XLcXr/U1QxP59tveO1isNwRvFJJG3zlvLwNhpG
ZGbqkygwoo3AvXc87ZLXFHIEf/vpOWr7ObWpuhF9OXoeWqmTaqD84XxUJSb7myPF
/XNIhjFLQ6kzHe5xivizLFatEXA5Oy9Rz2zTiWFZgbTciBRXU8fkoI56jcfoRwWG
yS/PxsVjr0FqrYsBvMJH
=V2Cs
-----END PGP SIGNATURE-----
Merge tag 'gpio-v3.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"All but one are long-standing bug fixes that are also tagged for
stable
- Driver bug fixes for SH PFC, TWL4030, MSM and RCAR.
- Update the MAINTAINERS"
* tag 'gpio-v3.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: rcar: Fix level interrupt handling
gpio: msm: Fix irq mask/unmask by writing bits instead of numbers
gpio: twl4030: Fix regression for twl gpio LED output
sh-pfc: Fix PINMUX_GPIO macro
MAINTAINERS: update GPIO maintainers entry
Pull two Ceph fixes from Sage Weil:
"One of these is fixing a regression from the d_flags file type patch
that went into -rc1 that broke instantiation of inodes and dentries
(we were doing dentries first). The other is just an off-by-one
corner case"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
ceph: Avoid data inconsistency due to d-cache aliasing in readpage()
ceph: initialize inode before instantiating dentry
There's a possible deadlock if we flush the peers notifying work during setting
mtu:
[ 22.991149] ======================================================
[ 22.991173] [ INFO: possible circular locking dependency detected ]
[ 22.991198] 3.10.0-54.0.1.el7.x86_64.debug #1 Not tainted
[ 22.991219] -------------------------------------------------------
[ 22.991243] ip/974 is trying to acquire lock:
[ 22.991261] ((&(&net_device_ctx->dwork)->work)){+.+.+.}, at: [<ffffffff8108af95>] flush_work+0x5/0x2e0
[ 22.991307]
but task is already holding lock:
[ 22.991330] (rtnl_mutex){+.+.+.}, at: [<ffffffff81539deb>] rtnetlink_rcv+0x1b/0x40
[ 22.991367]
which lock already depends on the new lock.
[ 22.991398]
the existing dependency chain (in reverse order) is:
[ 22.991426]
-> #1 (rtnl_mutex){+.+.+.}:
[ 22.991449] [<ffffffff810dfdd9>] __lock_acquire+0xb19/0x1260
[ 22.991477] [<ffffffff810e0d12>] lock_acquire+0xa2/0x1f0
[ 22.991501] [<ffffffff81673659>] mutex_lock_nested+0x89/0x4f0
[ 22.991529] [<ffffffff815392b7>] rtnl_lock+0x17/0x20
[ 22.991552] [<ffffffff815230b2>] netdev_notify_peers+0x12/0x30
[ 22.991579] [<ffffffffa0340212>] netvsc_send_garp+0x22/0x30 [hv_netvsc]
[ 22.991610] [<ffffffff8108d251>] process_one_work+0x211/0x6e0
[ 22.991637] [<ffffffff8108d83b>] worker_thread+0x11b/0x3a0
[ 22.991663] [<ffffffff81095e5d>] kthread+0xed/0x100
[ 22.991686] [<ffffffff81681c6c>] ret_from_fork+0x7c/0xb0
[ 22.991715]
-> #0 ((&(&net_device_ctx->dwork)->work)){+.+.+.}:
[ 22.991715] [<ffffffff810de817>] check_prevs_add+0x967/0x970
[ 22.991715] [<ffffffff810dfdd9>] __lock_acquire+0xb19/0x1260
[ 22.991715] [<ffffffff810e0d12>] lock_acquire+0xa2/0x1f0
[ 22.991715] [<ffffffff8108afde>] flush_work+0x4e/0x2e0
[ 22.991715] [<ffffffff8108e1b5>] __cancel_work_timer+0x95/0x130
[ 22.991715] [<ffffffff8108e303>] cancel_delayed_work_sync+0x13/0x20
[ 22.991715] [<ffffffffa03404e4>] netvsc_change_mtu+0x84/0x200 [hv_netvsc]
[ 22.991715] [<ffffffff815233d4>] dev_set_mtu+0x34/0x80
[ 22.991715] [<ffffffff8153bc2a>] do_setlink+0x23a/0xa00
[ 22.991715] [<ffffffff8153d054>] rtnl_newlink+0x394/0x5e0
[ 22.991715] [<ffffffff81539eac>] rtnetlink_rcv_msg+0x9c/0x260
[ 22.991715] [<ffffffff8155cdd9>] netlink_rcv_skb+0xa9/0xc0
[ 22.991715] [<ffffffff81539dfa>] rtnetlink_rcv+0x2a/0x40
[ 22.991715] [<ffffffff8155c41d>] netlink_unicast+0xdd/0x190
[ 22.991715] [<ffffffff8155c807>] netlink_sendmsg+0x337/0x750
[ 22.991715] [<ffffffff8150d219>] sock_sendmsg+0x99/0xd0
[ 22.991715] [<ffffffff8150d63e>] ___sys_sendmsg+0x39e/0x3b0
[ 22.991715] [<ffffffff8150eba2>] __sys_sendmsg+0x42/0x80
[ 22.991715] [<ffffffff8150ebf2>] SyS_sendmsg+0x12/0x20
[ 22.991715] [<ffffffff81681d19>] system_call_fastpath+0x16/0x1b
This is because we hold the rtnl_lock() before ndo_change_mtu() and try to flush
the work in netvsc_change_mtu(), in the mean time, netdev_notify_peers() may be
called from worker and also trying to hold the rtnl_lock. This will lead the
flush won't succeed forever. Solve this by not canceling and flushing the work,
this is safe because the transmission done by NETDEV_NOTIFY_PEERS was
synchronized with the netif_tx_disable() called by netvsc_change_mtu().
Reported-by: Yaju Cao <yacao@redhat.com>
Tested-by: Yaju Cao <yacao@redhat.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull powerpc fixes from Ben Herrenschmidt:
"Uli's patch fixes a regression in ptrace caused by a mis-merge of a
previous LE patch. The rest are all more endian fixes, all fairly
trivial, found during testing of 3.13-rc's"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/powernv: Fix OPAL LPC access in Little Endian
powerpc/powernv: Fix endian issue in opal_xscom_read
powerpc: Fix endian issues in crash dump code
powerpc/pseries: Fix endian issues in MSI code
powerpc/pseries: Fix PCIE link speed endian issue
powerpc/pseries: Fix endian issues in nvram code
powerpc/pseries: Fix endian issues in /proc/ppc64/lparcfg
powerpc: Fix topology core_id endian issue on LE builds
powerpc: Fix endian issue in setup-common.c
powerpc: PTRACE_PEEKUSR always returns FPR0
Sebastian Hesselbarth says:
====================
net: phy: Ethernet PHY powerdown optimization
This is v2 of the ethernet PHY power optimization patches to reduce
power consumption of network PHYs with link that are either unused or
the corresponding netdev is down.
Compared to the last version, this patch set drops a patch to disable
unused PHYs after late initcall, as it is not compatible with a modular
mdio bus [1]. I'll investigate different ways to have a modular mdio bus
driver get notified when driver loading is done.
Again, a branch with v2 applied to v3.13-rc2 can also be found at
https://github.com/shesselba/linux-dove.git topic/ethphy-power-v2
[1] http://www.spinics.net/lists/arm-kernel/msg293028.html
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When phydev is going to HALTED state, we can try to suspend it to
safe more power. phy_suspend helper will check if PHY can be suspended,
so just call it when entering HALTED state.
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This ensures PHYs are resumed on attach and suspended on detach.
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds helper functions to resume and suspend a given phy_device
by calling the corresponding driver callbacks if available.
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marvell PHYs support generic PHY suspend/resume, so provide those
callbacks to all marvell specific drivers.
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using phydev, it should be phy_start/phy_stop'ed properly. This
driver doesn't do that, so add the corresponding calls to port_start/
stop respectively.
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Members of 'struct association' are not in appropriate order to
reuse compiler added padding on 64bit architectures. In this patch
we reorder those struct members and help reduce the size of the
structure from 2776 bytes to 2720 bytes on 64 bit architectures.
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates
This series contains updates to i40e only (again).
Jesse provides a fix for when tx_rings structure is NULL and we do not want
to panic. Then refactors the flow control set up and disables L2 flow control
by default. Provides some trivial fixes as well as prevent compiler warnings.
Then to align to similar behaviour in ixgbe, use the total number of CPUs in
the system to suggest the number of transmit and receive queue pairs.
Shannon provides a i40e ethtool fix to get some more reasonable information
reports back out to the ethtool. In addition, fixes PF reset after offline
test, where it reorders the test to put the register test last as it is the
only one that needs a reset, and we wait to trigger the reset until after we
clear the testing bit. Lastly provides basic support for handling suspend
and resume for now, later on Wake-On-LAN support will be added.
Anjali provides changes to tell the stack about our actual number of queues
in order for RFS/RPS/XFS to work correctly. Then provides several patches to
implement dynamically changing the queue count for the main VSI. Adds
basic support for get/set channels for RSS so that the number of receive and
transmit queue pair can be changed via ethtool. Cleans up the use of
rtnl_lock in the reset patch since it runs from a work time.
Neerav Parikh cleans up the VF interface to remove FCoE code as this
feature will not be supported on VF interfaces.
v2:
- submitted patch 1 to net (since it was a fix needed for net), so dropped
from this series (this patch will get added to net-next when Dave syncs
his trees)
- Dropped patches 4 & 11 from previous submission because of feedback
received from Ben Hutchings and Sergei Shtylyov.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If a user calls 'cpupower set --perf-bias 15', the process will end with
a SIGSEGV in libc because cpupower-set passes a NULL optarg to the atoi
call. This is because the getopt_long structure currently has all of
the options as having an optional_argument when they really have a
required argument. We change the structure to use required_argument to
match the short options and it resolves the issue.
This fixes https://bugzilla.redhat.com/show_bug.cgi?id=1000439
Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Thomas Renninger <trenn@suse.de>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Francesco Fusco says:
====================
ovs: introduce arch-specific fast hashing improvements
From: Daniel Borkmann <dborkman@redhat.com>
We are introducing a fast hash function (see patch1) that can be
used in the context of OpenVSwitch to reduce the hashing footprint
(patch2). For details, please see individual patches!
v1->v2:
- Make hash generic and place it under lib
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently OVS uses jhash2() for calculating flow hashes in its
internal flow_hash() function. The performance of the flow_hash()
function is critical, as the input data can be hundreds of bytes
long.
OVS is largely deployed in x86_64 based datacenters. Therefore,
we argue that the performance critical fast path of OVS should
exploit underlying CPU features in order to reduce the per packet
processing costs. We replace jhash2 with the hash implementation
provided by the kernel hash lib, which exploits the crc32l
instruction to achieve high performance
Our patch greatly reduces the hash footprint from ~200 cycles of
jhash2() to around ~90 cycles in case of ovs_flow_hash_crc()
(measured with rdtsc over maximum length flow keys on an i7 Intel
CPU).
Additionally, we wrote a microbenchmark to stress the flow table
performance. The benchmark inserts random flows into the flow
hash and then performs lookups. Our hash deployed on a CRC32
capable CPU reduces the lookup for 1000 flows, 100 masks from
~10,100us to ~6,700us, for example.
Thus, simply use the newly introduced arch_fast_hash2() as a
drop-in replacement.
Signed-off-by: Francesco Fusco <ffusco@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Thomas Graf <tgraf@redhat.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We introduce a new hashing library that is meant to be used in
the contexts where speed is more important than uniformity of the
hashed values. The hash library leverages architecture specific
implementation to achieve high performance and fall backs to
jhash() for the generic case.
On Intel-based x86 architectures, the library can exploit the crc32l
instruction, part of the Intel SSE4.2 instruction set, if the
instruction is supported by the processor. This implementation
is twice as fast as the jhash() implementation on an i7 processor.
Additional architectures, such as Arm64 provide instructions for
accelerating the computation of CRC, so they could be added as well
in follow-up work.
Signed-off-by: Francesco Fusco <ffusco@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Thomas Graf <tgraf@redhat.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace existing resource handling in the driver with managed
device resource, this ensures more consistent error values and
simplifies error paths.
Signed-off-by: Alexander Shiyan <shc_work@mail.ru>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Several files refer to an old address for the Free Software Foundation
in the file header comment. Resolve by replacing the address with
the URL <http://www.gnu.org/licenses/> so that we do not have to keep
updating the header comments anytime the address changes.
CC: Wolfgang Grandegger <wg@grandegger.com>
CC: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>