The work queue is initialised in rtl_open (when the interface goes up),
but canceled in rtl_remove_one (when the PCI device gets removed). If
the network interface is not brought up, then the work queue struct is
not initialised. When the device is removed, the attempt to cancel the
uninitialised work queue causes a lockdep warning.
This patch fixes the issue by moving cancel_work_sync to rtl_close (to
match rtl_open). (Note that rtl_close is also called via
unregister_netdev in rtl_remove_one.)
Signed-off-by: Peter Wu <lekensteyn@gmail.com>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add MODULE_ALIAS, so that auto module loading can work.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Received packets are only scattered if this is enabled in both the
matching filter and the receiving queue. This was not being done for
filters inserted for RFS, so any packet requiring more than a single
descriptor was dropped.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since 4bb1667255 (build some drivers only
when compile-testing), PTP_1588_CLOCK_PCH depends on (X86 ||
COMPILE_TEST). But PCH_GBE selects PTP_1588_CLOCK_PCH without
depending on x86. Fix this by adding the same dependency here.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Reported-by: <fengguang.wu@intel.com> [intel's build test robot]
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: netdev@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: kbuild-all@01.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
The following patchset contains Netfilter fixes for your net tree,
they are:
* Fix potential NULL dereference in the socket match if revision 0
is used, from Eric Dumazet.
* Fix missing expectation NAT initialization that results in dumping
the NAT part via ctnetlink, thus leading to problems in expectation
synchronization through conntrackd, from myself.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville says:
====================
Please accept this batch of fixes intended for the 3.11 tree...
Alexey Khoroshilov fixes a suspend-related race in ath9k_htc.
Arnd Bergmann corrects the alignment of a structure in the ssb code
to be compatible with ARM devices.
Bob Copeland provides an ath5k fix that corrects a mistaken variable
initialization.
Felix Fietkau corrects some frame accounting for dropped frames
in ath9k.
Geert Uytterhoeven brings a Kconfig fix to indicate the DMA
requirements for rt2x00.
Larry Finger offers two rtlwifi fixes: one that properly initializes
a callback; and, a scattered collection of Kconfig, Makefile, and
EXPORT_SYMBOL changes that correct some build problems.
Finally, Sujith Manoharan provides an ath9k fix to disable a feature
on a specific hardware device.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Marc Kleine-Budde says:
====================
here are two fixes for the v3.11 release cycle:
Maximilian Schneider contributes a patch for the esd_usb2 CAN driver. It adds
sanity checking to the data coming from the USB CAN adapter before using it.
Alexey Khoroshilov from the Linux Driver Verification project fixes an urb leak
in the error handling of the USB 8dev's usb_8dev_start() function.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 9f00b2e7cf ("bridge: only expire the mdb entry when query is
received") added a nasty bug as an active timer can be reinitialized.
setup_timer() must be done once, no matter how many time mod_timer()
is called. br_multicast_new_group() is the right place to do this.
Reported-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Diagnosed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Limit the min/max value passed to the
/proc/sys/net/ipv4/tcp_syn_retries.
Signed-off-by: Michal Tesar <mtesar@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch doesn't change the compiled code because ARC_HDR_SIZE is 4
and sizeof(int) is 4, but the intent was to use the header size and not
the sizeof the header size.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The veth device doesn't provide the vlan features,
so TSO for example is disabled and that causes
performance issues when using tagged traffic.
The test topology looks like this:
br0 br1
/ \ / \
vnet veth0.10 ----- veth1.10 vnet
VM VM
The netperf results with current veth driver:
MIGRATED TCP STREAM TEST from 192.168.1.1 ()
port 0 AF_INET to 192.168.1.2 () port 0 AF_INET
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
87380 16384 16384 10.01 2210.22
Now after applying the proposed patch:
MIGRATED TCP STREAM TEST from 192.168.1.1 ()
port 0 AF_INET to 192.168.1.2 () port 0 AF_INET
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
87380 16384 16384 10.00 13067.47
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o Clear cached vport vlan variable(vp->vlan) in PF on PCI FLR and
back-channel termination which will allow to configure guest VLAN
on VF after force off/shut down the guest VM.
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Driver was using wrong mask for template version.
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o Commit b938662d88
("qlcnic: Fix ethtool supported port status for 83xx")
introduced regression for display of link status for 83xx
adapter while refactoring port status display. This patch
is to fix the link status display for 83xx adapter.
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o "qlcnic_sriov" structure pointer should be accessed only
when SR-IOV is enabled. Access this pointer after SR-IOV
PF check.
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o Multicast MAC was not getting programmed due to which multicast
packets were being dropped by FW.
This patch fixes commit 168e4fb54c11865668ad50eff81b5f2729e0e0f4
("qlcnic: Secondary unicast MAC address support.") which introduced
bug in handling multicast packets.
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o Check for non-NULL set_mac_filter_count function pointer
before calling it fixes the panic.
This patch fixes regression introduced by patch
"qlcnic: Secondary unicast MAC address support." with
commit id 168e4fb54c11865668ad50eff81b5f2729e0e0f4.
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NAPI poll function does not re-enable the interrupt, if __QLCNIC_DEV_UP is not set
in adapter state. This was preventing driver from receiving any packet.
Signed-off-by: Pratik Pujar <pratik.pujar@qlogic.com>
Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
83xx adapter specific code was accessing 82xx register which
resulted in invalid register offset. This patch uses proper
register access method.
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two race conditions in existing code for doing IGMP
management in workqueue in vxlan. First, the vxlan_group_used
function checks the list of vxlan's without any protection, and
it is possible for open followed by close to occur before the
igmp work queue runs.
To solve these move the check into vxlan_open/stop so it is
protected by RTNL. And split into two work structures so that
there is no racy reference to underlying device state.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix memory leaks and other badness from VXLAN network namespace
teardown. When network namespace is removed, all the vxlan devices should
be unregistered (not closed).
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If usb_8dev_start() fails to submit urb,
it unanchors the urb but forgets to free it.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The esd_usb2_read_bulk_callback() function is parsing the data that comes from
the USB CAN adapter. One datum is used as an index to access the dev->nets[]
array. This patch adds the missing bounds checking.
Acked-by: Matthias Fuchs <matthias.fuchs@esd.eu>
Signed-off-by: Maximilian Schneider <max@schneidersoft.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Pull networking fixes from David Miller:
"A couple interesting SKB fragment handling fixes, plus the usual small
bits here and there:
1) Fix 64-bit divide build failure on 32-bit platforms in mlx5, from
Tim Gardner.
2) Get rid of a stupid reimplementation on "%*phC" in our sysfs MAC
address printing helper.
3) Fix NETIF_F_SG capability advertisement in hyperv driver, if the
device can't do checksumming offloads then it shouldn't say it can
do SG either. From Haiyang Zhang.
4) bgmac needs to depend on PHYLIB, from Hauke Mehrtens.
5) Don't leak DMA mappings on mapping failures, from Neil Horman.
6) We need to reset the transport header of SKBs in ipv4 before we
attempt to perform early socket demux, just like ipv6 does. From
Eric Dumazet.
7) Add missing locking on vxlan device removal, from Stephen
Hemminger.
8) xen-netfront has to make two passes over an SKB to prepare it for
transfer. One pass calculates the number of slots needed, the
second massages the SKB and fills the slots. Unfortunately, the
first pass doesn't calculate the number of slots properly so we
can end up trying to build a MAX_SKB_FRAGS + 1 SKB which doesn't
work out so well. Fix from Jan Beulich with help and discussion
with several others.
9) Fix a similar problem in tun and macvtap, which have to split up
scatter-gather elements at PAGE_SIZE boundaries. Don't do
zerocopy if it would result in a > MAX_SKB_FRAGS skb. Fixes from
Jason Wang.
10) On receive, once we've decoded the VLAN state completely, clear
skb->vlan_tci. Otherwise demuxed tunnels underneath can trigger
the VLAN code again, corrupting the packet. Fix from Eric
Dumazet"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
vlan: fix a race in egress prio management
vlan: mask vlan prio bits
macvtap: do not zerocopy if iov needs more pages than MAX_SKB_FRAGS
tuntap: do not zerocopy if iov needs more pages than MAX_SKB_FRAGS
pkt_sched: sch_qfq: remove a source of high packet delay/jitter
xen-netfront: pull on receive skb may need to happen earlier
vxlan: add necessary locking on device removal
hyperv: Fix the NETIF_F_SG flag setting in netvsc
net: Fix sysfs_format_mac() code duplication.
be2net: Fix to avoid hardware workaround when not needed
macvtap: do not assume 802.1Q when send vlan packets
macvtap: fix the missing ret value of TUNSETQUEUE
ipv4: set transport header earlier
mlx5 core: Fix __udivdi3 when compiling for 32 bit arches
bgmac: add dependency to phylib
net/irda: fixed style issues in irlan_eth
ethtool: fixed trailing statements in ethtool
ndisc: bool initializations should use true and false
atl1e: unmap partially mapped skb on dma error and free skb
Pull x86 fixes from Peter Anvin:
"Trying again to get the fixes queue, including the fixed IDT alignment
patch.
The UEFI patch is by far the biggest issue at hand: it is currently
causing quite a few machines to boot. Which is sad, because the only
reason they would is because their BIOSes touch memory that has
already been freed. The other major issue is that we finally have
tracked down the root cause of a significant number of machines
failing to suspend/resume"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Make sure IDT is page aligned
x86, suspend: Handle CPUs which fail to #GP on RDMSR
x86/platform/ce4100: Add header file for reboot type
Revert "UEFI: Don't pass boot services regions to SetVirtualAddressMap()"
efivars: check for EFI_RUNTIME_SERVICES
3.10 wasn't a good release for md. The bio changes left a couple of
bugs, and an md "fix" created another one.
These three patches appear to fix the issues and have been tagged for
-stable.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIVAwUAUedstTnsnt1WYoG5AQJmlBAAvQq9PryhyhPl1xC7K6ufoFfoRFO9yQPd
lbe1bpAgyDuStF2t0H+pOBJeI45P9p8oGDsGlvzNI0/1YPxJpv20d/EkQI3eJ4vm
B2iLt2NyZLs3zTc1V0oWBCCp2BT5vgdjHfCfyt7H2MQg0Lxtz9DjVnV0uF58EIWe
mEo5fipXtRlgHAsTay4hiDqoUNy1Lc4I9nPRGtxH8/qj4oB5YudMyNvAq9FFGwr0
31L/ok+mjZdsZJZ+T4jS6hcjy0Wv/TU9lO2Ou1Jua2hLwVk6jvJ5DWzwuqakauaP
MJqiiT4Wuv9V5LKvNknhT2fHfevfBJfhA1hTdVNQuVEHFVW5eb3nW8n8bYZUOhxD
arqWHTIPqK6x8VBfocmjgePnc7CqCS4psPN5DAmp+Kiu6aF0e+eNF4SH8PIV+PQB
xFW2CR3I6Y3evnUab7K+t2gw41vB5iJmlejvlCY3wd4OhRSt0rDQJc7b51tu7lbB
FR8yYSoZ2fvjAqcG4ScSGKol5lkWJALgaLBPMSUdc984pNpEORv+Euqgy2nhYtgo
fvF+loq0l9MVs9qxiuGzIMltt50nqdRrJMzWCAVbH3n9eYfzBj10A/eUUYqoH/zL
+A9spAj8Brk9stdaOT/ZXlDSnwT4S3Q5WEuDl/rYM7VjPj6yUrGiXozHGjdxkL/T
Q7+kYSLrXrA=
=mQEH
-----END PGP SIGNATURE-----
Merge tag 'md-3.11-fixes' of git://neil.brown.name/md
Pull md bug fixes from NeilBrown:
"Sorry boss, back at work now boss. Here's them nice shiny patches ya
wanted. All nicely tagged and justified for -stable and everyfing:
Three bug fixes for md in 3.10
3.10 wasn't a good release for md. The bio changes left a couple of
bugs, and an md "fix" created another one.
These three patches appear to fix the issues and have been tagged for
-stable"
* tag 'md-3.11-fixes' of git://neil.brown.name/md:
md/raid1: fix bio handling problems in process_checks()
md: Remove recent change which allows devices to skip recovery.
md/raid10: fix two problems with RAID10 resync.
Pull drm fixes from Dave Airlie:
"You'll be terribly disappointed in this, I'm not trying to sneak any
features in or anything, its mostly radeon and intel fixes, a couple
of ARM driver fixes"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (34 commits)
drm/radeon/dpm: add debugfs support for RS780/RS880 (v3)
drm/radeon/dpm/atom: fix broken gcc harder
drm/radeon/dpm/atom: restructure logic to work around a compiler bug
drm/radeon/dpm: fix atom vram table parsing
drm/radeon: fix an endian bug in atom table parsing
drm/radeon: add a module parameter to disable aspm
drm/rcar-du: Use the GEM PRIME helpers
drm/shmobile: Use the GEM PRIME helpers
uvesafb: Really allow mtrr being 0, as documented and warn()ed
radeon kms: do not flush uninitialized hotplug work
drm/radeon/dpm/sumo: handle boost states properly when forcing a perf level
drm/radeon: align VM PTBs (Page Table Blocks) to 32K
drm/radeon: allow selection of alignment in the sub-allocator
drm/radeon: never unpin UVD bo v3
drm/radeon: fix UVD fence emit
drm/radeon: add fault decode function for CIK
drm/radeon: add fault decode function for SI (v2)
drm/radeon: add fault decode function for cayman/TN (v2)
drm/radeon: use radeon device for request firmware
drm/radeon: add missing ttm_eu_backoff_reservation to radeon_bo_list_validate
...
The multicast search bit is disabled for the AR9003
family, but this is required for AR9002 too. Fix this in
the INI override routine.
Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The present build configuration for the rtlwifi family of drivers will
fail under two known conditions:
(1) If rtlwifi is selected without selecting any of the dependent drivers,
there are errors in the build.
(2) If the PCI drivers are built into the kernel and the USB drivers are modules,
or vice versa, there are missing globals.
The first condition is fixed by never building rtlwifi unless at least one
of the device drivers is selected. The second failure is fixed by splitting
the PCI and USB codes out of rtlwifi, and creating their own mini drivers.
If the drivers that use them are modules, they will also be modules.
Although a number of files are touched by this patch, only Makefile and Kconfig
have undergone significant changes. The only modifications to the other files
were to export entry points needed by the new rtl_pci and rtl_usb units, or to
rename two variables that had names that were likely to cause namespace collisions.
Reported-by: Fengguang Wu <fengguang.wu@intel.com> [Condition 1]
Reported-by: Ben Hutchings <bhutchings@solarflare.com> [Condition 2]
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Bit 32 was always set which looks to have been accidental,
according to git history.
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
If NO_DMA=y:
drivers/built-in.o: In function `rt2x00queue_unmap_skb':
drivers/net/wireless/rt2x00/rt2x00queue.c:129: undefined reference to `dma_unmap_single'
drivers/net/wireless/rt2x00/rt2x00queue.c:133: undefined reference to `dma_unmap_single'
drivers/built-in.o: In function `rt2x00queue_map_txskb':
drivers/net/wireless/rt2x00/rt2x00queue.c:112: undefined reference to `dma_map_single'
drivers/net/wireless/rt2x00/rt2x00queue.c:115: undefined reference to `dma_mapping_error'
drivers/built-in.o: In function `rt2x00queue_alloc_rxskb':
drivers/net/wireless/rt2x00/rt2x00queue.c:93: undefined reference to `dma_map_single'
drivers/net/wireless/rt2x00/rt2x00queue.c:95: undefined reference to `dma_mapping_error'
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: linux-wireless@vger.kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The ARM OABI and EABI disagree on the alignment of structures
with small members, so module init tools may interpret the
ssb device table incorrectly, as shown by this warning when
building the b43 device driver in an OABI kernel:
FATAL: drivers/net/wireless/b43/b43: sizeof(struct ssb_device_id)=6 is
not a modulo of the size of section __mod_ssb_device_table=88.
Forcing the default (EABI) alignment on the structure makes this
problem go away. Since the ssb_device_id may have the same problem,
better fix both structures.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: John W. Linville <linville@tuxdriver.com>
Cc: Michael Buesch <mb@bu3sch.de>
Cc: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
egress_priority_map[] hash table updates are protected by rtnl,
and we never remove elements until device is dismantled.
We have to make sure that before inserting an new element in hash table,
all its fields are committed to memory or else another cpu could
find corrupt values and crash.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 48cc32d38a
("vlan: don't deliver frames for unknown vlans to protocols")
Florian made sure we set pkt_type to PACKET_OTHERHOST
if the vlan id is set and we could find a vlan device for this
particular id.
But we also have a problem if prio bits are set.
Steinar reported an issue on a router receiving IPv6 frames with a
vlan tag of 4000 (id 0, prio 2), and tunneled into a sit device,
because skb->vlan_tci is set.
Forwarded frame is completely corrupted : We can see (8100:4000)
being inserted in the middle of IPv6 source address :
16:48:00.780413 IP6 2001:16d8:8100:4000:ee1c:0:9d9:bc87 >
9f94:4d95:2001:67c:29f4::: ICMP6, unknown icmp6 type (0), length 64
0x0000: 0000 0029 8000 c7c3 7103 0001 a0ae e651
0x0010: 0000 0000 ccce 0b00 0000 0000 1011 1213
0x0020: 1415 1617 1819 1a1b 1c1d 1e1f 2021 2223
0x0030: 2425 2627 2829 2a2b 2c2d 2e2f 3031 3233
It seems we are not really ready to properly cope with this right now.
We can probably do better in future kernels :
vlan_get_ingress_priority() should be a netdev property instead of
a per vlan_dev one.
For stable kernels, lets clear vlan_tci to fix the bugs.
Reported-by: Steinar H. Gunderson <sesse@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We try to linearize part of the skb when the number of iov is greater than
MAX_SKB_FRAGS. This is not enough since each single vector may occupy more than
one pages, so zerocopy_sg_fromiovec() may still fail and may break the guest
network.
Solve this problem by calculate the pages needed for iov before trying to do
zerocopy and switch to use copy instead of zerocopy if it needs more than
MAX_SKB_FRAGS.
This is done through introducing a new helper to count the pages for iov, and
call uarg->callback() manually when switching from zerocopy to copy to notify
vhost.
We can do further optimization on top.
This bug were introduced from b92946e291
(macvtap: zerocopy: validate vectors before building skb).
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We try to linearize part of the skb when the number of iov is greater than
MAX_SKB_FRAGS. This is not enough since each single vector may occupy more than
one pages, so zerocopy_sg_fromiovec() may still fail and may break the guest
network.
Solve this problem by calculate the pages needed for iov before trying to do
zerocopy and switch to use copy instead of zerocopy if it needs more than
MAX_SKB_FRAGS.
This is done through introducing a new helper to count the pages for iov, and
call uarg->callback() manually when switching from zerocopy to copy to notify
vhost.
We can do further optimization on top.
The bug were introduced from commit 0690899b4d
(tun: experimental zero copy tx support)
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
QFQ+ inherits from QFQ a design choice that may cause a high packet
delay/jitter and a severe short-term unfairness. As QFQ, QFQ+ uses a
special quantity, the system virtual time, to track the service
provided by the ideal system it approximates. When a packet is
dequeued, this quantity must be incremented by the size of the packet,
divided by the sum of the weights of the aggregates waiting to be
served. Tracking this sum correctly is a non-trivial task, because, to
preserve tight service guarantees, the decrement of this sum must be
delayed in a special way [1]: this sum can be decremented only after
that its value would decrease also in the ideal system approximated by
QFQ+. For efficiency, QFQ+ keeps track only of the 'instantaneous'
weight sum, increased and decreased immediately as the weight of an
aggregate changes, and as an aggregate is created or destroyed (which,
in its turn, happens as a consequence of some class being
created/destroyed/changed). However, to avoid the problems caused to
service guarantees by these immediate decreases, QFQ+ increments the
system virtual time using the maximum value allowed for the weight
sum, 2^10, in place of the dynamic, instantaneous value. The
instantaneous value of the weight sum is used only to check whether a
request of weight increase or a class creation can be satisfied.
Unfortunately, the problems caused by this choice are worse than the
temporary degradation of the service guarantees that may occur, when a
class is changed or destroyed, if the instantaneous value of the
weight sum was used to update the system virtual time. In fact, the
fraction of the link bandwidth guaranteed by QFQ+ to each aggregate is
equal to the ratio between the weight of the aggregate and the sum of
the weights of the competing aggregates. The packet delay guaranteed
to the aggregate is instead inversely proportional to the guaranteed
bandwidth. By using the maximum possible value, and not the actual
value of the weight sum, QFQ+ provides each aggregate with the worst
possible service guarantees, and not with service guarantees related
to the actual set of competing aggregates. To see the consequences of
this fact, consider the following simple example.
Suppose that only the following aggregates are backlogged, i.e., that
only the classes in the following aggregates have packets to transmit:
one aggregate with weight 10, say A, and ten aggregates with weight 1,
say B1, B2, ..., B10. In particular, suppose that these aggregates are
always backlogged. Given the weight distribution, the smoothest and
fairest service order would be:
A B1 A B2 A B3 A B4 A B5 A B6 A B7 A B8 A B9 A B10 A B1 A B2 ...
QFQ+ would provide exactly this optimal service if it used the actual
value for the weight sum instead of the maximum possible value, i.e.,
11 instead of 2^10. In contrast, since QFQ+ uses the latter value, it
serves aggregates as follows (easy to prove and to reproduce
experimentally):
A B1 B2 B3 B4 B5 B6 B7 B8 B9 B10 A A A A A A A A A A B1 B2 ... B10 A A ...
By replacing 10 with N in the above example, and by increasing N, one
can increase at will the maximum packet delay and the jitter
experienced by the classes in aggregate A.
This patch addresses this issue by just using the above
'instantaneous' value of the weight sum, instead of the maximum
possible value, when updating the system virtual time. After the
instantaneous weight sum is decreased, QFQ+ may deviate from the ideal
service for a time interval in the order of the time to serve one
maximum-size packet for each backlogged class. The worst-case extent
of the deviation exhibited by QFQ+ during this time interval [1] is
basically the same as of the deviation described above (but, without
this patch, QFQ+ suffers from such a deviation all the time). Finally,
this patch modifies the comment to the function qfq_slot_insert, to
make it coherent with the fact that the weight sum used by QFQ+ can
now be lower than the maximum possible value.
[1] P. Valente, "Extending WF2Q+ to support a dynamic traffic mix",
Proceedings of AAA-IDEA'05, June 2005.
Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Signed-off-by: David S. Miller <davem@davemloft.net>
Here are some driver core patches for 3.11-rc2. They aren't really
bugfixes, but a bunch of new helper macros for drivers to properly
create attribute groups, which drivers and subsystems need to fix up a
ton of race issues with incorrectly creating sysfs files (binary and
normal) after userspace has been told that the device is present.
Also here is the ability to create binary files as attribute groups, to
solve that race condition, which was impossible to do before this, so
that's my fault the drivers were broken.
The majority of the .c changes is indenting and moving code around a
bit. It affects no existing code, but allows the large backlog of 70+
patches that I already have created to start flowing into the different
subtrees, instead of having to live in my driver-core tree, causing
merge nightmares in linux-next for the next few months.
These were finalized too late for the -rc1 merge window, which is why
they were didn't make that pull request, testing and review from others
didn't happen until a few weeks ago, and then there's the whole
distraction of the past few days, which prevented these from getting to
you sooner, sorry about that.
Oh, and there's a bugfix for the documentation build warning in here as
well. All of these have been in linux-next this week, with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.20 (GNU/Linux)
iEYEABECAAYFAlHoRUUACgkQMUfUDdst+ymkNACdHAjEXZZmXohDuCb2SqyMeQsz
AZcAn3qqJa/NoPEgTCgOkDlAQZM6BnC5
=+Gqk
-----END PGP SIGNATURE-----
Merge tag 'driver-core-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core patches from Greg KH:
"Here are some driver core patches for 3.11-rc2. They aren't really
bugfixes, but a bunch of new helper macros for drivers to properly
create attribute groups, which drivers and subsystems need to fix up a
ton of race issues with incorrectly creating sysfs files (binary and
normal) after userspace has been told that the device is present.
Also here is the ability to create binary files as attribute groups,
to solve that race condition, which was impossible to do before this,
so that's my fault the drivers were broken.
The majority of the .c changes is indenting and moving code around a
bit. It affects no existing code, but allows the large backlog of 70+
patches that I already have created to start flowing into the
different subtrees, instead of having to live in my driver-core tree,
causing merge nightmares in linux-next for the next few months.
These were finalized too late for the -rc1 merge window, which is why
they were didn't make that pull request, testing and review from
others didn't happen until a few weeks ago, and then there's the whole
distraction of the past few days, which prevented these from getting
to you sooner, sorry about that.
Oh, and there's a bugfix for the documentation build warning in here
as well. All of these have been in linux-next this week, with no
reported problems"
* tag 'driver-core-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
driver-core: fix new kernel-doc warning in base/platform.c
sysfs: use file mode defines from stat.h
sysfs: add more helper macro's for (bin_)attribute(_groups)
driver core: add default groups to struct class
driver core: Introduce device_create_groups
sysfs: prevent warning when only using binary attributes
sysfs: add support for binary attributes in groups
driver core: device.h: add RW and RO attribute macros
sysfs.h: add BIN_ATTR macro
sysfs.h: add ATTRIBUTE_GROUPS() macro
sysfs.h: add __ATTR_RW() macro
Pull phase two of __cpuinit removal from Paul Gortmaker:
"With the __cpuinit infrastructure removed earlier, this group of
commits only removes the function/data tagging that was done with the
various (now no-op) __cpuinit related prefixes.
Now that the dust has settled with yesterday's v3.11-rc1, there
hopefully shouldn't be any new users leaking back in tree, but I think
we can leave the harmless no-op stubs there for a release as a
courtesy to those who still have out of tree stuff and weren't paying
attention.
Although the commits are against the recent tag to allow for minor
context refreshes for things like yesterday's v3.11-rc1~ slab content,
the patches have been largely unchanged for weeks, aside from such
trivial updates.
For detail junkies, the largely boring and mostly irrelevant history
of the patches can be viewed at:
http://git.kernel.org/cgit/linux/kernel/git/paulg/cpuinit-delete.git
If nothing else, I guess it does at least demonstrate the level of
involvement required to shepherd such a treewide change to completion.
This is the same repository of patches that has been applied to the
end of the daily linux-next branches for the past several weeks"
* 'cpuinit_phase2' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (28 commits)
block: delete __cpuinit usage from all block files
drivers: delete __cpuinit usage from all remaining drivers files
kernel: delete __cpuinit usage from all core kernel files
rcu: delete __cpuinit usage from all rcu files
net: delete __cpuinit usage from all net files
acpi: delete __cpuinit usage from all acpi files
hwmon: delete __cpuinit usage from all hwmon files
cpufreq: delete __cpuinit usage from all cpufreq files
clocksource+irqchip: delete __cpuinit usage from all related files
x86: delete __cpuinit usage from all x86 files
score: delete __cpuinit usage from all score files
xtensa: delete __cpuinit usage from all xtensa files
openrisc: delete __cpuinit usage from all openrisc files
m32r: delete __cpuinit usage from all m32r files
hexagon: delete __cpuinit usage from all hexagon files
frv: delete __cpuinit usage from all frv files
cris: delete __cpuinit usage from all cris files
metag: delete __cpuinit usage from all metag files
tile: delete __cpuinit usage from all tile files
sh: delete __cpuinit usage from all sh files
...
Except for a slightly big OMAP changes, all rest are small, mostly
boring changes; all either 3.11 regression fixes or stable materials.
- ASoC OMAP fixes due to non-DT OMAP4 removals
- Other ASoC driver changes (sglt5000, wm8978, wm8948, samsung)
- Fix missing locking for snd_pcm_stop() calls in many drivers
- Fix the blocking request_module() in OSS sequencer
- Fix old OSS vwsnd driver builds
- Add a new HD-audio HDMI codec ID
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABAgAGBQJR5/PFAAoJEGwxgFQ9KSmkqb0QAJ9pio0iRL2T+N8m4deTAn/E
71suhNyMmqEwSt86UtOO6N/QKGCsUhQA1tXCWKgLE9Dqo4f3S7fWD0cGKDdk53QZ
D3qwjTPvFP05gMSmAkKxPr8C48HTrngSUxdrV5BSU0goRXYa/qqP7H1+jRbIhNv+
1WBXQJ8I0ALqLqs2MCw22snCVp6SOnYzvBn543jcJOf+weDqZ/tWlfgzhZAlIvKJ
OSkL8t+hiEyadjW/jSKm8Xe2Onb7tqZFG/x9ZggLq5HWQm7q261f3785X6EBtAKR
nm8yZZW88JGrVvCRwuE1hJNSptnIhFFYTc+DeQiicYMKT5ixgSG3N99sMYQcsi5J
jSUhLHeg3KeT44s//B+Vd/++lrwz5wcvnCLrcl0LqFaVrdIaZo6UUTMUHFPnl74f
mIqpGgQBH7dcbYUWF4X26Y2qpvbZmNaHmd++0ywVNz7ponue1DKtaGfeiYEFyzDw
BTnLI+sqg/1Y85ydE/04izeNtcS3y+iUw6A8slB+eS6MGl3B05oVZSewJOJhMccy
+pn0Y/BvPNf/IxcHgRQ65GZmbGTW3sWv1syPqYvXRHeiYYv9Ekf55Nr+LKPpFJdj
M9G4BBo56Y+xC8UpHYgcGfzmW8P+rzpc+yuGZC/z6kFO0Y9LL9ABUMOHhk4mPD7d
5/GebLUeG8gowWV8zgVP
=xUZh
-----END PGP SIGNATURE-----
Merge tag 'sound-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Except for a slightly big OMAP changes, all rest are small, mostly
boring changes; all either 3.11 regression fixes or stable materials.
- ASoC OMAP fixes due to non-DT OMAP4 removals
- Other ASoC driver changes (sglt5000, wm8978, wm8948, samsung)
- Fix missing locking for snd_pcm_stop() calls in many drivers
- Fix the blocking request_module() in OSS sequencer
- Fix old OSS vwsnd driver builds
- Add a new HD-audio HDMI codec ID"
* tag 'sound-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (23 commits)
ALSA: seq-oss: Initialize MIDI clients asynchronously
ALSA: hda - Add new GPU codec ID to snd-hda
staging: line6: Fix unlocked snd_pcm_stop() call
[media] saa7134: Fix unlocked snd_pcm_stop() call
ASoC: s6000: Fix unlocked snd_pcm_stop() call
ASoC: atmel: Fix unlocked snd_pcm_stop() call
ALSA: pxa2xx: Fix unlocked snd_pcm_stop() call
ALSA: usx2y: Fix unlocked snd_pcm_stop() call
ALSA: ua101: Fix unlocked snd_pcm_stop() call
ALSA: 6fire: Fix unlocked snd_pcm_stop() call
ALSA: atiixp: Fix unlocked snd_pcm_stop() call
ALSA: asihpi: Fix unlocked snd_pcm_stop() call
sound: oss/vwsnd: Always define vwsnd_mutex
sound: oss/vwsnd: Add missing inclusion of linux/delay.h
ASoC: wm8978: enable symmetric rates
ASoC: omap-mcbsp: Use different method for DMA request when booted with DT
ASoC: omap-dmic: Do not use platform_get_resource_byname() for DMA
ASoC: omap-mcpdm: Do not use platform_get_resource_byname() for DMA
ASoC: omap-pcm: Request the DMA channel differently when DT is involved
ASoC: Samsung: Set RFS and BFS in slave mode
...
Recent change to use bio_copy_data() in raid1 when repairing
an array is faulty.
The underlying may have changed the bio in various ways using
bio_advance and these need to be undone not just for the 'sbio' which
is being copied to, but also the 'pbio' (primary) which is being
copied from.
So perform the reset on all bios that were read from and do it early.
This also ensure that the sbio->bi_io_vec[j].bv_len passed to
memcmp is correct.
This fixes a crash during a 'check' of a RAID1 array. The crash was
introduced in 3.10 so this is suitable for 3.10-stable.
Cc: stable@vger.kernel.org (3.10)
Reported-by: Joe Lawrence <joe.lawrence@stratus.com>
Signed-off-by: NeilBrown <neilb@suse.de>
commit 7ceb17e87b
md: Allow devices to be re-added to a read-only array.
allowed a bit more than just that. It also allows devices to be added
to a read-write array and to end up skipping recovery.
This patch removes the offending piece of code pending a rewrite for a
subsequent release.
More specifically:
If the array has a bitmap, then the device will still need a bitmap
based resync ('saved_raid_disk' is set under different conditions
is a bitmap is present).
If the array doesn't have a bitmap, then this is correct as long as
nothing has been written to the array since the metadata was checked
by ->validate_super. However there is no locking to ensure that there
was no write.
Bug was introduced in 3.10 and causes data corruption so
patch is suitable for 3.10-stable.
Cc: stable@vger.kernel.org (3.10)
Reported-by: Joe Lawrence <joe.lawrence@stratus.com>
Signed-off-by: NeilBrown <neilb@suse.de>
1/ When an different between blocks is found, data is copied from
one bio to the other. However bv_len is used as the length to
copy and this could be zero. So use r10_bio->sectors to calculate
length instead.
Using bv_len was probably always a bit dubious, but the introduction
of bio_advance made it much more likely to be a problem.
2/ When preparing some blocks for sync, we don't set BIO_UPTODATE
except on bios that we schedule for a read. This ensures that
missing/failed devices don't confuse the loop at the top of
sync_request write.
Commit 8be185f2c9 "raid10: Use bio_reset()"
removed a loop which set BIO_UPTDATE on all appropriate bios.
So we need to re-add that flag.
These bugs were introduced in 3.10, so this patch is suitable for
3.10-stable, and can remove a potential for data corruption.
Cc: stable@vger.kernel.org (3.10)
Reported-by: Brassow Jonathan <jbrassow@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
more DPM fixes for radeon.
* 'drm-fixes-3.11' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon/dpm: add debugfs support for RS780/RS880 (v3)
drm/radeon/dpm/atom: fix broken gcc harder
drm/radeon/dpm/atom: restructure logic to work around a compiler bug
drm/radeon/dpm: fix atom vram table parsing
drm/radeon: fix an endian bug in atom table parsing
drm/radeon: add a module parameter to disable aspm
This allows you to look at the current DPM state via
debugfs.
Due to the way the hardware works on these asics, there's
no way to look up exactly what power state we are in, so
we make the best guess we can based on the current sclk.
v2: Anthoine's version
v3: fix ref div
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pull nfsd bugfixes from Bruce Fields:
"Just three minor bugfixes"
* 'for-3.11' of git://linux-nfs.org/~bfields/linux:
svcrdma: underflow issue in decode_write_list()
nfsd4: fix minorversion support interface
lockd: protect nlm_blocked access in nlmsvc_retry_blocked