In skb_flow_dissect(), we perform a dissection of a skbuff. Since we're
doing the work here anyway, also store thoff for a later usage, e.g. in
the BPF filter.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Users of udp encapsulation currently have an encap_rcv callback which they can
use to hook into the udp receive path.
In situations where a encapsulation user allocates resources associated with a
udp encap socket, it may be convenient to be able to also hook the proto
.destroy operation. For example, if an encap user holds a reference to the
udp socket, the destroy hook might be used to relinquish this reference.
This patch adds a socket destroy hook into udp, which is set and enabled
in the same way as the existing encap_rcv hook.
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
The following patchset contains 7 Netfilter/IPVS fixes for 3.9-rc, they are:
* Restrict IPv6 stateless NPT targets to the mangle table. Many users are
complaining that this target does not work in the nat table, which is the
wrong table for it, from Florian Westphal.
* Fix possible use before initialization in the netns init path of several
conntrack protocol trackers (introduced recently while improving conntrack
netns support), from Gao Feng.
* Fix incorrect initialization of copy_range in nfnetlink_queue, spotted
by Eric Dumazet during the NFWS2013, patch from myself.
* Fix wrong calculation of next SCTP chunk in IPVS, from Julian Anastasov.
* Remove rcu_read_lock section in IPVS while calling ipv4_update_pmtu
not required anymore after change introduced in 3.7, again from Julian.
* Fix SYN looping in IPVS state sync if the backup is used a real server
in DR/TUN modes, this required a new /proc entry to disable the director
function when acting as backup, also from Julian.
* Remove leftover IP_NF_QUEUE Kconfig after ip_queue removal, noted by
Paul Bolle.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Changes:
v3->v2: rebase (no other changes)
passes selftest
v2->v1: read f->num_members only once
fix bug: test rollover mode + flag
Minimize packet drop in a fanout group. If one socket is full,
roll over packets to another from the group. Maintain flow
affinity during normal load using an rxhash fanout policy, while
dispersing unexpected traffic storms that hit a single cpu, such
as spoofed-source DoS flows. Rollover breaks affinity for flows
arriving at saturated sockets during those conditions.
The patch adds a fanout policy ROLLOVER that rotates between sockets,
filling each socket before moving to the next. It also adds a fanout
flag ROLLOVER. If passed along with any other fanout policy, the
primary policy is applied until the chosen socket is full. Then,
rollover selects another socket, to delay packet drop until the
entire system is saturated.
Probing sockets is not free. Selecting the last used socket, as
rollover does, is a greedy approach that maximizes chance of
success, at the cost of extreme load imbalance. In practice, with
sufficiently long queues to absorb bursts, sockets are drained in
parallel and load balance looks uniform in `top`.
To avoid contention, scales counters with number of sockets and
accesses them lockfree. Values are bounds checked to ensure
correctness.
Tested using an application with 9 threads pinned to CPUs, one socket
per thread and sufficient busywork per packet operation to limits each
thread to handling 32 Kpps. When sent 500 Kpps single UDP stream
packets, a FANOUT_CPU setup processes 32 Kpps in total without this
patch, 270 Kpps with the patch. Tested with read() and with a packet
ring (V1).
Also, passes psock_fanout.c unit test added to selftests.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Fix ARM BPF JIT handling of negative 'k' values, from Chen Gang.
2) Insufficient space reserved for bridge netlink values, fix from
Stephen Hemminger.
3) Some dst_neigh_lookup*() callers don't interpret error pointer
correctly, fix from Zhouyi Zhou.
4) Fix transport match in SCTP active_path loops, from Xugeng Zhang.
5) Fix qeth driver handling of multi-order SKB frags, from Frank
Blaschka.
6) fec driver is missing napi_disable() call, resulting in crashes on
unload, from Georg Hofmann.
7) Don't try to handle PMTU events on a listening socket, fix from Eric
Dumazet.
8) Fix timestamp location calculations in IP option processing, from
David Ward.
9) FIB_TABLE_HASHSZ setting is not controlled by the correct kconfig
tests, from Denis V Lunev.
10) Fix TX descriptor push handling in SFC driver, from Ben Hutchings.
11) Fix isdn/hisax and tulip/de4x5 kconfig dependencies, from Arnd
Bergmann.
12) bnx2x statistics don't handle 4GB rollover correctly, fix from
Maciej Żenczykowski.
13) Openvswitch bug fixes for vport del/new error reporting, missing
genlmsg_end() call in netlink processing, and mis-parsing of
LLC/SNAP ethernet types. From Rich Lane.
14) SKB pfmemalloc state should only be propagated from the head page of
a compound page, fix from Pavel Emelyanov.
15) Fix link handling in tg3 driver for 5715 chips when autonegotation
is disabled. From Nithin Sujir.
16) Fix inverted test of cpdma_check_free_tx_desc return value in
davinci_emac driver, from Mugunthan V N.
17) vlan_depth is incorrectly calculated in skb_network_protocol(), from
Li RongQing.
18) Fix probing of Gobi 1K devices in qmi_wwan driver, and fix NCM
device mode backwards compat in cdc_ncm driver. From Bjørn Mork.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
inet: limit length of fragment queue hash table bucket lists
qeth: Fix scatter-gather regression
qeth: Fix invalid router settings handling
qeth: delay feature trace
tcp: dont handle MTU reduction on LISTEN socket
bnx2x: fix occasional statistics off-by-4GB error
vhost/net: fix heads usage of ubuf_info
bridge: Add support for setting BR_ROOT_BLOCK flag.
bnx2x: add missing napi deletion in error path
drivers: net: ethernet: ti: davinci_emac: fix usage of cpdma_check_free_tx_desc()
ethernet/tulip: DE4x5 needs VIRT_TO_BUS
isdn: hisax: netjet requires VIRT_TO_BUS
net: cdc_ncm, cdc_mbim: allow user to prefer NCM for backwards compatibility
rtnetlink: Mask the rta_type when range checking
Revert "ip_gre: make ipgre_tunnel_xmit() not parse network header as IP unconditionally"
Fix dst_neigh_lookup/dst_neigh_lookup_skb return value handling bug
smsc75xx: configuration help incorrectly mentions smsc95xx
net: fec: fix missing napi_disable call
net: fec: restart the FEC when PHY speed changes
skb: Propagate pfmemalloc on skb from head page only
...
This patch introduces a constant limit of the fragment queue hash
table bucket list lengths. Currently the limit 128 is choosen somewhat
arbitrary and just ensures that we can fill up the fragment cache with
empty packets up to the default ip_frag_high_thresh limits. It should
just protect from list iteration eating considerable amounts of cpu.
If we reach the maximum length in one hash bucket a warning is printed.
This is implemented on the caller side of inet_frag_find to distinguish
between the different users of inet_fragment.c.
I dropped the out of memory warning in the ipv4 fragment lookup path,
because we already get a warning by the slab allocator.
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jesper Dangaard Brouer <jbrouer@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Akindinov is reporting for a problem where SYNs are looping
between the master and backup server when the backup server is used as
real server in DR mode and has IPVS rules to function as director.
Even when the backup function is enabled we continue to forward
traffic and schedule new connections when the current master is using
the backup server as real server. While this is not a problem for NAT,
for DR and TUN method the backup server can not determine if a request
comes from client or from director.
To avoid such loops add new sysctl flag backup_only. It can be needed
for DR/TUN setups that do not need backup and director function at the
same time. When the backup function is enabled we stop any forwarding
and pass the traffic to the local stack (real server mode). The flag
disables the director function when the backup function is enabled.
For setups that enable backup function for some virtual services and
director function for other virtual services there should be another
more complex solution to support DR/TUN mode, may be to assign
per-virtual service syncid value, so that we can differentiate the
requests.
Reported-by: Dmitry Akindinov <dimak@stalker.com>
Tested-by: German Myzovsky <lawyer@sipnet.ru>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
This fixes a couple of problems. Firstly, some people are actually still
using old small-page flash and we broke it by removing the ready check.
Secondly. fix the handling of partitions on Broadcom 47xx devices.
Recent changes had made it misdetect the location of the NVRAM and
scribble over the bootloader when it tried to update the variables there.
With predictably sad results.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iEYEABECAAYFAlFG/koACgkQdwG7hYl686OShQCgyOXCXhTltBJU3wHDtTeT1rc9
Vx0AoL655JpZaUf+BpPsCSTagH4k28vQ
=OK6b
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20130318' of git://git.infradead.org/linux-mtd
Pull MTD fixes from David Woodhouse:
"This fixes a couple of problems. Firstly, some people are actually
still using old small-page flash and we broke it by removing the ready
check.
Secondly. fix the handling of partitions on Broadcom 47xx devices.
Recent changes had made it misdetect the location of the NVRAM and
scribble over the bootloader when it tried to update the variables
there. With predictably sad results."
* tag 'for-linus-20130318' of git://git.infradead.org/linux-mtd:
mtd: nand: reintroduce NAND_NO_READRDY as NAND_NEED_READRDY
mtd: bcm47xxpart: look for NVRAM at the end of device
Revert "mtd: bcm47xxpart: improve probing of nvram partition"
Things are calming down for arm-soc as well. This set of bug fixes is
dominated in size by the at91 platform bug fixes. Some of them were
meant to go through the framebuffer tree during the merge window, but
since the framebuffer maintainer could not be reached, I offered to
take them here. The other notable at91 change is the addition of pinctrl
definitions to fix the NAND controller.
The rest are mostly simple regression fixes:
* Our removal of VIRT_TO_BUS conflicted with Stephen Rothwell's
renaming of the Kconfig symbol. You will get a trivial merge conflict
here, we still want to remove it.
* missing bits for clocks on imx and s5pv210
* missing header inclusions in mmp and shmobile
* typos in s5pv210 camera and vt8500 clock support code
and three trivial fixes for pre-3.8 bugs:
* an old bogus build warning in the joystick driver
* a misleading Kconfig description
* a NULL pointer check on davinci
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIVAwUAUUchPmCrR//JCVInAQLxHhAA4bbv0+aS3vhEV8sMomBQ7XpjlI2wJ5wy
cd2jA04Gb54bQlRkZNuflIHH5xYq9bslR98Y3iEMqPHrxheDV5qgfZ9wO1E5b8wd
bl/Fj1bj7D7AeQpvhAYHZufQnV4xGSpW7j/6hkEWCDDgla82BaEwQq3aVCqFsZu5
u41xlWCFYbwS+sEcdALnGmFdEBtNHzsfwkY7AClcunARWcFTyIAm5J2VhO/1Z3eY
sA31DBizTsxhkfgOEXTDvyH1N3YwcGlm3Mb7J0ZfdU5d5QQlthmU1ims2fVPoo3t
x1rJNb5HARsJuuuFIgoRa/Vbcytqxj2+MhJGy2cLhsmAxr8L61cb618oniZxxDoW
y4DMurF790q3uSkJOrhtcAmGBmHNBdTHcvV4U05EYIQl64k/oY+L7IB18ACAHVqO
LwimbZ+KF1kxv/hVosGbu7l0EKDt7MS4ykc5QJAtiYu7RDikoRmH05742feWfQ+2
Fy6V1GqIyUCea1cWDjomeTx+lERknSWPweesrcyiRhIs2BsqrtDRDngse/S59Lf9
mUFiLh+tZqZxTh8HqZbnHbuJoqNvfVyZVYWrvifkH0Ji8VZqeLuzxx/8fBvnCDWz
tXZOkl4m2U4lVYzkYOLN9VAurEHSYcHOw51IIgQp4IfS3U32sA1a4/fF/ATq0ugP
tdJBtr7mpzA=
=oLKI
-----END PGP SIGNATURE-----
Merge tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC bug fixes from Arnd Bergmann:
"Things are calming down for arm-soc as well. This set of bug fixes is
dominated in size by the at91 platform bug fixes. Some of them were
meant to go through the framebuffer tree during the merge window, but
since the framebuffer maintainer could not be reached, I offered to
take them here. The other notable at91 change is the addition of
pinctrl definitions to fix the NAND controller.
The rest are mostly simple regression fixes:
- Our removal of VIRT_TO_BUS conflicted with Stephen Rothwell's
renaming of the Kconfig symbol. You will get a trivial merge
conflict here, we still want to remove it.
- missing bits for clocks on imx and s5pv210
- missing header inclusions in mmp and shmobile
- typos in s5pv210 camera and vt8500 clock support code
and three trivial fixes for pre-3.8 bugs:
- an old bogus build warning in the joystick driver
- a misleading Kconfig description
- a NULL pointer check on davinci"
* tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: fix CONFIG_VIRT_TO_BUS handling
ARM: i.MX35: enable MAX clock
ARM: Scorpion is a v7 architecture, not v6
ARM: mmp: add platform_device head file in gplugd
input/joystick: use get_cycles on ARM
[media] s5p-fimc: fix s5pv210 build
clk: vt8500: Fix "fix device clock divisor calculations"
ARM: i.MX25: Fix DT compilation
ARM: at91: fix infinite loop in at91_irq_suspend/resume
ARM: at91: add gpio suspend/resume support when using pinctrl
ARM: at91: fix LCD-wiring mode
atmel_lcdfb: fix 16-bpp modes on older SOCs
ARM: at91: dt: at91sam9x5: complete NAND pinctrl
ARM: at91: dt: at91sam9x5: correct NAND pins comments
ARM: davinci: edma: fix dmaengine induced null pointer dereference on da830
ARM: shmobile: marzen: Include mmc/host.h
ARM: EXYNOS: Add #dma-cells for generic dma binding support for PL330
ARM: S5PV210: Fix PL330 DMA controller clkdev entries
Commit 1d9d8639c0 ("perf,x86: fix kernel crash with PEBS/BTS after
suspend/resume") introduces a link failure since
perf_restore_debug_store() is only defined for CONFIG_CPU_SUP_INTEL:
arch/x86/power/built-in.o: In function `restore_processor_state':
(.text+0x45c): undefined reference to `perf_restore_debug_store'
Fix it by defining the dummy function appropriately.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
TCPCT uses option-number 253, reserved for experimental use and should
not be used in production environments.
Further, TCPCT does not fully implement RFC 6013.
As a nice side-effect, removing TCPCT increases TCP's performance for
very short flows:
Doing an apache-benchmark with -c 100 -n 100000, sending HTTP-requests
for files of 1KB size.
before this patch:
average (among 7 runs) of 20845.5 Requests/Second
after:
average (among 7 runs) of 21403.6 Requests/Second
Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Pravin B Shelar <pshelar@nicira.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
net/openvswitch/vport-internal_dev.c
Jesse Gross says:
====================
A couple of minor enhancements for net-next/3.10. The largest is an
extension to allow variable length metadata to be passed to userspace
with packets.
There is a merge conflict in net/openvswitch/vport-internal_dev.c:
A existing commit modifies internal_dev_mac_addr() and a new commit
deletes it. The new one is correct, so you can just remove that function.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch generalizes VXLAN forwarding table entries allowing an administrator
to:
1) specify multiple destinations for a given MAC
2) specify alternate vni's in the VXLAN header
3) specify alternate destination UDP ports
4) use multicast MAC addresses as fdb lookup keys
5) specify multicast destinations
6) specify the outgoing interface for forwarded packets
The combination allows configuration of more complex topologies using VXLAN
encapsulation.
Changes since v1: rebase to 3.9.0-rc2
Signed-Off-By: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
caif_shm is an old implementation
caif_shm will be replaced by caif_virtio
[ As explained by Linus Walleij: "U5500 used this, but was cancelled
and the silicon did not reach anyone outside ST-Ericsson. Then for
the next platforms, we have gone for the leaner & cleaner approach
of using virtio, rpmesg and rproc." ]
Signed-off-by: Erwan Yvin <erwan.yvin@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Sjur Brendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit bd329e1 ("net: cdc_ncm: do not bind to NCM compatible MBIM devices")
introduced a new policy, preferring MBIM for dual NCM/MBIM functions if
the cdc_mbim driver was enabled. This caused a regression for users
wanting to use NCM.
Devices implementing NCM backwards compatibility according to section
3.2 of the MBIM v1.0 specification allow either NCM or MBIM on a single
USB function, using different altsettings. The cdc_ncm and cdc_mbim
drivers will both probe such functions, and must agree on a common
policy for selecting either MBIM or NCM. Until now, this policy has
been set at build time based on CONFIG_USB_NET_CDC_MBIM.
Use a module parameter to set the system policy at runtime, allowing the
user to prefer NCM on systems with the cdc_mbim driver.
Cc: Greg Suarez <gsuarez@smithmicro.com>
Cc: Alexey Orishko <alexey.orishko@stericsson.com>
Reported-by: Geir Haatveit <nospam@haatveit.nu>
Reported-by: Tommi Kyntola <kynde@ts.ray.fi>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=54791
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
* The GPIO descriptor work has exposed how broken the non-GPIOLIB
bits for OpenRISC were. We now require GPIOLIB as this is the
preferred way forward.
* The system.h split introduced a bug in llist.h for arches using
asm-generic/cmpxchg.h directly, which is currently only OpenRISC.
The patch here moves two defines from asm-generic/atomic.h to
asm-generic/cmpxchg.h to make things work as they should.
* The VIRT_TO_BUS selector was added for OpenRISC, but OpenRISC does
not have the virt_to_bus methods, so there's a patch to remove it
again.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iEYEABECAAYFAlFCu28ACgkQ70gcjN2673OzpwCfdv7JYVybaCXWHhmcvGIpyrjU
ed0AnAuz0rGFqyN7vSAxmHtusUnJVEsT
=Lddj
-----END PGP SIGNATURE-----
Merge tag 'for-3.9-rc3' of git://openrisc.net/jonas/linux
Pull OpenRISC bug fixes from Jonas Bonn:
- The GPIO descriptor work has exposed how broken the non-GPIOLIB bits
for OpenRISC were. We now require GPIOLIB as this is the preferred
way forward.
- The system.h split introduced a bug in llist.h for arches using
asm-generic/cmpxchg.h directly, which is currently only OpenRISC.
The patch here moves two defines from asm-generic/atomic.h to
asm-generic/cmpxchg.h to make things work as they should.
- The VIRT_TO_BUS selector was added for OpenRISC, but OpenRISC does
not have the virt_to_bus methods, so there's a patch to remove it
again.
* tag 'for-3.9-rc3' of git://openrisc.net/jonas/linux:
openrisc: remove HAVE_VIRT_TO_BUS
asm-generic: move cmpxchg*_local defs to cmpxchg.h
openrisc: require gpiolib
With this one we have:
- An ab8500 build failure fix.
- An ab8500 device tree parsing fix.
- A fix for twl4030_madc remove routine to work properly (when built-in).
- A fix for properly registering palmas interrupt handler.
- A fix for omap-usb init routine to actually write into the hostconfig
register.
- A couple of warning fixes for ab8500-gpadc and tps65912.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJRQs2IAAoJEIqAPN1PVmxKN6sQAISiEmSjOiLRZ3b13qeKGiMX
3F3W5qXGYJQzhiVU/unPK9/2rTdfJ0Mqnuj88bLY+336pr89vi8g0k2VBWPfHlcv
jv2mJKWuNUM15D0d6uZ3t27jrXXBzNEvMjKKen2kYfm1+JcR+1N8pbJUZtOlgREJ
2Z23or/XP+ZCSbXcvzH1KvRFrcaBDedqgT4m0nat2dTfZ77MpKZEA58sutNBOUMa
YfoE2ncJBq5Ku4LBo6wGhsVptzilmpH3YBnSEgTh7tccyLrt4SnodNdT3vxmU4+1
C0r/oX9Idcij4/VAW/2bcEq1E12YUeKnPeVDAARXc+X/1Oyw+Mc6Cqd/rnvFomfe
9uTGMOmH1me1mSUgrzFNM1nnKXaZkWti56rLEe8tLsN7tkc+oWSJPIlWwjCmeWkC
R5SPgR5Jf6QeA4mW9qMxnBM3y8YlaV4awW6wwVYXGUDICZMG04qURSLo9NX3kDAA
AHsbE+IiSxdDqmZr8aejhaGd7PIVjfPhnM25TNlUZ5P5EWlApl4ha7QZ/BK2fx6k
DqaRujfB2uGcQAwqTprVpDI6lHUEneUKaa3zDpoxkP4vPVXDsRIiBo7HRfY8QXjg
vMWrYfLvAYBXHx69I5q2DghyFmofu/H0SXDdGlo33cQ5SfzeNGS7h0pPGrkNJyvN
7WJbiOdAYeQXVlBCy7sB
=XFgy
-----END PGP SIGNATURE-----
Merge tag 'mfd-fixes-3.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes
Pull MFD fixes from Samuel Ortiz:
"This is the first batch of MFD fixes for 3.9.
With this one we have:
- An ab8500 build failure fix.
- An ab8500 device tree parsing fix.
- A fix for twl4030_madc remove routine to work properly (when
built-in).
- A fix for properly registering palmas interrupt handler.
- A fix for omap-usb init routine to actually write into the
hostconfig register.
- A couple of warning fixes for ab8500-gpadc and tps65912"
* tag 'mfd-fixes-3.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes:
mfd: twl4030-madc: Remove __exit_p annotation
mfd: ab8500: Kill "reg" property from binding
mfd: ab8500-gpadc: Complain if we fail to enable vtvout LDO
mfd: wm831x: Don't forward declare enum wm831x_auxadc
mfd: twl4030-audio: Fix argument type for twl4030_audio_disable_resource()
mfd: tps65912: Declare and use tps65912_irq_exit()
mfd: palmas: Provide irq flags through DT/platform data
mfd: Make AB8500_CORE select POWER_SUPPLY to fix build error
mfd: omap-usb-host: Actually update hostconfig
This patch fixes a kernel crash when using precise sampling (PEBS)
after a suspend/resume. Turns out the CPU notifier code is not invoked
on CPU0 (BP). Therefore, the DS_AREA (used by PEBS) is not restored properly
by the kernel and keeps it power-on/resume value of 0 causing any PEBS
measurement to crash when running on CPU0.
The workaround is to add a hook in the actual resume code to restore
the DS Area MSR value. It is invoked for all CPUS. So for all but CPU0,
the DS_AREA will be restored twice but this is harmless.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When neighbour table is full, dst_neigh_lookup/dst_neigh_lookup_skb will return
-ENOBUFS which is absolutely non zero, while all the code in kernel which use
above functions assume failure only on zero return which will cause panic. (for
example: : https://bugzilla.kernel.org/show_bug.cgi?id=54731).
This patch corrects above error with smallest changes to kernel source code and
also correct two return value check missing bugs in drivers/infiniband/hw/cxgb4/cm.c
Tested on my x86_64 SMP machine
Reported-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Tested-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
COMPAT_NET_DEV_OPS was removed a while back and with it the definition of
netdev_resync_ops() went away. Let's finish the clean-up.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch allows LRO aggregation on bonded devices that contain an
NX3031 device. It also adds a for_each_netdev_in_bond_rcu(bond, slave)
macro which executes for each slave that has bond as master.
V3: After testing and discussing this with Rajesh, I decided to keep the
vlan ip cache and just rename it to ip_cache since it will store bond
ip addresses too. A new master flag has been added to the ip cache to
denote that the address has been added because of a master device.
I've taken care of the enslave/release cases by checking for various
combinations of events and flags (e.g. netxen has a master, it's a
bond master and it's not marked as a slave means it is being enslaved
and is dev_open()ed in bond_enslave).
I've changed netxen_free_ip_list() to have a "master" parameter which
causes all IP addresses marked as master to be deleted (used when a
netxen is being released). I've made the patch use the new upper
device API as well. The following cases were tested:
- bond -> netxen
- vlan -> netxen
- vlan -> bond -> netxen
V2: Remove local ip caching, retrieve addresses dynamically and
restore them if necessary.
Note: Tested with NX3031 adapter.
Tested-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: Andy Gospodarek <agospoda@redhat.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull fix for hlist_entry_safe() regression from Paul McKenney:
"This contains a single commit that fixes a regression in
hlist_entry_safe(). This macro references its argument twice, which
can cause NULL-pointer errors. This commit applies a gcc statement
expression, creating a temporary variable to avoid the double
reference. This has been posted to LKML at
https://lkml.org/lkml/2013/3/9/75.
Kudos to CAI Qian, whose testing uncovered this, to Eric Dumazet, who
spotted root cause, and to Li Zefan, who tested this commit."
* 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
list: Fix double fetch of pointer in hlist_entry_safe()
The current version of hlist_entry_safe() fetches the pointer twice,
once to test for NULL and the other to compute the offset back to the
enclosing structure. This is OK for normal lock-based use because in
that case, the pointer cannot change. However, when the pointer is
protected by RCU (as in "rcu_dereference(p)"), then the pointer can
change at any time. This use case can result in the following sequence
of events:
1. CPU 0 invokes hlist_entry_safe(), fetches the RCU-protected
pointer as sees that it is non-NULL.
2. CPU 1 invokes hlist_del_rcu(), deleting the entry that CPU 0
just fetched a pointer to. Because this is the last entry
in the list, the pointer fetched by CPU 0 is now NULL.
3. CPU 0 refetches the pointer, obtains NULL, and then gets a
NULL-pointer crash.
This commit therefore applies gcc's "({ })" statement expression to
create a temporary variable so that the specified pointer is fetched
only once, avoiding the above sequence of events. Please note that
it is the caller's responsibility to use rcu_dereference() as needed.
This allows RCU-protected uses to work correctly without imposing
any additional overhead on the non-RCU case.
Many thanks to Eric Dumazet for spotting root cause!
Reported-by: CAI Qian <caiqian@redhat.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Li Zefan <lizefan@huawei.com>
Hi.
I'm trying to send big chunks of memory from application address space via
TCP socket using vmsplice + splice like this
mem = mmap(128Mb);
vmsplice(pipe[1], mem); /* splice memory into pipe */
splice(pipe[0], tcp_socket); /* send it into network */
When I'm lucky and a huge page splices into the pipe and then into the socket
_and_ client and server ends of the TCP connection are on the same host,
communicating via lo, the whole connection gets stuck! The sending queue
becomes full and app stops writing/splicing more into it, but the receiving
queue remains empty, and that's why.
The __skb_fill_page_desc observes a tail page of a huge page and erroneously
propagates its page->pfmemalloc value onto socket (the pfmemalloc on tail pages
contain garbage). Then this skb->pfmemalloc leaks through lo and due to the
tcp_v4_rcv
sk_filter
if (skb->pfmemalloc && !sock_flag(sk, SOCK_MEMALLOC)) /* true */
return -ENOMEM
goto release_and_discard;
no packets reach the socket. Even TCP re-transmits are dropped by this, as skb
cloning clones the pfmemalloc flag as well.
That said, here's the proper page->pfmemalloc propagation onto socket: we
must check the huge-page's head page only, other pages' pfmemalloc and mapping
values do not contain what is expected in this place. However, I'm not sure
whether this fix is _complete_, since pfmemalloc propagation via lo also
oesn't look great.
Both, bit propagation from page to skb and this check in sk_filter, were
introduced by c48a11c7 (netvm: propagate page->pfmemalloc to skb), in v3.5 so
Mel and stable@ are in Cc.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chrome OS team reported a crash on a Pixel ChromeBook in TCP stack :
https://code.google.com/p/chromium/issues/detail?id=182056
commit a21d45726a (tcp: avoid order-1 allocations on wifi and tx
path) did a poor choice adding an 'avail_size' field to skb, while
what we really needed was a 'reserved_tailroom' one.
It would have avoided commit 22b4a4f22d (tcp: fix retransmit of
partially acked frames) and this commit.
Crash occurs because skb_split() is not aware of the 'avail_size'
management (and should not be aware)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Mukesh Agrawal <quiche@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This partially reverts commit 1696e6bc2a
("mtd: nand: kill NAND_NO_READRDY").
In that patch I overlooked a few things.
The original documentation for NAND_NO_READRDY included "True for all
large page devices, as they do not support autoincrement." I was
conflating "not support autoincrement" with the NAND_NO_AUTOINCR option,
which was in fact doing nothing. So, when I dropped NAND_NO_AUTOINCR, I
concluded that I then could harmlessly drop NAND_NO_READRDY. But of
course the fact the NAND_NO_AUTOINCR was doing nothing didn't mean
NAND_NO_READRDY was doing nothing...
So, NAND_NO_READRDY is re-introduced as NAND_NEED_READRDY and applied
only to those few remaining small-page NAND which needed it in the first
place.
Cc: stable@kernel.org [3.5+]
Reported-by: Alexander Shiyan <shc_work@mail.ru>
Tested-by: Alexander Shiyan <shc_work@mail.ru>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Merge misc fixes from Andrew Morton:
- A bunch of fixes
- Finish off the idr API conversions before someone starts to use the
old interfaces again.
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
idr: idr_alloc() shouldn't trigger lowmem warning when preloaded
UAPI: fix endianness conditionals in M32R's asm/stat.h
UAPI: fix endianness conditionals in linux/raid/md_p.h
UAPI: fix endianness conditionals in linux/acct.h
UAPI: fix endianness conditionals in linux/aio_abi.h
decompressors: fix typo "POWERPC"
mm/fremap.c: fix oops on error path
idr: deprecate idr_pre_get() and idr_get_new[_above]()
tidspbridge: convert to idr_alloc()
zcache: convert to idr_alloc()
mlx4: remove leftover idr_pre_get() call
workqueue: convert to idr_alloc()
nfsd: convert to idr_alloc()
nfsd: remove unused get_new_stid()
kernel/signal.c: use __ARCH_HAS_SA_RESTORER instead of SA_RESTORER
signal: always clear sa_restorer on execve
mm: remove_memory(): fix end_pfn setting
include/linux/res_counter.h needs errno.h
In the UAPI header files, __BIG_ENDIAN and __LITTLE_ENDIAN must be
compared against __BYTE_ORDER in preprocessor conditionals where these are
exposed to userspace (that is they're not inside __KERNEL__ conditionals).
However, in the main kernel the norm is to check for
"defined(__XXX_ENDIAN)" rather than comparing against __BYTE_ORDER and
this has incorrectly leaked into the userspace headers.
The definition of struct mdp_superblock_s in linux/raid/md_p.h is wrong in
this way. Note that userspace will likely interpret the ordering of the
fields incorrectly as the big-endian variant on a little-endian machines -
depending on header inclusion order.
[!!!] NOTE [!!!] This patch may adversely change the userspace API. It might
be better to fix the ordering of events_hi, events_lo, cp_events_hi and
cp_events_lo in struct mdp_superblock_s / typedef mdp_super_t.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In the UAPI header files, __BIG_ENDIAN and __LITTLE_ENDIAN must be
compared against __BYTE_ORDER in preprocessor conditionals where these are
exposed to userspace (that is they're not inside __KERNEL__ conditionals).
However, in the main kernel the norm is to check for
"defined(__XXX_ENDIAN)" rather than comparing against __BYTE_ORDER and
this has incorrectly leaked into the userspace headers.
The definition of ACCT_BYTEORDER in linux/acct.h is wrong in this way.
Note that userspace will likely interpret this incorrectly as the
big-endian variant on little-endian machines - depending on header
inclusion order.
[!!!] NOTE [!!!] This patch may adversely change the userspace API. It might
be better to fix the value of ACCT_BYTEORDER.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In the UAPI header files, __BIG_ENDIAN and __LITTLE_ENDIAN must be
compared against __BYTE_ORDER in preprocessor conditionals where these are
exposed to userspace (that is they're not inside __KERNEL__ conditionals).
However, in the main kernel the norm is to check for
"defined(__XXX_ENDIAN)" rather than comparing against __BYTE_ORDER and
this has incorrectly leaked into the userspace headers.
The definition of PADDED() in linux/aio_abi.h is wrong in this way. Note
that userspace will likely interpret this and thus the order of fields in
struct iocb incorrectly as the little-endian variant on big-endian
machines - depending on header inclusion order.
[!!!] NOTE [!!!] This patch may adversely change the userspace API. It might
be better to fix the ordering of aio_key and aio_reserved1 in struct iocb.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Benjamin LaHaise <bcrl@kvack.org>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that all in-kernel users are converted to ues the new alloc
interface, mark the old interface deprecated. We should be able to
remove these in a few releases.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
alpha allmodconfig:
In file included from mm/memcontrol.c:28:
include/linux/res_counter.h: In function 'res_counter_set_limit':
include/linux/res_counter.h:203: error: 'EBUSY' undeclared (first use in this function)
include/linux/res_counter.h:203: error: (Each undeclared identifier is reported only once
include/linux/res_counter.h:203: error: for each function it appears in.)
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Here are a number of tiny USB fixes and new USB device ids for your 3.9
tree.
The "largest" one here is a revert of a usb-storage patch that turned
out to be incorrect, breaking existing users, which is never a good
thing. Everything else is pretty simple and small
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iEYEABECAAYFAlFA3rgACgkQMUfUDdst+ylSmQCfdQGXwi/1JX0099FKsnt4dcXY
SbMAn1GWSwYPo1Uk5joKJpNh412PMnXZ
=kNF7
-----END PGP SIGNATURE-----
Merge tag 'usb-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg Kroah-Hartman:
"Here are a number of tiny USB fixes and new USB device ids for your
3.9 tree.
The "largest" one here is a revert of a usb-storage patch that turned
out to be incorrect, breaking existing users, which is never a good
thing. Everything else is pretty simple and small"
* tag 'usb-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (43 commits)
USB: quatech2: only write to the tty if the port is open.
qcserial: bind to DM/DIAG port on Gobi 1K devices
USB: cdc-wdm: fix buffer overflow
usb: serial: Add Rigblaster Advantage to device table
qcaux: add Franklin U600
usb: musb: core: fix possible build error with randconfig
usb: cp210x new Vendor/Device IDs
usb: gadget: pxa25x: fix disconnect reporting
usb: dwc3: ep0: fix sparc64 build
usb: c67x00 RetryCnt value in c67x00 TD should be 3
usb: Correction to c67x00 TD data length mask
usb: Makefile: fix drivers/usb/phy/ Makefile entry
USB: added support for Cinterion's products AH6 and PLS8
usb: gadget: fix omap_udc build errors
USB: storage: fix Huawei mode switching regression
USB: storage: in-kernel modeswitching is deprecated
tools: usb: ffs-test: Fix build failure
USB: option: add Huawei E5331
usb: musb: omap2430: fix sparse warning
usb: musb: omap2430: fix omap_musb_mailbox glue check again
...
Here are some tty/serial driver fixes for 3.9
We finally mute the annoying WARN_ON that lots of people are hitting and
it turns out isn't needed anymore. Also add a few new device ids and a
some other minor fixes.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iEYEABECAAYFAlFA30EACgkQMUfUDdst+ylTwACfTdhZhFWStj/3jlIDzdxz6Bjv
cwkAn2c4TomoV5wTu9pvRBkEF4WsDJTS
=98sU
-----END PGP SIGNATURE-----
Merge tag 'tty-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg Kroah-Hartman:
"Here are some tty/serial driver fixes for 3.9
We finally mute the annoying WARN_ON that lots of people are hitting
and it turns out isn't needed anymore. Also add a few new device ids
and a some other minor fixes."
* tag 'tty-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty: serial: fix typo "SERIAL_S3C2412"
serial: 8250: Keep 8250.<xxxx> module options functional after driver rename
tty: serial: fix typo "ARCH_S5P6450"
tty/8250_pnp: serial port detection regression since v3.7
serial: bcm63xx_uart: fix compilation after "TTY: switch tty_insert_flip_char"
serial: 8250_pci: add support for another kind of NetMos Technology PCI 9835 Multi-I/O Controller
Fix 4 port and add support for 8 port 'Unknown' PCI serial port cards
tty/serial: Add support for Altera serial port
tty: serial: vt8500: Unneccessary duplicated clock code removed
tty: serial: mpc5xxx: fix PSC clock name bug
TTY: disable debugging warning
Here are some drivers/staging and drivers/iio fixes for 3.9 (the two are
still pretty intertwined, hence them coming both from my tree still.)
Nothing major, just a few things that have been reported by users, all
of these have been in linux-next for a while.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iEYEABECAAYFAlFA3+8ACgkQMUfUDdst+ymO3gCdHvj7ucFy04bdR38hSgnKxcWt
+70AoMX/kr0jSJgi6IKbBpH3o7LlSouz
=o79p
-----END PGP SIGNATURE-----
Merge tag 'staging-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging tree fixes from Greg Kroah-Hartman:
"Here are some drivers/staging and drivers/iio fixes for 3.9 (the two
are still pretty intertwined, hence them coming both from my tree
still.) Nothing major, just a few things that have been reported by
users, all of these have been in linux-next for a while."
* tag 'staging-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: comedi: dt9812: use CR_CHAN() for channel number
staging/vt6656: Fix too large integer constant warning on 32-bit
staging: comedi: drivers: usbduxsigma.c: fix DMA buffers on stack
staging: imx/drm: request irq only after adding the crtc
staging: comedi: drivers: usbduxfast.c: fix for DMA buffers on stack
staging: comedi: drivers: usbdux.c: fix DMA buffers on stack
staging: vt6656: Fix oops on resume from suspend.
iio:common:st_sensors fixed all warning messages about uninitialized variables
iio: Fix build error seen if IIO_TRIGGER is defined but IIO_BUFFER is not
iio/imu: inv_mpu6050 depends on IIO_BUFFER
iio:ad5064: Initialize register cache correctly
iio:ad5064: Fix off by one in DAC value range check
iio:ad5064: Fix address of the second channel for ad5065/ad5045/ad5025
a long time ago by the commit
commit 93456b6d77
Author: Denis V. Lunev <den@openvz.org>
Date: Thu Jan 10 03:23:38 2008 -0800
[IPV4]: Unify access to the routing tables.
the defenition of FIB_HASH_TABLE size has obtained wrong dependency:
it should depend upon CONFIG_IP_MULTIPLE_TABLES (as was in the original
code) but it was depended from CONFIG_IP_ROUTE_MULTIPATH
This patch returns the situation to the original state.
The problem was spotted by Tingwei Liu.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Tingwei Liu <tingw.liu@gmail.com>
CC: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix regression introduced by commit 787f9fd232 ("atmel_lcdfb: support
16bit BGR:565 mode, remove unsupported 15bit modes") which broke 16-bpp
modes for older SOCs which use IBGR:555 (msb is intensity) rather than
BGR:565.
The above commit removes the RGB:555-wiring hack by
removing the no longer used ATMEL_LCDC_WIRING_RGB555 define.
Acked-by: Peter Korsgaard <jacmet@sunsite.dk>
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Fix regression introduced by commit 787f9fd232 ("atmel_lcdfb: support
16bit BGR:565 mode, remove unsupported 15bit modes") which broke 16-bpp
modes for older SOCs which use IBGR:555 (msb is intensity) rather
than BGR:565.
Use SOC-type to determine the pixel layout.
Tested on at91sam9263 and at91sam9g45.
Cc: <stable@vger.kernel.org>
Acked-by: Peter Korsgaard <jacmet@sunsite.dk>
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Change cpts_active_slave to active_slave so that the same DT property
can be used to ethtool and SIOCGMIIPHY.
CC: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
asm/cmpxchg.h can be included on its own and needs to be self-consistent.
The definitions for the cmpxchg*_local macros, as such, need to be part
of this file.
This fixes a build issue on OpenRISC since the system.h smashing patch
96f951edb1 that introdued the direct inclusion
asm/cmpxchg.h into linux/llist.h.
CC: David Howells <dhowells@redhat.com>
Signed-off-by: Jonas Bonn <jonas@southpole.se>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Fix new kernel-doc warnings in idr:
Warning(include/linux/idr.h:113): No description found for parameter 'idr'
Warning(include/linux/idr.h:113): Excess function parameter 'idp' description in 'idr_find'
Warning(lib/idr.c:232): Excess function parameter 'id' description in 'sub_alloc'
Warning(lib/idr.c:232): Excess function parameter 'id' description in 'sub_alloc'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Compile warnings and errors (one on x86, two on ARM)
* WARNING in xen-pciback
* Use the acpi_processor_get_performance_info instead of the 'register' version
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iQEcBAABAgAGBQJRP33uAAoJEFjIrFwIi8fJfxMIAJxK7jI7NghJ87b8uk3IrkDl
VDsY5Re86XWE0HjsFDJXfWZCGumhpMb8hI1RJBavLDWQmk4THBqsKbx9rDgRZUJO
HepvhwD/RdK+nnjVmbAzJKuc7aOKQmeEW3Weysb32DW9C5Wg27LRrto/oVJIUsyj
4HrLcsZZqCiEYUZjGwrkrMC1yXx+2XvkX9mXq81hGj0xU4X1hMizwBAJvSE5lKca
r4LaCoZ0SuayHYHjEtxyjAcu39wnMdlCkdW9DUlqrjK7L3X29ifVvnb+aS951O5E
t57DCHY8WkI1BkaoHaAQqI7/Y1HpYbT8BALVmxXoIInI9lc8VLHZOs8/fVVDppQ=
=MxoB
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-3.9-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull Xen fixes from Konrad Rzeszutek Wilk:
- Compile warnings and errors (one on x86, two on ARM)
- WARNING in xen-pciback
- Use the acpi_processor_get_performance_info instead of the 'register'
version
* tag 'stable/for-linus-3.9-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/acpi: remove redundant acpi/acpi_drivers.h include
xen: arm: mandate EABI and use generic atomic operations.
acpi: Export the acpi_processor_get_performance_info
xen/pciback: Don't disable a PCI device that is already disabled.
Add support for Altera 8250/16550 compatible serial port.
Signed-off-by: Ley Foon Tan <lftan@altera.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This allows ethernet drivers (such as the mv643xx_eth) to support
Wake on LAN on platforms where PHY registers have to be configured
for Wake on LAN (e.g. the Marvell Kirkwood based qnap TS-119P II).
Signed-off-by: Michael Stapelberg <michael@stapelberg.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is the second of the TLP patch series; it augments the basic TLP
algorithm with a loss detection scheme.
This patch implements a mechanism for loss detection when a Tail
loss probe retransmission plugs a hole thereby masking packet loss
from the sender. The loss detection algorithm relies on counting
TLP dupacks as outlined in Sec. 3 of:
http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01
The basic idea is: Sender keeps track of TLP "episode" upon
retransmission of a TLP packet. An episode ends when the sender receives
an ACK above the SND.NXT (tracked by tlp_high_seq) at the time of the
episode. We want to make sure that before the episode ends the sender
receives a "TLP dupack", indicating that the TLP retransmission was
unnecessary, so there was no loss/hole that needed plugging. If the
sender gets no TLP dupack before the end of the episode, then it reduces
ssthresh and the congestion window, because the TLP packet arriving at
the receiver probably plugged a hole.
Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch series implement the Tail loss probe (TLP) algorithm described
in http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01. The
first patch implements the basic algorithm.
TLP's goal is to reduce tail latency of short transactions. It achieves
this by converting retransmission timeouts (RTOs) occuring due
to tail losses (losses at end of transactions) into fast recovery.
TLP transmits one packet in two round-trips when a connection is in
Open state and isn't receiving any ACKs. The transmitted packet, aka
loss probe, can be either new or a retransmission. When there is tail
loss, the ACK from a loss probe triggers FACK/early-retransmit based
fast recovery, thus avoiding a costly RTO. In the absence of loss,
there is no change in the connection state.
PTO stands for probe timeout. It is a timer event indicating
that an ACK is overdue and triggers a loss probe packet. The PTO value
is set to max(2*SRTT, 10ms) and is adjusted to account for delayed
ACK timer when there is only one oustanding packet.
TLP Algorithm
On transmission of new data in Open state:
-> packets_out > 1: schedule PTO in max(2*SRTT, 10ms).
-> packets_out == 1: schedule PTO in max(2*RTT, 1.5*RTT + 200ms)
-> PTO = min(PTO, RTO)
Conditions for scheduling PTO:
-> Connection is in Open state.
-> Connection is either cwnd limited or no new data to send.
-> Number of probes per tail loss episode is limited to one.
-> Connection is SACK enabled.
When PTO fires:
new_segment_exists:
-> transmit new segment.
-> packets_out++. cwnd remains same.
no_new_packet:
-> retransmit the last segment.
Its ACK triggers FACK or early retransmit based recovery.
ACK path:
-> rearm RTO at start of ACK processing.
-> reschedule PTO if need be.
In addition, the patch includes a small variation to the Early Retransmit
(ER) algorithm, such that ER and TLP together can in principle recover any
N-degree of tail loss through fast recovery. TLP is controlled by the same
sysctl as ER, tcp_early_retrans sysctl.
tcp_early_retrans==0; disables TLP and ER.
==1; enables RFC5827 ER.
==2; delayed ER.
==3; TLP and delayed ER. [DEFAULT]
==4; TLP only.
The TLP patch series have been extensively tested on Google Web servers.
It is most effective for short Web trasactions, where it reduced RTOs by 15%
and improved HTTP response time (average by 6%, 99th percentile by 10%).
The transmitted probes account for <0.5% of the overall transmissions.
Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Micrel PHY KSZ8031 is similar to KSZ8021 and also requires the special
initialization of "Operation Mode Strap Override" in reg 0x16
introduced in 212ea99 (phy/micrel: Implement support for KSZ8021).
Signed-off-by: Hector Palacios <hector.palacios@digi.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/ethernet/intel/e1000e/netdev.c
Minor conflict in e1000e, a line that got fixed in 'net'
has been removed in 'net-next'.
Signed-off-by: David S. Miller <davem@davemloft.net>