Since commit f3b0946d62 ("genirq/msi: Make sure PCI MSIs are
activated early"), we can end-up activating a PCI/MSI twice (once
at allocation time, and once at startup time).
This is normally of no consequences, except that there is some
HW out there that may misbehave if activate is used more than once
(the GICv3 ITS, for example, uses the activate callback
to issue the MAPVI command, and the architecture spec says that
"If there is an existing mapping for the EventID-DeviceID
combination, behavior is UNPREDICTABLE").
While this could be worked around in each individual driver, it may
make more sense to tackle the issue at the core level. In order to
avoid getting in that situation, let's have a per-interrupt flag
to remember if we have already activated that interrupt or not.
Fixes: f3b0946d62 ("genirq/msi: Make sure PCI MSIs are activated early")
Reported-and-tested-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1484668848-24361-1-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
For IR wakeup, a driver has to program the hardware to wakeup at a
specific IR sequence, so it makes no sense to allow multiple wakeup
protocols to be selected. In the same manner the sysfs interface only
allows one scancode to be provided.
In addition, we need to know the specific variant of the protocol.
In short, these changes are made to the wakeup_protocols sysfs entry:
- list all the protocol variants rather than the protocol groups,
e.g. "nec nec-x nec-32" rather than just "nec".
- only allow one protocol variant to be selected rather than multiple
- wakeup_filter can only be set once a protocol has been selected in
wakeup_protocols.
This is an API change, however the only user of this API is the img-ir,
but the wakeup code was never merged to mainline, so it was never used.
Signed-off-by: Sean Young <sean@mess.org>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Sifan Naeem <sifan.naeem@imgtec.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
There are many variants of extended rc5. This implements the 20 bit
version.
Signed-off-by: Sean Young <sean@mess.org>
Cc: David Härdeman <david@hardeman.nu>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
The d680_dmb keymap has some new new mappings.
Tested-by: Vincent McIntyre <vincent.mcintyre@gmail.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
I was under the misconception that the sysfs dev stuff can be fully
set up, and then registered all in one step with device_add. That's
true for properties and property groups, but not for parents and child
devices. Those must be fully registered before you can register a
child.
Add a bit of tracking to make sure that asynchronous mst connector
hotplugging gets this right. For consistency we rely upon the implicit
barriers of the connector->mutex, which is taken anyway, to ensure
that at least either the connector or device registration call will
work out.
Mildly tested since I can't reliably reproduce this on my mst box
here.
Reported-by: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1484237756-2720-1-git-send-email-daniel.vetter@ffwll.ch
If we're unlucky then the registration from a hotplugged connector
might race with the final registration step on driver load. And since
MST topology discover is asynchronous that's even somewhat likely.
v2: Also update the kerneldoc for @registered!
v3: Review from Chris:
- Improve kerneldoc for late_register/early_unregister callbacks.
- Use mutex_destroy.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161218133545.2106-1-daniel.vetter@ffwll.ch
(cherry picked from commit e73ab00e9a)
Zhang Yanmin reported crashes [1] and provided a patch adding a
synchronize_rcu() call in can_rx_unregister()
The main problem seems that the sockets themselves are not RCU
protected.
If CAN uses RCU for delivery, then sockets should be freed only after
one RCU grace period.
Recent kernels could use sock_set_flag(sk, SOCK_RCU_FREE), but let's
ease stable backports with the following fix instead.
[1]
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff81495e25>] selinux_socket_sock_rcv_skb+0x65/0x2a0
Call Trace:
<IRQ>
[<ffffffff81485d8c>] security_sock_rcv_skb+0x4c/0x60
[<ffffffff81d55771>] sk_filter+0x41/0x210
[<ffffffff81d12913>] sock_queue_rcv_skb+0x53/0x3a0
[<ffffffff81f0a2b3>] raw_rcv+0x2a3/0x3c0
[<ffffffff81f06eab>] can_rcv_filter+0x12b/0x370
[<ffffffff81f07af9>] can_receive+0xd9/0x120
[<ffffffff81f07beb>] can_rcv+0xab/0x100
[<ffffffff81d362ac>] __netif_receive_skb_core+0xd8c/0x11f0
[<ffffffff81d36734>] __netif_receive_skb+0x24/0xb0
[<ffffffff81d37f67>] process_backlog+0x127/0x280
[<ffffffff81d36f7b>] net_rx_action+0x33b/0x4f0
[<ffffffff810c88d4>] __do_softirq+0x184/0x440
[<ffffffff81f9e86c>] do_softirq_own_stack+0x1c/0x30
<EOI>
[<ffffffff810c76fb>] do_softirq.part.18+0x3b/0x40
[<ffffffff810c8bed>] do_softirq+0x1d/0x20
[<ffffffff81d30085>] netif_rx_ni+0xe5/0x110
[<ffffffff8199cc87>] slcan_receive_buf+0x507/0x520
[<ffffffff8167ef7c>] flush_to_ldisc+0x21c/0x230
[<ffffffff810e3baf>] process_one_work+0x24f/0x670
[<ffffffff810e44ed>] worker_thread+0x9d/0x6f0
[<ffffffff810e4450>] ? rescuer_thread+0x480/0x480
[<ffffffff810ebafc>] kthread+0x12c/0x150
[<ffffffff81f9ccef>] ret_from_fork+0x3f/0x70
Reported-by: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable patches:
- NFSv4.1: Fix a deadlock in layoutget
- NFSv4 must not bump sequence ids on NFS4ERR_MOVED errors
- NFSv4 Fix a regression with OPEN EXCLUSIVE4 mode
- Fix a memory leak when removing the SUNRPC module
Bugfixes:
- Fix a reference leak in _pnfs_return_layout
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYjM6UAAoJEGcL54qWCgDy+WoP/2NdwqzOa3xxn8piMhLh/TLJ
lxWfVmUZJ8UNVSVl45NDslt2V63P0gpITUiaVo27rPfjKV8pv8XTU7HkxCuXzj0m
uo11jEErsYxbXELNDL2y1fmsPTygYvg4MAV3NK0inKv/ciu6TTMddoga+QQXGJPz
QqQ7znTHyjDSTqZM1D5+f8EhbY7AIAzJbkgVr90WPPjsT6YtdlTQpMb1O9tlDtfj
oYidIOxRphqLOYTdoD8NW4x19dlgTNIYyCgOFomCbpUTFKKK5RFjVy+bpml3NxV1
oUldw8lhiUE8x/5J/CfbCMt1bPZ3yJeFWNghINOIF7cgGnvKEjw0uO2o8Lk6fMzi
yF2aNoZcNQnyoHHCjsqK8t5dB51abwxlzeElrFB1u73B56CcXB91xEP0GsX1vFt1
BIGCnfsOvtYPZxp/OX+B3AweAJ2DRuIvdze9cS2IeHBW9c5XdVdDmMK7CFeL9dc3
YEsWmaWNWtDt41erxCvQ0ezkr3lPHXOcjwS440LIiVpr38/rzjp81+Tf+i2D1GzB
XTsy9IHzaC2zLb0Vz5vOR3S10k4OiFtXT6ikoZLwgpTfI2gt146e0YHx3g90AiFh
QYBRNLJrjSYVedjv2GGFLG1WWsHbhbP6dpmT3BwNmhZTzjPoqZ76f5QFIQUya4jt
4Y2ksyhx7pOeOJy++6SN
=S5qH
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-4.10-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
"Stable patches:
- NFSv4.1: Fix a deadlock in layoutget
- NFSv4 must not bump sequence ids on NFS4ERR_MOVED errors
- NFSv4 Fix a regression with OPEN EXCLUSIVE4 mode
- Fix a memory leak when removing the SUNRPC module
Bugfixes:
- Fix a reference leak in _pnfs_return_layout"
* tag 'nfs-for-4.10-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
pNFS: Fix a reference leak in _pnfs_return_layout
nfs: Fix "Don't increment lock sequence ID after NFS4ERR_MOVED"
SUNRPC: cleanup ida information when removing sunrpc module
NFSv4.0: always send mode in SETATTR after EXCLUSIVE4
nfs: Don't increment lock sequence ID after NFS4ERR_MOVED
NFSv4.1: Fix a deadlock in layoutget
- Fix for unaligned access emulation corner case
- fix for udelay loop inline asm regression
- Fix irq affinity finally for AXS103 board [Yuriy]
- Final fixes for setting IO-coherency sanely in SMP
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJYi8faAAoJEGnX8d3iisJeU5wP/A1OM9S6TOCe4Ikguhp8UDgS
UaGXboVpAAJNr6B+NfzvUbxi3VybgqooRM6tEU33A4eDzmfCTKBxMbvvE77rOuPC
CC/8wxQ87wDqU+6GGL5DoTTc4rrxnNoEK6MOh2/Rv76idEi8Z2eo4INfm6FkBjGY
+2WkuSpegBIxKrvFmBK9JC7cYaR3KBGSxW3rjLaK6xn0yt+LT1LQKJ/4OFLke7zy
HWNVVnXnyJhn6z7v+Bum3hjnA8bhnDEVKAM0XgD/8TtjkhS6aod4HS2mjvcKnGc/
vyAk3B4fgq0EKFXX0IyHNnWI92gCnACWs8sHAQ/UBB4GHf0z1ScoihJeE4VAqRS4
kJ6SKAIHEtwY1nQV5GgVE2ddMgTDEXwAKCna99ejoUMMmQZFaRy80YuuSMLIjEuz
H16oPkSzJDsR6Z9ocoC5mATWnHxsFZsCAX72u88X+ylaJmBziF2VmjaKRIXB8Psn
YVz02U1YlONJTesEI7lnLD3fwx6pzu/XJjNTe4saiJFAoxaGuPbjRg6sJq3URDj6
3CJ89OFRbgx78jbk7QpYlc6m9SdTM2F5T7ICi6yxElok1dn+i+UQhnBXCinIvTcL
5w9IKA/9qeqjyT76vy1cPY2KlSDGj0kqjI+63IzpGtTScRuyflA5S1CvqyNeUbQq
Vrtw/IRz5pvS7iaCDwuc
=u+Jp
-----END PGP SIGNATURE-----
Merge tag 'arc-4.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
"Hopefully last set of changes for ARC for 4.10:
- fix for unaligned access emulation corner case
- fix for udelay loop inline asm regression
- fix irq affinity finally for AXS103 board [Yuriy]
- final fixes for setting IO-coherency sanely in SMP"
* tag 'arc-4.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: [arcompact] handle unaligned access delay slot corner case
ARCv2: smp-boot: wake_flag polling by non-Masters needs to be uncached
ARC: smp-boot: Decouple Non masters waiting API from jump to entry point
ARCv2: MCIP: update the BCR per current changes
ARC: udelay: fix inline assembler by adding LP_COUNT to clobber list
ARCv2: MCIP: Deprecate setting of affinity in Device Tree
percpu_ref_tryget() and percpu_ref_tryget_live() should return
"true" IFF they acquire a reference. But the return value from
atomic_long_inc_not_zero() is a long and may have high bits set,
e.g. PERCPU_COUNT_BIAS, and the return value of the tryget routines
is bool so the reference may actually be acquired but the routines
return "false" which results in a reference leak since the caller
assumes it does not need to do a corresponding percpu_ref_put().
This was seen when performing CPU hotplug during I/O, as hangs in
blk_mq_freeze_queue_wait where percpu_ref_kill (blk_mq_freeze_queue_start)
raced with percpu_ref_tryget (blk_mq_timeout_work).
Sample stack trace:
__switch_to+0x2c0/0x450
__schedule+0x2f8/0x970
schedule+0x48/0xc0
blk_mq_freeze_queue_wait+0x94/0x120
blk_mq_queue_reinit_work+0xb8/0x180
blk_mq_queue_reinit_prepare+0x84/0xa0
cpuhp_invoke_callback+0x17c/0x600
cpuhp_up_callbacks+0x58/0x150
_cpu_up+0xf0/0x1c0
do_cpu_up+0x120/0x150
cpu_subsys_online+0x64/0xe0
device_online+0xb4/0x120
online_store+0xb4/0xc0
dev_attr_store+0x68/0xa0
sysfs_kf_write+0x80/0xb0
kernfs_fop_write+0x17c/0x250
__vfs_write+0x6c/0x1e0
vfs_write+0xd0/0x270
SyS_write+0x6c/0x110
system_call+0x38/0xe0
Examination of the queue showed a single reference (no PERCPU_COUNT_BIAS,
and __PERCPU_REF_DEAD, __PERCPU_REF_ATOMIC set) and no requests.
However, conditions at the time of the race are count of PERCPU_COUNT_BIAS + 0
and __PERCPU_REF_DEAD and __PERCPU_REF_ATOMIC set.
The fix is to make the tryget routines use an actual boolean internally instead
of the atomic long result truncated to a int.
Fixes: e625305b39 percpu-refcount: make percpu_ref based on longs instead of ints
Link: https://bugzilla.kernel.org/show_bug.cgi?id=190751
Signed-off-by: Douglas Miller <dougmill@linux.vnet.ibm.com>
Reviewed-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: e625305b39 ("percpu-refcount: make percpu_ref based on longs instead of ints")
Cc: stable@vger.kernel.org # v3.18+
Pull networking fixes from David Miller:
1) GTP fixes from Andreas Schultz (missing genl module alias, clear IP
DF on transmit).
2) Netfilter needs to reflect the fwmark when sending resets, from Pau
Espin Pedrol.
3) nftable dump OOPS fix from Liping Zhang.
4) Fix erroneous setting of VIRTIO_NET_HDR_F_DATA_VALID on transmit,
from Rolf Neugebauer.
5) Fix build error of ipt_CLUSTERIP when procfs is disabled, from Arnd
Bergmann.
6) Fix regression in handling of NETIF_F_SG in harmonize_features(),
from Eric Dumazet.
7) Fix RTNL deadlock wrt. lwtunnel module loading, from David Ahern.
8) tcp_fastopen_create_child() needs to setup tp->max_window, from
Alexey Kodanev.
9) Missing kmemdup() failure check in ipv6 segment routing code, from
Eric Dumazet.
10) Don't execute unix_bind() under the bindlock, otherwise we deadlock
with splice. From WANG Cong.
11) ip6_tnl_parse_tlv_enc_lim() potentially reallocates the skb buffer,
therefore callers must reload cached header pointers into that skb.
Fix from Eric Dumazet.
12) Fix various bugs in legacy IRQ fallback handling in alx driver, from
Tobias Regnery.
13) Do not allow lwtunnel drivers to be unloaded while they are
referenced by active instances, from Robert Shearman.
14) Fix truncated PHY LED trigger names, from Geert Uytterhoeven.
15) Fix a few regressions from virtio_net XDP support, from John
Fastabend and Jakub Kicinski.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (102 commits)
ISDN: eicon: silence misleading array-bounds warning
net: phy: micrel: add support for KSZ8795
gtp: fix cross netns recv on gtp socket
gtp: clear DF bit on GTP packet tx
gtp: add genl family modules alias
tcp: don't annotate mark on control socket from tcp_v6_send_response()
ravb: unmap descriptors when freeing rings
virtio_net: reject XDP programs using header adjustment
virtio_net: use dev_kfree_skb for small buffer XDP receive
r8152: check rx after napi is enabled
r8152: re-schedule napi for tx
r8152: avoid start_xmit to schedule napi when napi is disabled
r8152: avoid start_xmit to call napi_schedule during autosuspend
net: dsa: Bring back device detaching in dsa_slave_suspend()
net: phy: leds: Fix truncated LED trigger names
net: phy: leds: Break dependency of phy.h on phy_led_triggers.h
net: phy: leds: Clear phy_num_led_triggers on failure to avoid crash
net-next: ethernet: mediatek: change the compatible string
Documentation: devicetree: change the mediatek ethernet compatible string
bnxt_en: Fix RTNL lock usage on bnxt_get_port_module_status().
...
- Series of iw_cxgb4 fixes to make it work with the drain cq API
- One or two patches each to: srp, iser, cxgb3, vmw_pvrdma, umem, rxe,
and ipoib
- One big series (13 patches) for the new qedr driver
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYi6KyAAoJELgmozMOVy/d9wgP+gN1i3pEvXN+yy8JmQb/kYDT
9k74tbrPY/E3FnxqfWhqIWWSU427/+wXq7phqFxc/w9BNB3by7Ivud1FuHpBWfx5
PoEhQj+TA3gQEHk259+Jdz8OHLMrKEME3zg1tuDUM9vq+e9hfZWyRVXcTFuOZTKg
YoKBpAE4AmSyKCcogn0FNHkm7UxlSZGlIfiVA61I1vPYjV0Z/ePkiXg51Y3RNOUy
x8StD+RVfWHWJTZ7ZKASC26Mm9tLzQxENQ0kJXg24oG0CaAO90iO3oFUM86z5NP8
gFyvdCpsgDSByRtoLhPnUvSmoTTzcLPa65YneINu17MeKiKjZaVmZipwr54pNTw0
++mqnfuEYhOUzjN1naDaK3DihEAj7CoW9Gdx84s9hOggOyURFUrV7DqwJSEBAz02
9VxBWpKsR7NKF7xVdy/oTgoWpO8Nr7DuYyHklaxS6eORci7CD3lqs71uf1XJg48t
b9SfncPxHReC4L6Y9sAUZsseaLsSCy/3bestB5wTqL6iRxjp25fJ7EvK9VLW/gXl
FcRU5pzph1pKDAasb/M8CjUnS4f4LveenALKAcNViuZDh8WEimM/Jlfg5vHbUAr8
rc0z9s/0cgU9AL/ZPaWQb57ZivaS0pXXa1z+ykKvSvj7nYRAX2tsYLzaX8Im2ang
8Rqqy93AdiO4Zq8ypnCq
=d5rm
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
"Second round of -rc fixes for 4.10.
This -rc cycle has been slow for the rdma subsystem. I had already
sent you the first batch before the Holiday break. After that, we kept
only getting a few here or there. Up until this week, when I got a
drop of 13 to one driver (qedr). So, here's the -rc patches I have. I
currently have none held in reserve, so unless something new comes in,
this is it until the next merge window opens.
Summary:
- series of iw_cxgb4 fixes to make it work with the drain cq API
- one or two patches each to: srp, iser, cxgb3, vmw_pvrdma, umem,
rxe, and ipoib
- one big series (13 patches) for the new qedr driver"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (27 commits)
RDMA/cma: Fix unknown symbol when CONFIG_IPV6 is not enabled
IB/rxe: Prevent from completer to operate on non valid QP
IB/rxe: Fix rxe dev insertion to rxe_dev_list
IB/umem: Release pid in error and ODP flow
RDMA/qedr: Dispatch port active event from qedr_add
RDMA/qedr: Fix and simplify memory leak in PD alloc
RDMA/qedr: Fix RDMA CM loopback
RDMA/qedr: Fix formatting
RDMA/qedr: Mark three functions as static
RDMA/qedr: Don't reset QP when queues aren't flushed
RDMA/qedr: Don't spam dmesg if QP is in error state
RDMA/qedr: Remove CQ spinlock from CM completion handlers
RDMA/qedr: Return max inline data in QP query result
RDMA/qedr: Return success when not changing QP state
RDMA/qedr: Add uapi header qedr-abi.h
RDMA/qedr: Fix MTU returned from QP query
RDMA/core: Add the function ib_mtu_int_to_enum
IB/vmw_pvrdma: Fix incorrect cleanup on pvrdma_pci_probe error path
IB/vmw_pvrdma: Don't leak info from alloc_ucontext
IB/cxgb3: fix misspelling in header guard
...
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYixa5AAoJEAhfPr2O5OEVxEIP/RdOWmeLxBW0AT8sRlvlw43K
ecFIq0WTIGr6w7VJPm2uHIcw9sdDnwQgozbp5di4KkDLH13aNoJYjMknpl13oPLM
KIBUvIAz7WG+/k3Iq45RLZ7eovrWzGSOTRGHOAitidDBmuFofkLslfzlJKhnNacB
PhSgzQsjK0GrBj0z8Ud/dCSWSR1EU67Uq+pdr+eFGJpMtEicQipnj36BkDMECW02
12wGht0NW7eDMILrNXCwfBYqsFLKfgJqhypoJLci1rnnrGtuMVmlphZJSfPC6Riz
6D6mbgNaYcSAQq8VsHW8L/T5EMio/TNCkapY3LjeUOp/qxTk486/sYWg22fN0qvg
Y1lCU1s5jY0Wu9TBdb6ty/dQ52f7gG9SGjRupRNYCYIVJI0V7UQdWxIrr+uqAkH/
Ha/n7q7oTT6RxjmhHB+qAOspeWI31sGzRBhpDve9R+6MQa366aek14SpuG+4OdUN
7As7Lih9sagi90tB+J2B9PTeuZnr4XaZB3Z+t2vaBb19NG4/vpHYSuC0ymp0mzLh
9hnxtfZCsEOYLSRTWRz5PJmAK2t5j26C5Pb3qeRA40rNbKdSyTj4ntJ/VLUaiFg5
m7kJ222GvdUTeuBLhCvzkjk1sCt1YRvaqQkmLrzpynJnzojuS4SXeWWmdygQIZwq
cQC5lcwMFM/x1MmzBAWI
=r51U
-----END PGP SIGNATURE-----
Merge tag 'media/v4.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
- fix a regression on tvp5150 causing failures at input selection and
image glitches
- CEC was moved out of staging for v4.10. Fix some bugs on it while not
too late
- fix a regression on pctv452e caused by VM stack changes
- fix suspend issued with smiapp
- fix a regression on cobalt driver
- fix some warnings and Kconfig issues with some random configs.
* tag 'media/v4.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] s5k4ecgx: select CRC32 helper
[media] dvb: avoid warning in dvb_net
[media] v4l: tvp5150: Don't override output pinmuxing at stream on/off time
[media] v4l: tvp5150: Fix comment regarding output pin muxing
[media] v4l: tvp5150: Reset device at probe time, not in get/set format handlers
[media] pctv452e: move buffer to heap, no mutex
[media] media/cobalt: use pci_irq_allocate_vectors
[media] cec: fix race between configuring and unconfiguring
[media] cec: move cec_report_phys_addr into cec_config_thread_func
[media] cec: replace cec_report_features by cec_fill_msg_report_features
[media] cec: update log_addr[] before finishing configuration
[media] cec: CEC_MSG_GIVE_FEATURES should abort for CEC version < 2
[media] cec: when canceling a message, don't overwrite old status info
[media] cec: fix report_current_latency
[media] smiapp: Make suspend and resume functions __maybe_unused
[media] smiapp: Implement power-on and power-off sequences without runtime PM
The media_entity_pipeline_start() and media_entity_pipeline_stop()
functions are renamed as media_pipeline_start() and media_pipeline_stop(),
respectively. The reason is two-fold: the pipeline struct is, rightly,
already called media_pipeline (rather than media_entity_pipeline) and what
this really is about is a pipeline. A pipeline consists of entities ---
and, well, other objects embedded in these entities.
As the pipeline object will be in the future moved from entities to pads
in order to support multiple pipelines through a single entity, do the
renaming now.
Similarly, functions operating on struct media_entity_graph as well as the
struct itself are renamed by dropping the "entity_" part from the prefix
of the function family and the data structure. The graph traversal which
is what the functions are about is not specifically about entities only
and will operate on pads for the same reason as the media pipeline.
The patch has been generated using the following command:
git grep -l media_entity |xargs perl -i -pe '
s/media_entity_pipeline/media_pipeline/g;
s/media_entity_graph/media_graph/g'
And a few manual edits related to line start alignment and line wrapping.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
This is adds support for the PHYs in the KSZ8795 5port managed switch.
It will allow to detect the link between the switch and the soc
and uses the same read_status functions as the KSZ8873MLL switch.
Signed-off-by: Sean Nyekjaer <sean.nyekjaer@prevas.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unlike ipv4, this control socket is shared by all cpus so we cannot use
it as scratchpad area to annotate the mark that we pass to ip6_xmit().
Add a new parameter to ip6_xmit() to indicate the mark. The SCTP socket
family caches the flowi6 structure in the sctp_transport structure, so
we cannot use to carry the mark unless we later on reset it back, which
I discarded since it looks ugly to me.
Fixes: bf99b4ded5 ("tcp: fix mark propagation with fwmark_reflect enabled")
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Drop the FSF's postal address from the source code files that typically
contain mostly the license text. Of the 628 removed instances, 578 are
outdated.
The patch has been created with the following command without manual edits:
git grep -l "675 Mass Ave\|59 Temple Place\|51 Franklin St" -- \
drivers/media/ include/media|while read i; do i=$i perl -e '
open(F,"< $ENV{i}");
$a=join("", <F>);
$a =~ s/[ \t]*\*\n.*You should.*\n.*along with.*\n.*(\n.*USA.*$)?\n//m
&& $a =~ s/(^.*)Or, (point your browser to) /$1To obtain the license, $2\n$1/m;
close(F);
open(F, "> $ENV{i}");
print F $a;
close(F);'; done
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYiqcPAAoJEAx081l5xIa+L7YQAJ9vDoOwC6YBpElwC9XM5cYD
C3+UiFU8P/HP7yehepHwXHEwRe5P+aaZcvaOqbvSD/ZOgClnEk6grfCOg/mp+7Uo
M+sb42lJxpButy9rVBy2W2Uj25oMjBYP+TUUf/880G9gZQsY2VrGRSMTQk/mLqDh
2Y/rT0PCoU0SemHpnGRnNEyMlhwbWCtwc+3MV48jctVIQmdqDbzMi0Cct2pqaV+4
FKIic0xUWiNM2CjArCQAaandPBc+If3OuptlOpdkUzfHmxZY3qa7V8RAJo0GoBVs
QA6eNurXDDNOe82uT5OXiXXXU17T1BUrwvoulXJkBk1bZCw4xShPQi7sBURUd7YI
5xJnRL5tTHYcQoAUZbpn3z9XR6mcEniyI0dlWTrbHGx6Kkb6ZQUGrFZubgzi8DmE
9dJtDKi7+Ik0CM0ztDmAhHOCRBtXeu6ZUvZb2hw1z5hFMZyJqsa+QPCrrXijWGhW
32/XcjFavcNF2mLJQwJcl0FNQUuhUDtRnyPDpTZkPU7mcx6xlQn2+GbgcuoHIC6U
Fxdz0zAH78UyzbgIsOivADiYhXV52wZTHfg5fZnDvq0NX2yrNy98zqV/v2bnQpEN
3PitldEGaAbCy6eed60yvxl1M4FRFYC76uu09araQ/9BW4EJ+TJy3mceaqF4rNUQ
PDjC1XtA3sdUQ4xaw4Dw
=rm2F
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-for-v4.10-rc6-part-two' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"This is the main request for rc6, since really the one earlier was the
rc5 one :-)
The main thing are the nouveau specific race fixes for the connector
locking bug we fixed in -next and reverted here as it has quite large
prereqs. These two fixes should solve the problem at that level and we
can fix it properly in 4.11
Otherwise i915 has a bunch of changes, one ABI change for GVT related
stuff, some VC4 leak fixes, one core fence fix and some AMD changes,
oh and one ast hang avoidance fix.
Hoping it calms down around now"
* tag 'drm-fixes-for-v4.10-rc6-part-two' of git://people.freedesktop.org/~airlied/linux: (25 commits)
drm/nouveau: Handle fbcon suspend/resume in seperate worker
drm/nouveau: Don't enabling polling twice on runtime resume
drm/ast: Fixed system hanged if disable P2A
Revert "drm/radeon: always apply pci shutdown callbacks"
drm/i915: reinstate call to trace_i915_vma_bind
drm/i915: Move atomic state free from out of fence release
drm/i915: Check for NULL atomic state in intel_crtc_disable_noatomic()
drm/i915: Fix calculation of rotated x and y offsets for planar formats
drm/i915: Don't init hpd polling for vlv and chv from runtime_suspend()
drm/i915: Don't leak edid in intel_crt_detect_ddc()
drm/i915: Release temporary load-detect state upon switching
drm/i915: prevent crash with .disable_display parameter
drm/i915: Avoid drm_atomic_state_put(NULL) in intel_display_resume
MAINTAINERS: update new mail list for intel gvt driver
drm/i915/gvt: Fix kmem_cache_create() name
drm/i915/gvt/kvmgt: mdev ABI is available_instances, not available_instance
drm/amdgpu: fix unload driver issue for virtual display
drm/amdgpu: check ring being ready before using
drm/vc4: Return -EINVAL on the overflow checks failing.
drm/vc4: Fix an integer overflow in temporary allocation layout.
...
- Revert the recent change that caused suspend-to-idle to be used
as the default suspend method on systems where it is indicated to
be efficient by the ACPI tables, as that turned out to be premature
and introduced suspend regressions on some systems with missing
power management support in device drivers (Rafael Wysocki).
- Fix up the intel_pstate driver to take changes of the global
limits via sysfs correctly when the performance policy is used
which has been broken by a recent change in it (Srinivas Pandruvada).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJYiomwAAoJEILEb/54YlRxgGEP/i2Z+MVZbIwifod6wz6Yt0mj
Rm2uri24qEdJJSWCejDAngySiU+ymNJIYfVx2q5l99x1W4WTwyNWuEdWhGMTBDwG
tI3eeLSCZcAmPeMb7V/l8lPvJiKGBf8tM162j09zpZpgO4LFsunPX9PA5IWbK55M
U4hYeho3OlLjT9yiS8Yc9iLAZPrf7MRDBqtM0PeCklJHHyYmberbmSirH/TPm1Sq
TPHHbBk6d2sFxA5mkUEItm5y7g9Wq9kN/E08a6TA0HthQBnjEaJ9GTQCpSBOEGHF
MgEu/MOxm1Kou9YvOQMN3B1L9/VOb5JatV8RkMDltEctJseymQejdg8gFMLIKZ5j
lDAfC/tSpXAXvxUPl/ObYloKJP3HV4ly8urxZ8rqqUWLPq7vK/jo5OwA97Kjp5a0
/qW0LACoK8B96WOYYaNR1hWullH7+hDItKkbbBSBKKSNPXCgOmzqkGjCqvze4yYl
yS2PeOgr6cA3D0iAMIFhmiRkaauAv/Dl++yiF7oVEpn8JI0UfpjOKBegPOdAysFw
ADb5wLAHZ1LCyTn2CnLz1i2F87HxrYrFrKdTwjjBLyEu2aIw9sFuWSvbxuzZ7asf
u2JaUA+KmvMhkZMUkgX2jhzR3siHLEFOqnfWQh2rnZV2ukNCWxL/YjjcWkoUuLTc
H+Kx+VxcU6wKJPVWsnsK
=YF+u
-----END PGP SIGNATURE-----
Merge tag 'pm-4.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix two regressions introduced recently, one by reverting the
problematic commit and one by fixing up the behavior in an overlooked
case.
Specifics:
- Revert the recent change that caused suspend-to-idle to be used as
the default suspend method on systems where it is indicated to be
efficient by the ACPI tables, as that turned out to be premature
and introduced suspend regressions on some systems with missing
power management support in device drivers (Rafael Wysocki).
- Fix up the intel_pstate driver to take changes of the global limits
via sysfs correctly when the performance policy is used which has
been broken by a recent change in it (Srinivas Pandruvada)"
* tag 'pm-4.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: intel_pstate: Fix sysfs limits enforcement for performance policy
Revert "PM / sleep / ACPI: Use the ACPI_FADT_LOW_POWER_S0 flag"
Single fence fix.
* tag 'drm-misc-fixes-2017-01-23' of git://anongit.freedesktop.org/git/drm-misc:
drm/fence: fix memory overwrite when setting out_fence fd
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains a large batch with Netfilter fixes for
your net tree, they are:
1) Two patches to solve conntrack garbage collector cpu hogging, one to
remove GC_MAX_EVICTS and another to look at the ratio (scanned entries
vs. evicted entries) to make a decision on whether to reduce or not
the scanning interval. From Florian Westphal.
2) Two patches to fix incorrect set element counting if NLM_F_EXCL is
is not set. Moreover, don't decrenent set->nelems from abort patch
if -ENFILE which leaks a spare slot in the set. This includes a
patch to deconstify the set walk callback to update set->ndeact.
3) Two fixes for the fwmark_reflect sysctl feature: Propagate mark to
reply packets both from nf_reject and local stack, from Pau Espin Pedrol.
4) Fix incorrect handling of loopback traffic in rpfilter and nf_tables
fib expression, from Liping Zhang.
5) Fix oops on stateful objects netlink dump, when no filter is specified.
Also from Liping Zhang.
6) Fix a build error if proc is not available in ipt_CLUSTERIP, related
to fix that was applied in the previous batch for net. From Arnd Bergmann.
7) Fix lack of string validation in table, chain, set and stateful
object names in nf_tables, from Liping Zhang. Moreover, restrict
maximum log prefix length to 127 bytes, otherwise explicitly bail
out.
8) Two patches to fix spelling and typos in nf_tables uapi header file
and Kconfig, patches from Alexander Alemayhu and William Breathitt Gray.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYiQ5uAAoJEAx081l5xIa+YTwP/jBQ7yjAACjpPS5SI3ProLd6
FZ8QrvjSv65ez8b59MEKe1Ld5qSyF61CyorA0F3oxA/dV2SG14sjRI8AXteKkoQN
XSk2NGiub7c3+Edtr2Cr+pzK91pqwHjtEBLh2Ftex2HrCCq2AP9HNgxU0l0056sH
hfplVP1kwavgn2zpbXA99JjoNHk3+UMX/UbUibH3lOhJkaM401cpBhco9yh07rK5
XfckHjnQwTH5jtvocorOq7Zh/cSWFqTWdw+kkW8TawHzT9GdoYSh6B0XR57LLwyB
yQ0tHa706oqjWMUW0kxOKB/EjEGsyKDSMuc52qOisvwrgmw0JS/RbqHctfjsGrrb
leCtffxe4yFgdB9jFbvyrY5tYt2lrxLm87oiYeB3T/r2SnWhkmyZsNCIJ+IGOkId
1DyuvZnh2UztHpru+TE5frALvM6I/gCF6UwdITAm++yJp87OqWo3fckz+KG5ln6Q
5c0p2jLR15Ii3tbs2q277FZMMOYoO6NMrcDVoDv1pqtG1vFA8ydcLyyPUVBdDF3Z
fpGHjOvGXLoQN4QFXXU/6utlZ6Twn0WGfPHQUH6iNL8pxY1JuyyrAkZS6ubFWNQd
dt71uznvVMszLAT95pUBhjmtttVh+XIH0BYVejFflpC/JoxR5xWce2Ka0rVPwcHe
kpv4K+s1PeTBKijd/t5s
=PHLN
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-for-v4.10-rc6-revert-one' of git://people.freedesktop.org/~airlied/linux
Pull drm revert from Dave Airlie:
"Revert one patch missing some prereqs.
One of the connector fixes was missing some prereqs, we have an
alternate driver fix that should work that I'll send tomorrow.
Today is a holiday here so quickly smashing this out"
Daniel Vetter explains:
"I pushed a locking change to fix a nouveau rpm issue to -fixes that
needed the connector_list rework. And that's only in -next, but I
missed that. Dave has the revert in a pull, and he'll follow-up with
the hack nouveau patch for 4.10, and then we'll reapply the proper fix
again for -next and revert the hacks. A bit a mess, but should be
sorted soon"
* tag 'drm-fixes-for-v4.10-rc6-revert-one' of git://people.freedesktop.org/~airlied/linux:
Revert "drm/probe-helpers: Drop locking from poll_enable"
This reverts commit 3846fd9b86.
There were some precursor commits missing for this around connector
locking, we should probably merge Lyude's nouveau avoid the problem patch.
Commit 4567d686f5 ("phy: increase size of MII_BUS_ID_SIZE and
bus_id") increased the size of MII bus IDs, but forgot to update the
private definition in <linux/phy_led_triggers.h>.
This may cause:
1. Truncation of LED trigger names,
2. Duplicate LED trigger names,
3. Failures registering LED triggers,
4. Crashes due to bad error handling in the LED trigger failure path.
To fix this, and prevent the definitions going out of sync again in the
future, let the PHY LED trigger code use the existing MII_BUS_ID_SIZE
definition.
Example:
- Before I had triggers "ee700000.etherne:01:100Mbps" and
"ee700000.etherne:01:10Mbps",
- After the increase of MII_BUS_ID_SIZE, both became
"ee700000.ethernet-ffffffff:01:" => FAIL,
- Now, the triggers are "ee700000.ethernet-ffffffff:01:100Mbps" and
"ee700000.ethernet-ffffffff:01:10Mbps", which are unique again.
Fixes: 4567d686f5 ("phy: increase size of MII_BUS_ID_SIZE and bus_id")
Fixes: 2e0bc452f4 ("net: phy: leds: add support for led triggers on phy link state change")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
<linux/phy.h> includes <linux/phy_led_triggers.h>, which is not really
needed. Drop the include from <linux/phy.h>, and add it to all users
that didn't include it explicitly.
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch series "fix premature OOM regression in 4.7+ due to cpuset races".
This is v2 of my attempt to fix the recent report based on LTP cpuset
stress test [1]. The intention is to go to stable 4.9 LTSS with this,
as triggering repeated OOMs is not nice. That's why the patches try to
be not too intrusive.
Unfortunately why investigating I found that modifying the testcase to
use per-VMA policies instead of per-task policies will bring the OOM's
back, but that seems to be much older and harder to fix problem. I have
posted a RFC [2] but I believe that fixing the recent regressions has a
higher priority.
Longer-term we might try to think how to fix the cpuset mess in a better
and less error prone way. I was for example very surprised to learn,
that cpuset updates change not only task->mems_allowed, but also
nodemask of mempolicies. Until now I expected the parameter to
alloc_pages_nodemask() to be stable. I wonder why do we then treat
cpusets specially in get_page_from_freelist() and distinguish HARDWALL
etc, when there's unconditional intersection between mempolicy and
cpuset. I would expect the nodemask adjustment for saving overhead in
g_p_f(), but that clearly doesn't happen in the current form. So we
have both crazy complexity and overhead, AFAICS.
[1] https://lkml.kernel.org/r/CAFpQJXUq-JuEP=QPidy4p_=FN0rkH5Z-kfB4qBvsf6jMS87Edg@mail.gmail.com
[2] https://lkml.kernel.org/r/7c459f26-13a6-a817-e508-b65b903a8378@suse.cz
This patch (of 4):
Since commit c33d6c06f6 ("mm, page_alloc: avoid looking up the first
zone in a zonelist twice") we have a wrong check for NULL preferred_zone,
which can theoretically happen due to concurrent cpuset modification. We
check the zoneref pointer which is never NULL and we should check the zone
pointer. Also document this in first_zones_zonelist() comment per Michal
Hocko.
Fixes: c33d6c06f6 ("mm, page_alloc: avoid looking up the first zone in a zonelist twice")
Link: http://lkml.kernel.org/r/20170120103843.24587-2-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Ganapatrao Kulkarni <gpkulkarni@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On an overloaded system, it is possible that a change in the watchdog
threshold can be delayed long enough to trigger a false positive.
This can easily be achieved by having a cpu spinning indefinitely on a
task, while another cpu updates watchdog threshold.
What happens is while trying to park the watchdog threads, the hrtimers
on the other cpus trigger and reprogram themselves with the new slower
watchdog threshold. Meanwhile, the nmi watchdog is still programmed
with the old faster threshold.
Because the one cpu is blocked, it prevents the thread parking on the
other cpus from completing, which is needed to shutdown the nmi watchdog
and reprogram it correctly. As a result, a false positive from the nmi
watchdog is reported.
Fix this by setting a park_in_progress flag to block all lockups until
the parking is complete.
Fix provided by Ulrich Obergfell.
[akpm@linux-foundation.org: s/park_in_progress/watchdog_park_in_progress/]
Link: http://lkml.kernel.org/r/1481041033-192236-1-git-send-email-dzickus@redhat.com
Signed-off-by: Don Zickus <dzickus@redhat.com>
Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
online_{kernel|movable} is used to change the memory zone to
ZONE_{NORMAL|MOVABLE} and online the memory.
To check that memory zone can be changed, zone_can_shift() is used.
Currently the function returns minus integer value, plus integer
value and 0. When the function returns minus or plus integer value,
it means that the memory zone can be changed to ZONE_{NORNAL|MOVABLE}.
But when the function returns 0, there are two meanings.
One of the meanings is that the memory zone does not need to be changed.
For example, when memory is in ZONE_NORMAL and onlined by online_kernel
the memory zone does not need to be changed.
Another meaning is that the memory zone cannot be changed. When memory
is in ZONE_NORMAL and onlined by online_movable, the memory zone may
not be changed to ZONE_MOVALBE due to memory online limitation(see
Documentation/memory-hotplug.txt). In this case, memory must not be
onlined.
The patch changes the return type of zone_can_shift() so that memory
online operation fails when memory zone cannot be changed as follows:
Before applying patch:
# grep -A 35 "Node 2" /proc/zoneinfo
Node 2, zone Normal
<snip>
node_scanned 0
spanned 8388608
present 7864320
managed 7864320
# echo online_movable > memory4097/state
# grep -A 35 "Node 2" /proc/zoneinfo
Node 2, zone Normal
<snip>
node_scanned 0
spanned 8388608
present 8388608
managed 8388608
online_movable operation succeeded. But memory is onlined as
ZONE_NORMAL, not ZONE_MOVABLE.
After applying patch:
# grep -A 35 "Node 2" /proc/zoneinfo
Node 2, zone Normal
<snip>
node_scanned 0
spanned 8388608
present 7864320
managed 7864320
# echo online_movable > memory4097/state
bash: echo: write error: Invalid argument
# grep -A 35 "Node 2" /proc/zoneinfo
Node 2, zone Normal
<snip>
node_scanned 0
spanned 8388608
present 7864320
managed 7864320
online_movable operation failed because of failure of changing
the memory zone from ZONE_NORMAL to ZONE_MOVABLE
Fixes: df429ac039 ("memory-hotplug: more general validation of zone during online")
Link: http://lkml.kernel.org/r/2f9c3837-33d7-b6e5-59c0-6ca4372b2d84@gmail.com
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Reviewed-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Modules implementing lwtunnel ops should not be allowed to unload
while there is state alive using those ops, so specify the owning
module for all lwtunnel ops.
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
First, log prefix will be truncated to NF_LOG_PREFIXLEN-1, i.e. 127,
at nf_log_packet(), so the extra part is useless.
Second, after adding a log rule with a very very long prefix, we will
fail to dump the nft rules after this _special_ one, but acctually,
they do exist. For example:
# name_65000=$(printf "%0.sQ" {1..65000})
# nft add rule filter output log prefix "$name_65000"
# nft add rule filter output counter
# nft add rule filter output counter
# nft list chain filter output
table ip filter {
chain output {
type filter hook output priority 0; policy accept;
}
}
So now, restrict the log prefix length to NF_LOG_PREFIXLEN-1.
Fixes: 96518518cc ("netfilter: add nftables")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
As the functionality to convert the MTU from a number to enum_ib_mtu
is ubiquitous, define a dedicated function and remove the duplicated
code.
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Xuan Qi reports that the Linux NFSv4 client failed to lock a file
that was migrated. The steps he observed on the wire:
1. The client sent a LOCK request to the source server
2. The source server replied NFS4ERR_MOVED
3. The client switched to the destination server
4. The client sent the same LOCK request to the destination
server with a bumped lock sequence ID
5. The destination server rejected the LOCK request with
NFS4ERR_BAD_SEQID
RFC 3530 section 8.1.5 provides a list of NFS errors which do not
bump a lock sequence ID.
However, RFC 3530 is now obsoleted by RFC 7530. In RFC 7530 section
9.1.7, this list has been updated by the addition of NFS4ERR_MOVED.
Reported-by: Xuan Qi <xuan.qi@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org # v3.7+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
- Fix a lockdep issue: the threaded irqchips also need their unique
key, and take this opportunity to get rid of the horrible macro and
replace it with a static inline.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYhcFEAAoJEEEQszewGV1z3sUP/2an5yxYN3FzXcpBKBVr5U4d
FpiU43HeWe3tv2Uwbe1bd5e6i51m3j2PBNOsch3DynjrhCOfafup5sL5B6DCdEHY
SI5j++ZLy+bRLpDRWMnWCCPRJs4JkEPj4KYH8iMC0s3hPX8p2zEOjVNZ6vLleq4R
7/jJTp/+NXq1BCE3cQHfam7dAuteYJqHLlweJ4J6fPc97OqLjz0vfCI+/8r4emXm
MRHL16KR+0Qz/Fg5LYa46c4o+EanN8QcUz4BIrS+2xuaRGzBfPLcjYAkABd+2gVw
R+sBfKj8qhV0bZCCBSrCshebO23bx4HU2EqBjGbXDMwIgmIiGpIXXgGu/uFSAA8u
kzjTlpFrbuinjGrr+cbmLAQvnj1DXPE/HRmRaSxZyvEbxMBO8nL2n8gJXDXxJ9j3
dBBoHDyPw1w+nDXYbNIx+pqUo5PRkJ4cFxdVN4J9owteivI2midIVDAymr9lSZoQ
q+h5tTRY2EDOm9Yyn983CU+rTsuY/Fyh72gjanFKWVqEbp6CI2ckIyhxGmGiPHSZ
/QdbarFBksxosBElcGvqnPI9DVLVwg6uJq7M7WLOh8Oi74zD6lm54pprsybvVaoV
5xvIQNjUPHB7quKYMi4EXthZO1H+BXyarzrc0+lqXle/KpqWQylXUQ8LONva5JbS
wyl3t54XWULK1DvWhLUu
=x1bG
-----END PGP SIGNATURE-----
Merge tag 'gpio-v4.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fix from Linus Walleij:
"A single lockdep fix, nothing else going on. This makes lockdep
noiseless and work properly with threaded GPIO IRQchips.
Summary:
Fix a lockdep issue: the threaded irqchips also need their unique key,
and take this opportunity to get rid of the horrible macro and replace
it with a static inline"
* tag 'gpio-v4.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: provide lockdep keys for nested/unnested irqchips
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYhUA5AAoJEAx081l5xIa+E9wP/j/2v9+L/bG+G7pNMnIuyEcy
5SB/ipC12Ek7j9cKtlyIBIXpAqXnDoKMd/a6AamC7a/w0gmluVHe1eoB4i326wRQ
fNDjD57it/N9QGHyyhHlRzOhwncP9tBHpfaHQPmZE68blusrb9JXgmbVbVRubSjB
fvwhK7aA6oWb2XhecWJ5UwVyYYB/E+Ca0hs//+6PRYI2MlHz4u1OOXD4jkkiFMdz
UcAAEqMOPNbHPYM3FsVs2Xo171AgFaeykoX4jt8+Kbyl91YYj0uyZ0BQA76midVz
U8d/UEE8c+4lg4wqyDjL0c5HhtaIekluuUa2yxi8t/1T4j58wDXxyjhlgJn4toN8
M10LBaCUsCj7Y3iZiYPc8QkMSY6GO1ITkhByw7LKILG1L/RfiVnixkzVtqyO02oA
b7R6HLS5ma63WiYf6HgxlgDlgwiDwU3lnvZcttJfOAPUN0SZsXUHG6i7vs6XX8eL
Y4wm+SB0605yRWd2bZ8FUWVvpifCYjCMH7vwFUNbwuSb4ZvsCmSyjNTSTGAbObQH
v7i+rIegaQ0PV9ribZ0HsIBnOzjO7Vr0O46ZpgWSU9G/QuYmMuOGHvnhfo7HfbaI
KRqRmlLULosp7DeM4Z8MFrNeNj+0OXaPacdfqPS86uCiU/wShyZUyCzJKeD6HTid
xVpiX27zOzt2zvQE/OzL
=H6SW
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-for-v4.10-rc6' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"drm fixes across the board.
Okay holidays and LCA kinda caught up with me, I thought I'd get some
of this dequeued last week, but Hobart was sunny and warm and not all
gloomy and rainy as usual.
This is a bit large, but not too much considering it's two weeks stuff
from AMD and Intel.
core:
- one locking fix that helps with dynamic suspend/resume races
i915:
- mostly GVT updates, GVT was a recent introduction so fixes for it
shouldn't cause any notable side effects.
amdgpu:
- a bunch of fixes for GPUs with a different memory controller design
that need different firmware.
exynos:
- decon regression fixes
msm:
- two regression fixes
etnaviv:
- a workaround for an mmu bug that needs a lot more work.
virtio:
- sparse fix, and a maintainers update"
* tag 'drm-fixes-for-v4.10-rc6' of git://people.freedesktop.org/~airlied/linux: (56 commits)
drm/exynos/decon5433: set STANDALONE_UPDATE_F on output enablement
drm/exynos/decon5433: fix CMU programming
drm/exynos/decon5433: do not disable video after reset
drm/i915: Ignore bogus plane coordinates on SKL when the plane is not visible
drm/i915: Remove WaDisableLSQCROPERFforOCL KBL workaround.
drm/amdgpu: add support for new hainan variants
drm/radeon: add support for new hainan variants
drm/amdgpu: change clock gating mode for uvd_v4.
drm/amdgpu: fix program vce instance logic error.
drm/amdgpu: fix bug set incorrect value to vce register
Revert "drm/amdgpu: Only update the CUR_SIZE register when necessary"
drm/msm: fix potential null ptr issue in non-iommu case
drm/msm/mdp5: rip out plane->pending tracking
drm/exynos/decon5433: set STANDALONE_UPDATE_F also if planes are disabled
drm/exynos/decon5433: update shadow registers iff there are active windows
drm/i915/gvt: rewrite gt reset handler using new function intel_gvt_reset_vgpu_locked
drm/i915/gvt: fix vGPU instance reuse issues by vGPU reset function
drm/i915/gvt: introduce intel_vgpu_reset_mmio() to reset mmio space
drm/i915/gvt: move mmio init/clean function to mmio.c
drm/i915/gvt: introduce intel_vgpu_reset_cfg_space to reset configuration space
...
ARM:
- Fix for timer setup on VHE machines
- Drop spurious warning when the timer races against the vcpu running
again
- Prevent a vgic deadlock when the initialization fails (for stable)
s390:
- Fix a kernel memory exposure (for stable)
x86:
- Fix exception injection when hypercall instruction cannot be patched
-----BEGIN PGP SIGNATURE-----
iQEcBAABCAAGBQJYglwIAAoJEED/6hsPKofoZp0H+gLLEeKP0Mu+olXiOWjB/KFp
WBDAR1872xIjvEcOl9l6AZgdmp2hk7KW1t+kJj5npgu237v6fHBO9ybqrAfhfU4l
PH23zOebL15HINcwCK6OcxOTiOtgae5Nui1cnLJBHDQgPTC/VmIE8NgV/qrMyo2r
Vth+K/cBLKiWG9JhyQvxmrfupNJUknLSH7CTnlO/fC8GEJzDfMpUl7B1Ui0TGK53
ExVgVLg3F28SErj9bUU8y4VJhMrwDAf2Kx2BNHqDbzXMzTdp0LrGRymFLl2/Gxez
zLtZDfGYYzEhPp1NuDydlxLb8ymnsQNB7K6Kau0w9JoAvOYwfUYfDt+GaTegwYM=
=dPtS
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
"ARM:
- Fix for timer setup on VHE machines
- Drop spurious warning when the timer races against the vcpu running
again
- Prevent a vgic deadlock when the initialization fails (for stable)
s390:
- Fix a kernel memory exposure (for stable)
x86:
- Fix exception injection when hypercall instruction cannot be
patched"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: s390: do not expose random data via facility bitmap
KVM: x86: fix fixing of hypercalls
KVM: arm/arm64: vgic: Fix deadlock on error handling
KVM: arm64: Access CNTHCTL_EL2 bit fields correctly on VHE systems
KVM: arm/arm64: Fix occasional warning from the timer work function
This is a set of 12 fixes including the mpt3sas one that was causing
hangs on ATA passthrough. The others are a couple of zoned block
device fixes, a SAS device detection bug which lead to SATA drives not
being matched to bays, two qla2xxx MSI fixes, a qla2xxx req for rsp
confusion caused by cut and paste, and a few other minor fixes.
Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJYgY/0AAoJEAVr7HOZEZN4avAP/1WxoiNBoQV16CsAyvSlbf0A
ZQ7cR0O/cWjI7ccoLUKHxbTJ57GzMTelfmrdRsVIbNeH74NGSP4+IxzshItSwKG0
/hz0MkCqAxRs/pxliZW8DM12cEbKDIm/t4XWjDst1KBdqakluigmAIniWitKnLE3
IeA0am4/KDf/n5tQm7CI+7GOT7g0QuRqAlfAM1MFXZGAzjw2RZcSQOVkPr6vW2I+
bvDUzmWDHOZ/3g5CRrgA6pGC/1FBVbYPF5iOQicCUuMjpniR4IVvcwek5ULUkCYl
5yZnx7dg/FEXcFawp3fTQ8SbxM6Dhkqkb5/Xe/E+wjjAOwF1I3lDwdV39nQk3aBK
TwIFgeZY8CS3Z+7e9rtlRFUL2kFWwKy5uhFM/6qmY9cpXFYnPN5JsrChq/7Tgb9t
XW3/4CoS2ijcNuvxGymvyJG2FpZsLaJfLfXw4kkEUtn0W3Dzbub2PrMfPUUn7hQO
rmk3ZEuoSANvsZ9n647Gpvo7iQBbC7blNf8krSLJvoopN6SUUmLCsePXEHxorBxz
r7xAkHFqapOjBn28u4JAn5po7aVa3/fQwBuUF6UvfuVJKbvnobl4Qx0O/Fqadd7s
bix9OfCG1GMyEnUBLEWPpBg0jS2JZbLupeXVssiATks93KV2SefiQlEhYwA5hOS5
6+PykxkIEjEmU7asjQJ0
=8y3/
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is a set of 12 fixes including the mpt3sas one that was causing
hangs on ATA passthrough.
The others are a couple of zoned block device fixes, a SAS device
detection bug which lead to SATA drives not being matched to bays, two
qla2xxx MSI fixes, a qla2xxx req for rsp confusion caused by cut and
paste, and a few other minor fixes"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: mpt3sas: fix hang on ata passthrough commands
scsi: lpfc: Set elsiocb contexts to NULL after freeing it
scsi: sd: Ignore zoned field for host-managed devices
scsi: sd: Fix wrong DPOFUA disable in sd_read_cache_type
scsi: bfa: fix wrongly initialized variable in bfad_im_bsg_els_ct_request()
scsi: ses: Fix SAS device detection in enclosure
scsi: libfc: Fix variable name in fc_set_wwpn
scsi: lpfc: avoid double free of resource identifiers
scsi: qla2xxx: remove irq_affinity_notifier
scsi: qla2xxx: fix MSI-X vector affinity
scsi: qla2xxx: Fix apparent cut-n-paste error.
scsi: qla2xxx: Get mutex lock before checking optrom_state
Commit 501db51139 ("virtio: don't set VIRTIO_NET_HDR_F_DATA_VALID on
xmit") in fact disables VIRTIO_HDR_F_DATA_VALID on receiving path too,
fixing this by adding a hint (has_data_valid) and set it only on the
receiving path.
Cc: Rolf Neugebauer <rolf.neugebauer@docker.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Rolf Neugebauer <rolf.neugebauer@docker.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Revert commit 08b98d3291 (PM / sleep / ACPI: Use the ACPI_FADT_LOW_POWER_S0
flag) as it caused system suspend (in the default configuration) to fail
on Dell XPS13 (9360) with the Kaby Lake processor.
Fixes: 08b98d3291 (PM / sleep / ACPI: Use the ACPI_FADT_LOW_POWER_S0 flag)
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The helper function for adding a GPIO chip compiles in a lockdep
key for debugging, the same key is needed for nested chips as
well.
The macro construction is unreadable, replace this with two
static inlines instead.
The _gpiochip_irqchip_add prefixed function is not helpful,
rename it with gpiochip_irqchip_add_key() that tell us what the
function is actually doing.
Fixes: d245b3f9bd ("gpio: simplify adding threaded interrupts")
Cc: Roger Quadros <rogerq@ti.com>
Reported-by: Clemens Gruber <clemens.gruber@pqgruber.com>
Reported-by: Roger Quadros <rogerq@ti.com>
Reported-by: Grygorii Strashko <grygorii.strashko@ti.com>
Tested-by: Clemens Gruber <clemens.gruber@pqgruber.com>
Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
This patch adds two helpers, bpf_map_area_alloc() and bpf_map_area_free(),
that are to be used for map allocations. Using kmalloc() for very large
allocations can cause excessive work within the page allocator, so i) fall
back earlier to vmalloc() when the attempt is considered costly anyway,
and even more importantly ii) don't trigger OOM killer with any of the
allocators.
Since this is based on a user space request, for example, when creating
maps with element pre-allocation, we really want such requests to fail
instead of killing other user space processes.
Also, don't spam the kernel log with warnings should any of the allocations
fail under pressure. Given that, we can make backend selection in
bpf_map_area_alloc() generic, and convert all maps over to use this API
for spots with potentially large allocation requests.
Note, replacing the one kmalloc_array() is fine as overflow checks happen
earlier in htab_map_alloc(), since it must also protect the multiplication
for vmalloc() should kmalloc_array() fail.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trying to add an mpls encap route when the MPLS modules are not loaded
hangs. For example:
CONFIG_MPLS=y
CONFIG_NET_MPLS_GSO=m
CONFIG_MPLS_ROUTING=m
CONFIG_MPLS_IPTUNNEL=m
$ ip route add 10.10.10.10/32 encap mpls 100 via inet 10.100.1.2
The ip command hangs:
root 880 826 0 21:25 pts/0 00:00:00 ip route add 10.10.10.10/32 encap mpls 100 via inet 10.100.1.2
$ cat /proc/880/stack
[<ffffffff81065a9b>] call_usermodehelper_exec+0xd6/0x134
[<ffffffff81065efc>] __request_module+0x27b/0x30a
[<ffffffff814542f6>] lwtunnel_build_state+0xe4/0x178
[<ffffffff814aa1e4>] fib_create_info+0x47f/0xdd4
[<ffffffff814ae451>] fib_table_insert+0x90/0x41f
[<ffffffff814a8010>] inet_rtm_newroute+0x4b/0x52
...
modprobe is trying to load rtnl-lwt-MPLS:
root 881 5 0 21:25 ? 00:00:00 /sbin/modprobe -q -- rtnl-lwt-MPLS
and it hangs after loading mpls_router:
$ cat /proc/881/stack
[<ffffffff81441537>] rtnl_lock+0x12/0x14
[<ffffffff8142ca2a>] register_netdevice_notifier+0x16/0x179
[<ffffffffa0033025>] mpls_init+0x25/0x1000 [mpls_router]
[<ffffffff81000471>] do_one_initcall+0x8e/0x13f
[<ffffffff81119961>] do_init_module+0x5a/0x1e5
[<ffffffff810bd070>] load_module+0x13bd/0x17d6
...
The problem is that lwtunnel_build_state is called with rtnl lock
held preventing mpls_init from registering.
Given the potential references held by the time lwtunnel_build_state it
can not drop the rtnl lock to the load module. So, extract the module
loading code from lwtunnel_build_state into a new function to validate
the encap type. The new function is called while converting the user
request into a fib_config which is well before any table, device or
fib entries are examined.
Fixes: 745041e2aa ("lwtunnel: autoload of lwt modules")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull SMP hotplug update from Thomas Gleixner:
"This contains a trivial typo fix and an extension to the core code for
dynamically allocating states in the prepare stage.
The extension is necessary right now because we need a proper way to
unbreak LTTNG, which iscurrently non functional due to the removal of
the notifiers. Surely it's out of tree, but it's widely used by
distros.
The simple solution would have been to reserve a state for LTTNG, but
I'm not fond about unused crap in the kernel and the dynamic range,
which we admittedly should have done right away, allows us to remove
quite some of the hardcoded states, i.e. those which have no ordering
requirements. So doing the right thing now is better than having an
smaller intermediate solution which needs to be reworked anyway"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
cpu/hotplug: Provide dynamic range for prepare stage
perf/x86/amd/ibs: Fix typo after cleanup state names in cpu/hotplug
Pull RCU fixes from Ingo Molnar:
"This fixes sporadic ACPI related hangs in synchronize_rcu() that were
caused by the ACPI code mistakenly relying on an aspect of RCU that
was neither promised to work nor reliable but which happened to work -
until in v4.9 we changed the RCU implementation, which made the hangs
more prominent.
Since the mis-use of the RCU facility wasn't properly detected and
prevented either, these fixes make the RCU side work reliably instead
of working around the problem in the ACPI code.
Hence the slightly larger diffstat that goes beyond the normal scope
of RCU fixes in -rc kernels"
* 'rcu-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rcu: Narrow early boot window of illegal synchronous grace periods
rcu: Remove cond_resched() from Tiny synchronize_sched()
- Fix out-of-tree module breakage when it supplies its own
definitions of true and false
Signed-off-by: Jessica Yu <jeyu@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABCgAGBQJYfm7sAAoJEMBFfjjOO8FyrmMP/j1Sa179+uBWWPE0Td7ip6yj
EgvOtZGcnZfMuHbs5Evn8Fnz5K3of3IriiJNPePuQPu/YnoidltxkWOMXYkCj+Fn
acW8VRtrh2urec70gRapuTmSpxs1I/XLUdNG+Ozm0FFX+L+k0ydCqEPGuVkwyHNK
Wn31lVTiqx+zWm5PAJBzD6dEchQ0h2uppHRmZ+mIn3GyvYavIGnMMkdjqEEq9v8w
UYdw52AJFAGMDO8LoSihX5cFbe0E28A58jkJuJ5AKXglaY6Nvl2xWOxfLhFnxO1m
7KFuf+q2YO10hoJtdItEmPw2iC8pIgoAUGpZ+4h0iSWxyUC5V4QEmrhe4q9CtOLD
+dfcd+44UekvWiWL4AQUO6IsUzIo8UqsJYf4Tic4/EjAKZtGTseKjQqCgBv3kJA+
nN3hJ9gMN4NZWOMLihSn7Ml/whrxchdqlEP520nzGTnWUaLOUPp4XhfNlDaAH58K
WfxiT0L6w+Cbg3xMCZRxQyqlJWWw8x1CM7B6eScHvN67TulC2enIQYTkv6eDOzQX
DPz4lvcFisjASFP+i+3ouYL2pfLnm/IUG9K1wieqBvPHEdeZBuGr7+VEHjFmhhmG
f3kKvYsRgUQF8tKeGtI0uPxnZNw4z4QaYOVKf8bISzpsIHOezdaouOm+KGUctHqO
DNIWMf34W7fE5AVrGKp6
=ybIc
-----END PGP SIGNATURE-----
Merge tag 'modules-for-v4.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux
Pull modules fix from Jessica Yu:
- fix out-of-tree module breakage when it supplies its own definitions
of true and false
* tag 'modules-for-v4.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
taint/module: Fix problems when out-of-kernel driver defines true or false
This patch part reverts fd2a0437dc and e858fae2b0 which introduced a
subtle change in how the virtio_net flags are derived from the SKBs
ip_summed field.
With the above commits, the flags are set to VIRTIO_NET_HDR_F_DATA_VALID
when ip_summed == CHECKSUM_UNNECESSARY, thus treating it differently to
ip_summed == CHECKSUM_NONE, which should be the same.
Further, the virtio spec 1.0 / CS04 explicitly says that
VIRTIO_NET_HDR_F_DATA_VALID must not be set by the driver.
Fixes: fd2a0437dc ("virtio_net: introduce virtio_net_hdr_{from,to}_skb")
Fixes: e858fae2b0 (" virtio_net: use common code for virtio_net_hdr and skb GSO conversion")
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@docker.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 7fd8329ba5 ("taint/module: Clean up global and module taint
flags handling") used the key words true and false as character members
of a new struct. These names cause problems when out-of-kernel modules
such as VirtualBox include their own definitions of true and false.
Fixes: 7fd8329ba5 ("taint/module: Clean up global and module taint flags handling")
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Jessica Yu <jeyu@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jessica Yu <jeyu@redhat.com>
Pull networking fixes from David Miller:
1) Handle multicast packets properly in fast-RX path of mac80211, from
Johannes Berg.
2) Because of a logic bug, the user can't actually force SW
checksumming on r8152 devices. This makes diagnosis of hw
checksumming bugs really annoying. Fix from Hayes Wang.
3) VXLAN route lookup does not take the source and destination ports
into account, which means IPSEC policies cannot be matched properly.
Fix from Martynas Pumputis.
4) Do proper RCU locking in netvsc callbacks, from Stephen Hemminger.
5) Fix SKB leaks in mlxsw driver, from Arkadi Sharshevsky.
6) If lwtunnel_fill_encap() fails, we do not abort the netlink message
construction properly in fib_dump_info(), from David Ahern.
7) Do not use kernel stack for DMA buffers in atusb driver, from Stefan
Schmidt.
8) Openvswitch conntack actions need to maintain a correct checksum,
fix from Lance Richardson.
9) ax25_disconnect() is missing a check for ax25->sk being NULL, in
fact it already checks this, but not in all of the necessary spots.
Fix from Basil Gunn.
10) Action GET operations in the packet scheduler can erroneously bump
the reference count of the entry, making it unreleasable. Fix from
Jamal Hadi Salim. Jamal gives a great set of example command lines
that trigger this in the commit message.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (46 commits)
net sched actions: fix refcnt when GETing of action after bind
net/mlx4_core: Eliminate warning messages for SRQ_LIMIT under SRIOV
net/mlx4_core: Fix when to save some qp context flags for dynamic VST to VGT transitions
net/mlx4_core: Fix racy CQ (Completion Queue) free
net: stmmac: don't use netdev_[dbg, info, ..] before net_device is registered
net/mlx5e: Fix a -Wmaybe-uninitialized warning
ax25: Fix segfault after sock connection timeout
bpf: rework prog_digest into prog_tag
tipc: allocate user memory with GFP_KERNEL flag
net: phy: dp83867: allow RGMII_TXID/RGMII_RXID interface types
ip6_tunnel: Account for tunnel header in tunnel MTU
mld: do not remove mld souce list info when set link down
be2net: fix MAC addr setting on privileged BE3 VFs
be2net: don't delete MAC on close on unprivileged BE3 VFs
be2net: fix status check in be_cmd_pmac_add()
cpmac: remove hopeless #warning
ravb: do not use zero-length alignment DMA descriptor
mlx4: do not call napi_schedule() without care
openvswitch: maintain correct checksum state in conntrack actions
tcp: fix tcp_fastopen unaligned access complaints on sparc
...
- Fix for timer setup on VHE machines
- Drop spurious warning when the timer races against
the vcpu running again
- Prevent a vgic deadlock when the initialization fails
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJYeLkqAAoJECPQ0LrRPXpDpasP/iynAP2/28LNQ7L8r3be0m7N
PCZv3gpsTtu1cmTlpiPQVPqzBUVPXQMJJAnDIhA7e1YbYfYJUIhEirVUQ56oT+lY
nmnoquzNFGUNkYRHUj8RTeQ8ISuP/iPlWOVlSh4n2rbrHFAxX1+W1xXdOedbdFJp
oFDRR6DCB4qdIxt7dtTcfObhwVlro8WDOfClNqiUDUdhxfDBD7mEPYNO2J2D0/ca
6YQAkxPzP/MJgKl0mVRjjxhkoNSt6lP0uyU65X67g7i2ZlFRyU3PvqwPm3JOY++V
Uuff2ud81D5yYwf0+6sA+i+707jezgsoiuFnwUFQVjW/9rpIK7W1gzqht2VWZNgJ
NjhFPtNhOFkam/sxPFiSaTNyhUNgrg6C/qPDBF0pPlyGd0mtrvJdVG4y/R9YGSB3
JoiwaFdoi9buEakpRbhu4y5bhcDZZPAo+l+2OanpeE6y3zypqTsJRop8H/qhnv8R
FFzqycBjeEcVo9ZKIpqQVEnp0njBcHKmqblMswgSgGKe3816iGC+kj4oMm7J57Yq
vy1OKQMuur+rSRyH7LpVN1HYiq9yBHdbHsXxUjEWSD5dtPAGdOhAP0kAzXlPbEot
WkVQO2uEV3EKs0UQjWkLKNVJKh/pPaWH3z8ENRlf+pASzvO+UYRmY3sjfDcg3AsI
7Lxc+CVJJ4ExJbFe/OTk
=7cIH
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm-for-4.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm
KVM/ARM updates for 4.10-rc4
- Fix for timer setup on VHE machines
- Drop spurious warning when the timer races against
the vcpu running again
- Prevent a vgic deadlock when the initialization fails
Currently if the userspace declares a int variable to store the out_fence
fd and pass it to OUT_FENCE_PTR the kernel will overwrite the 32 bits
above the int variable on 64 bits systems.
Fix this by making the internal storage of out_fence in the kernel a s32
pointer.
Reported-by: Chad Versace <chadversary@chromium.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Fixes: beaf5af480 ("drm/fence: add out-fences support")
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-and-Tested-by: Chad Versace <chadversary@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1484317329-9293-1-git-send-email-gustavo@padovan.org
The parameter name should be wwpn instead of wwnn.
Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit 7bd509e311 ("bpf: add prog_digest and expose it via
fdinfo/netlink") was recently discussed, partially due to
admittedly suboptimal name of "prog_digest" in combination
with sha1 hash usage, thus inevitably and rightfully concerns
about its security in terms of collision resistance were
raised with regards to use-cases.
The intended use cases are for debugging resp. introspection
only for providing a stable "tag" over the instruction sequence
that both kernel and user space can calculate independently.
It's not usable at all for making a security relevant decision.
So collisions where two different instruction sequences generate
the same tag can happen, but ideally at a rather low rate. The
"tag" will be dumped in hex and is short enough to introspect
in tracepoints or kallsyms output along with other data such
as stack trace, etc. Thus, this patch performs a rename into
prog_tag and truncates the tag to a short output (64 bits) to
make it obvious it's not collision-free.
Should in future a hash or facility be needed with a security
relevant focus, then we can think about requirements, constraints,
etc that would fit to that situation. For now, rework the exposed
parts for the current use cases as long as nothing has been
released yet. Tested on x86_64 and s390x.
Fixes: 7bd509e311 ("bpf: add prog_digest and expose it via fdinfo/netlink")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
bugs.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYfOk6AAoJECebzXlCjuG+Lj4QALaLKRRbIdrz6nmg7gUmpTWc
CdW8NMbzwSCXmYoivsTHBlhXZKsi5vVjnFXMCM/P85ddmipXdcTFCDLmmNoKUQ0M
jODlLX90ctaZKCDBVSaH4htAz2gkFv7z5IllX0YDQqHyiuzh/9KoV+AFCgPZPTpL
O1XRmfWz+yJDydz4hb3i5f2JvMk9P/tCXLnheuxxTIMSl2/fIfgF81eWwDpFqcA2
27+PyWWjZehVnZ77ca/mWJj2n0+gBINiKafcfF39NK/Hv2q4aauB3k7c4blecc9Q
m/IT3mKifvHvdNCmvHD5s74h4OikEGYpqaSjonMptZnWgfM4/gtF7yTiQjsOMDx/
w6W/tfHlGrvegpzhjaIaoZZ50EZp7xwGNNZYgH4J44kytYpolrhsOR6NqCLTqpej
xG2Kd89ZtnAgc/7T7ET/1PqpZ8f9M9pyV3E8s36OvF4AYQUNrfzbWSTQcZy3WGBP
YuoUCzacIbNbGgu4m6Zx5l/vKW5yn45xbUMp7T9S4WoxYMx6a5vViU0NiF7KsQDu
pcDT92DZ57KJFtCw7Ig08ILKsSXmNApH5/4mIrkX3quZuH4j2XapEJ9u//fmfZBd
Q+Sgv8RXcGELUJIg9yfmoWgPDA/oYslc7ynBV0lXLNgBuod//dGSlZ+6KfFFJYr8
XVOxwPTiiBIlc9lvB9eA
=tb4L
-----END PGP SIGNATURE-----
Merge tag 'nfsd-4.10-1' of git://linux-nfs.org/~bfields/linux
Pull nfsd fixes from Bruce Fields:
"Miscellaneous nfsd bugfixes, one for a 4.10 regression, three for
older bugs"
* tag 'nfsd-4.10-1' of git://linux-nfs.org/~bfields/linux:
svcrdma: avoid duplicate dma unmapping during error recovery
sunrpc: don't call sleeping functions from the notifier block callbacks
svcrpc: don't leak contexts on PROC_DESTROY
nfsd: fix supported attributes for acl & labels
Currently, we check the existing rtable in PREROUTING hook, if RTCF_LOCAL
is set, we assume that the packet is loopback.
But this assumption is incorrect, for example, a packet encapsulated
in ipsec transport mode was received and routed to local, after
decapsulation, it would be delivered to local again, and the rtable
was not dropped, so RTCF_LOCAL check would trigger. But actually, the
packet was not loopback.
So for these normal loopback packets, we can check whether the in device
is IFF_LOOPBACK or not. For these locally generated broadcast/multicast,
we can check whether the skb->pkt_type is PACKET_LOOPBACK or not.
Finally, there's a subtle difference between nft fib expr and xtables
rpfilter extension, user can add the following nft rule to do strict
rpfilter check:
# nft add rule x y meta iif eth0 fib saddr . iif oif != eth0 drop
So when the packet is loopback, it's better to store the in device
instead of the LOOPBACK_IFINDEX, otherwise, after adding the above
nft rule, locally generated broad/multicast packets will be dropped
incorrectly.
Fixes: f83a7ea207 ("netfilter: xt_rpfilter: skip locally generated broadcast/multicast, too")
Fixes: f6d0cbcf09 ("netfilter: nf_tables: add fib expression")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
o s/numerice/numeric
o s/opertaor/operator
Signed-off-by: Alexander Alemayhu <alexander@alemayhu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Mathieu reported that the LTTNG modules are broken as of 4.10-rc1 due to
the removal of the cpu hotplug notifiers.
Usually I don't care much about out of tree modules, but LTTNG is widely
used in distros. There are two ways to solve that:
1) Reserve a hotplug state for LTTNG
2) Add a dynamic range for the prepare states.
While #1 is the simplest solution, #2 is the proper one as we can convert
in tree users, which do not care about ordering, to the dynamic range as
well.
Add a dynamic range which allows LTTNG to request states in the prepare
stage.
Reported-and-tested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Sewior <bigeasy@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1701101353010.3401@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull an urgent RCU fix from Paul E. McKenney:
"This series contains a pair of commits that permit RCU synchronous grace
periods (synchronize_rcu() and friends) to work correctly throughout boot.
This eliminates the current "dead time" starting when the scheduler spawns
its first taks and ending when the last of RCU's kthreads is spawned
(this last happens during early_initcall() time). Although RCU's
synchronous grace periods have long been documented as not working
during this time, prior to 4.9, the expedited grace periods worked by
accident, and some ACPI code came to rely on this unintentional behavior.
(Note that this unintentional behavior was -not- reliable. For example,
failures from ACPI could occur on !SMP systems and on systems booting
with the rcu_normal kernel boot parameter.)
Either way, there is a bug that needs fixing, and the 4.9 switch of RCU's
expedited grace periods to workqueues could be considered to have caused
a regression. This series therefore makes RCU's expedited grace periods
operate correctly throughout the boot process. This has been demonstrated
to fix the problems ACPI was encountering, and has the added longer-term
benefit of simplifying RCU's behavior."
Signed-off-by: Ingo Molnar <mingo@kernel.org>
in actually sending them out - will try to do better in
the future:
* handle VHT opmode properly when hostapd is controlling
full station state
* two fixes for minimum channel width in mac80211
* don't leave SMPS set to junk in HT capabilities
* fix headroom when forwarding mesh packets, recently
broken by another fix that failed to take into account
frame encryption
* fix the TID in null-data packets indicating EOSP (end
of service period) in U-APSD
* prevent attempting to use (and then failing which
results in crashes) TXQs on stations that aren't added
to the driver yet
-----BEGIN PGP SIGNATURE-----
iQIcBAABCgAGBQJYeNveAAoJEGt7eEactAAdCjgP+gIIUjpH08MazRE9Nl6gqRGW
G6EAoaQDQ8AjiVo8nIJ33X2jmHRIPcyelyfMhdww3AtMJRrvl9wVhQvwKWI4+l6D
o8UVMBu66fq2luQA6AtOkkU6SkKwC7Se72Mdx6OA/zvdZz4/9Y4nZpOFPVwCbxqL
UJRzjrhnbS51YgL/Y0s0Vp+7Rv7IYCuH9JORNarMO5sYxaVpLWihJhbShX1bOshw
uFTHOjNRseImLhM4GOvVA7fSUiK8jxEuMECmlDKQB/6nVxMskE54yLOqMB5Ys6va
2CKjp5xqM4FfqB4LMNp7soJAXUOXvqk25JXAAbkVNo4VRd2Y4GoJ7XdBPqd4kfKb
rPlr+UP2xaSQsfqIoe+uFr/lVUmm4oCTrS/Mo7YVrjSMU7fntYwZccr9aw5jrQFW
YU+1QTF25HEb1SL18FH9JNWpoTlOlpB3bwpbAW4BHzsqDYc76CE/oyDLdZ6zQYlu
z92cLuycGTwgNhi1zDNThxB81zhZsWH1Jbh9ppZBdDPxo6E4DMK71okreGAXBEMJ
IdFZZqJbYW4I/TsUtse4atozk6oXlTEFY/pX4Qm0gwGwmqQx+wfSNLbymsD7gR42
OkL+bZ701HBl2gJUWuICrM2lFWtD4/o6oWpaW2I7QUAQhjvUr4ld9kthVMF2vdlt
w316EIfZQzBKw7Z7zjZI
=GU6q
-----END PGP SIGNATURE-----
Merge tag 'mac80211-for-davem-2017-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
We have a number of fixes, in part because I was late
in actually sending them out - will try to do better in
the future:
* handle VHT opmode properly when hostapd is controlling
full station state
* two fixes for minimum channel width in mac80211
* don't leave SMPS set to junk in HT capabilities
* fix headroom when forwarding mesh packets, recently
broken by another fix that failed to take into account
frame encryption
* fix the TID in null-data packets indicating EOSP (end
of service period) in U-APSD
* prevent attempting to use (and then failing which
results in crashes) TXQs on stations that aren't added
to the driver yet
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull i2c fixes from Wolfram Sang:
"Bugfixes for I2C. Mostly core this time which is a bit unusual but
nothing really scary in there"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: piix4: Avoid race conditions with IMC
i2c: fix spelling mistake: "insufficent" -> "insufficient"
i2c: print correct device invalid address
i2c: do not enable fall back to Host Notify by default
i2c: fix kernel memory disclosure in dev interface
Pull perf fixes from Ingo Molnar:
"Misc race fixes uncovered by fuzzing efforts, a Sparse fix, two PMU
driver fixes, plus miscellanous tooling fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86: Reject non sampling events with precise_ip
perf/x86/intel: Account interrupts for PEBS errors
perf/core: Fix concurrent sys_perf_event_open() vs. 'move_group' race
perf/core: Fix sys_perf_event_open() vs. hotplug
perf/x86/intel: Use ULL constant to prevent undefined shift behaviour
perf/x86/intel/uncore: Fix hardcoded socket 0 assumption in the Haswell init code
perf/x86: Set pmu->module in Intel PMU modules
perf probe: Fix to probe on gcc generated symbols for offline kernel
perf probe: Fix --funcs to show correct symbols for offline module
perf symbols: Robustify reading of build-id from sysfs
perf tools: Install tools/lib/traceevent plugins with install-bin
tools lib traceevent: Fix prev/next_prio for deadline tasks
perf record: Fix --switch-output documentation and comment
perf record: Make __record_options static
tools lib subcmd: Add OPT_STRING_OPTARG_SET option
perf probe: Fix to get correct modname from elf header
samples/bpf trace_output_user: Remove duplicate sys/ioctl.h include
samples/bpf sock_example: Avoid getting ethhdr from two includes
perf sched timehist: Show total scheduling time
Pull EFI fixes from Ingo Molnar:
"A number of regression fixes:
- Fix a boot hang on machines that have somewhat unusual memory map
entries of phys_addr=0x0 num_pages=0, which broke due to a recent
commit. This commit got cherry-picked from the v4.11 queue because
the bug is affecting real machines.
- Fix a boot hang also reported by KASAN, caused by incorrect init
ordering introduced by a recent optimization.
- Fix a recent robustification fix to allocate_new_fdt_and_exit_boot()
that introduced an invalid assumption. Neither bugs were seen in
the wild AFAIK"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/x86: Prune invalid memory map entries and fix boot regression
x86/efi: Don't allocate memmap through memblock after mm_init()
efi/libstub/arm*: Pass latest memory map to the kernel
The current preemptible RCU implementation goes through three phases
during bootup. In the first phase, there is only one CPU that is running
with preemption disabled, so that a no-op is a synchronous grace period.
In the second mid-boot phase, the scheduler is running, but RCU has
not yet gotten its kthreads spawned (and, for expedited grace periods,
workqueues are not yet running. During this time, any attempt to do
a synchronous grace period will hang the system (or complain bitterly,
depending). In the third and final phase, RCU is fully operational and
everything works normally.
This has been OK for some time, but there has recently been some
synchronous grace periods showing up during the second mid-boot phase.
This code worked "by accident" for awhile, but started failing as soon
as expedited RCU grace periods switched over to workqueues in commit
8b355e3bc1 ("rcu: Drive expedited grace periods from workqueue").
Note that the code was buggy even before this commit, as it was subject
to failure on real-time systems that forced all expedited grace periods
to run as normal grace periods (for example, using the rcu_normal ksysfs
parameter). The callchain from the failure case is as follows:
early_amd_iommu_init()
|-> acpi_put_table(ivrs_base);
|-> acpi_tb_put_table(table_desc);
|-> acpi_tb_invalidate_table(table_desc);
|-> acpi_tb_release_table(...)
|-> acpi_os_unmap_memory
|-> acpi_os_unmap_iomem
|-> acpi_os_map_cleanup
|-> synchronize_rcu_expedited
The kernel showing this callchain was built with CONFIG_PREEMPT_RCU=y,
which caused the code to try using workqueues before they were
initialized, which did not go well.
This commit therefore reworks RCU to permit synchronous grace periods
to proceed during this mid-boot phase. This commit is therefore a
fix to a regression introduced in v4.9, and is therefore being put
forward post-merge-window in v4.10.
This commit sets a flag from the existing rcu_scheduler_starting()
function which causes all synchronous grace periods to take the expedited
path. The expedited path now checks this flag, using the requesting task
to drive the expedited grace period forward during the mid-boot phase.
Finally, this flag is updated by a core_initcall() function named
rcu_exp_runtime_mode(), which causes the runtime codepaths to be used.
Note that this arrangement assumes that tasks are not sent POSIX signals
(or anything similar) from the time that the first task is spawned
through core_initcall() time.
Fixes: 8b355e3bc1 ("rcu: Drive expedited grace periods from workqueue")
Reported-by: "Zheng, Lv" <lv.zheng@intel.com>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Stan Kain <stan.kain@gmail.com>
Tested-by: Ivan <waffolz@hotmail.com>
Tested-by: Emanuel Castelo <emanuel.castelo@gmail.com>
Tested-by: Bruno Pesavento <bpesavento@infinito.it>
Tested-by: Borislav Petkov <bp@suse.de>
Tested-by: Frederic Bezies <fredbezies@gmail.com>
Cc: <stable@vger.kernel.org> # 4.9.0-
Pull vfs fixes from Al Viro.
The most notable fix here is probably the fix for a splice regression
("fix a fencepost error in pipe_advance()") noticed by Alan Wylie.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fix a fencepost error in pipe_advance()
coredump: Ensure proper size of sparse core files
aio: fix lock dep warning
tmpfs: clear S_ISGID when setting posix ACLs
Pull block fixes from Jens Axboe:
- the virtio_blk stack DMA corruption fix from Christoph, fixing and
issue with VMAP stacks.
- O_DIRECT blkbits calculation fix from Chandan.
- discard regression fix from Christoph.
- queue init error handling fixes for nbd and virtio_blk, from Omar and
Jeff.
- two small nvme fixes, from Christoph and Guilherme.
- rename of blk_queue_zone_size and bdev_zone_size to _sectors instead,
to more closely follow what we do in other places in the block layer.
This interface is new for this series, so let's get the naming right
before releasing a kernel with this feature. From Damien.
* 'for-linus' of git://git.kernel.dk/linux-block:
block: don't try to discard from __blkdev_issue_zeroout
sd: remove __data_len hack for WRITE SAME
nvme: use blk_rq_payload_bytes
scsi: use blk_rq_payload_bytes
block: add blk_rq_payload_bytes
block: Rename blk_queue_zone_size and bdev_zone_size
nvme: apply DELAY_BEFORE_CHK_RDY quirk at probe time too
nvme-rdma: fix nvme_rdma_queue_is_ready
virtio_blk: fix panic in initialization error path
nbd: blk_mq_init_queue returns an error code on failure, not NULL
virtio_blk: avoid DMA to stack for the sense buffer
do_direct_IO: Use inode->i_blkbits to compute block count to be cleaned
If the last section of a core file ends with an unmapped or zero page,
the size of the file does not correspond with the last dump_skip() call.
gdb complains that the file is truncated and can be confusing to users.
After all of the vma sections are written, make sure that the file size
is no smaller than the current file position.
This problem can be demonstrated with gdb's bigcore testcase on the
sparc architecture.
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Some machines, such as the Lenovo ThinkPad W541 with firmware GNET80WW
(2.28), include memory map entries with phys_addr=0x0 and num_pages=0.
These machines fail to boot after the following commit,
commit 8e80632fb2 ("efi/esrt: Use efi_mem_reserve() and avoid a kmalloc()")
Fix this by removing such bogus entries from the memory map.
Furthermore, currently the log output for this case (with efi=debug)
looks like:
[ 0.000000] efi: mem45: [Reserved | | | | | | | | | | | | ] range=[0x0000000000000000-0xffffffffffffffff] (0MB)
This is clearly wrong, and also not as informative as it could be. This
patch changes it so that if we find obviously invalid memory map
entries, we print an error and skip those entries. It also detects the
display of the address range calculation overflow, so the new output is:
[ 0.000000] efi: [Firmware Bug]: Invalid EFI memory map entries:
[ 0.000000] efi: mem45: [Reserved | | | | | | | | | | | | ] range=[0x0000000000000000-0x0000000000000000] (invalid)
It also detects memory map sizes that would overflow the physical
address, for example phys_addr=0xfffffffffffff000 and
num_pages=0x0200000000000001, and prints:
[ 0.000000] efi: [Firmware Bug]: Invalid EFI memory map entries:
[ 0.000000] efi: mem45: [Reserved | | | | | | | | | | | | ] range=[phys_addr=0xfffffffffffff000-0x20ffffffffffffffff] (invalid)
It then removes these entries from the memory map.
Signed-off-by: Peter Jones <pjones@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[ardb: refactor for clarity with no functional changes, avoid PAGE_SHIFT]
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
[Matt: Include bugzilla info in commit log]
Cc: <stable@vger.kernel.org> # v4.9+
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=191121
Signed-off-by: Ingo Molnar <mingo@kernel.org>
It's possible to set up PEBS events to get only errors and not
any data, like on SNB-X (model 45) and IVB-EP (model 62)
via 2 perf commands running simultaneously:
taskset -c 1 ./perf record -c 4 -e branches:pp -j any -C 10
This leads to a soft lock up, because the error path of the
intel_pmu_drain_pebs_nhm() does not account event->hw.interrupt
for error PEBS interrupts, so in case you're getting ONLY
errors you don't have a way to stop the event when it's over
the max_samples_per_tick limit:
NMI watchdog: BUG: soft lockup - CPU#22 stuck for 22s! [perf_fuzzer:5816]
...
RIP: 0010:[<ffffffff81159232>] [<ffffffff81159232>] smp_call_function_single+0xe2/0x140
...
Call Trace:
? trace_hardirqs_on_caller+0xf5/0x1b0
? perf_cgroup_attach+0x70/0x70
perf_install_in_context+0x199/0x1b0
? ctx_resched+0x90/0x90
SYSC_perf_event_open+0x641/0xf90
SyS_perf_event_open+0x9/0x10
do_syscall_64+0x6c/0x1f0
entry_SYSCALL64_slow_path+0x25/0x25
Add perf_event_account_interrupt() which does the interrupt
and frequency checks and call it from intel_pmu_drain_pebs_nhm()'s
error path.
We keep the pending_kill and pending_wakeup logic only in the
__perf_event_overflow() path, because they make sense only if
there's any data to deliver.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vince@deater.net>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1482931866-6018-2-git-send-email-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull btrfs fixes from Chris Mason:
"These are all over the place.
The tracepoint part of the pull fixes a crash and adds a little more
information to two tracepoints, while the rest are good old fashioned
fixes"
* 'for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: make tracepoint format strings more compact
Btrfs: add truncated_len for ordered extent tracepoints
Btrfs: add 'inode' for extent map tracepoint
btrfs: fix crash when tracepoint arguments are freed by wq callbacks
Btrfs: adjust outstanding_extents counter properly when dio write is split
Btrfs: fix lockdep warning about log_mutex
Btrfs: use down_read_nested to make lockdep silent
btrfs: fix locking when we put back a delayed ref that's too new
btrfs: fix error handling when run_delayed_extent_op fails
btrfs: return the actual error value from from btrfs_uuid_tree_iterate
other buggy modules!)
* two NULL pointer dereferences from syzkaller
* CVE from syzkaller, very serious on 4.10-rc, "just" kernel memory
leak on releases
* CVE from security@kernel.org, somewhat serious on AMD, less so on
Intel
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJYd7l5AAoJEL/70l94x66DLWYH/0GUg+lK9J/gj0kwqi6BwsOP
Rrs5Y7XvyNLsy/piBrrHDHvRa+DfAkrU8nepwgygX/yuGmSDV/zmdIb8XA/dvKht
MN285NFlVjTyznYlU/LH3etx11CHLMNclishiFHQbcnohtvhOe+fvN6RVNdfeRxm
d9iBPOum15ikc1xDl2z8Op+ZXVjMxkgLkzIXFcDBpJf4BvUx0X+ZHZXIKdizVhgU
ZMD2ds/MutMB8X1A52qp6kQvT7xE4rp87M0So4qDMTbAto5G4ZmMaWC5MlK2Oxe/
o+3qnx4vVz4H6uYzg1N4diHiC+buhgtXCLwwkcUOKKUVqJRP9e0Bh7kw8JA52XU=
=C+tM
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
- fix for module unload vs deferred jump labels (note: there might be
other buggy modules!)
- two NULL pointer dereferences from syzkaller
- also syzkaller: fix emulation of fxsave/fxrstor/sgdt/sidt, problem
made worse during this merge window, "just" kernel memory leak on
releases
- fix emulation of "mov ss" - somewhat serious on AMD, less so on Intel
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: fix emulation of "MOV SS, null selector"
KVM: x86: fix NULL deref in vcpu_scan_ioapic
KVM: eventfd: fix NULL deref irqbypass consumer
KVM: x86: Introduce segmented_write_std
KVM: x86: flush pending lapic jump label updates on module unload
jump_labels: API for flushing deferred jump label updates
Add a helper to calculate the actual data transfer size for special
payload requests.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Fix up a data alignment issue on sparc by swapping the order
of the cookie byte array field with the length field in
struct tcp_fastopen_cookie, and making it a proper union
to clean up the typecasting.
This addresses log complaints like these:
log_unaligned: 113 callbacks suppressed
Kernel unaligned access at TPC[976490] tcp_try_fastopen+0x2d0/0x360
Kernel unaligned access at TPC[9764ac] tcp_try_fastopen+0x2ec/0x360
Kernel unaligned access at TPC[9764c8] tcp_try_fastopen+0x308/0x360
Kernel unaligned access at TPC[9764e4] tcp_try_fastopen+0x324/0x360
Kernel unaligned access at TPC[976490] tcp_try_fastopen+0x2d0/0x360
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current KVM world switch code is unintentionally setting wrong bits to
CNTHCTL_EL2 when E2H == 1, which may allow guest OS to access physical
timer. Bit positions of CNTHCTL_EL2 are changing depending on
HCR_EL2.E2H bit. EL1PCEN and EL1PCTEN are 1st and 0th bits when E2H is
not set, but they are 11th and 10th bits respectively when E2H is set.
In fact, on VHE we only need to set those bits once, not for every world
switch. This is because the host kernel runs in EL2 with HCR_EL2.TGE ==
1, which makes those bits have no effect for the host kernel execution.
So we just set those bits once for guests, and that's it.
Signed-off-by: Jintack Lim <jintack@cs.columbia.edu>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
This time we got a few more fixes than the previous rc's, and the most
of commits were about ASoC. The only significant change in the core
side is the regression fix wrt the aux device list handling, and all
the rest are driver-specific small / trivial fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJYd+juAAoJEGwxgFQ9KSmkdpkQALA1JjQSHYyPybvwJNAjTUnB
+VaG+tTK0Q6TRS9KFUte41a8/FVazXwK9xBsEsOjaYFfd/nJj7WRoZsmbwMALY+S
6S7FBPBxuvrhzFSWCHBFL9E0aaxeHo81tHiZQoaSj4TAQJvIH9amDCNLgMM+kdIf
nRp5CgXZT4Xco462Ge+dpnz6KL7mRtv3f9EpjAHT/ptEWRIDQb7KOj3cmq7F5Jpk
SFigCWfHdthH6ldwVKnl3ZRGPYqmdVwq+Wq+jOKByDNK4yPKb0s3JFtldvDkszWI
IxblPwVbjnOPGvWY4dI1FT20Xuqjhgbm1HoZQMQsNGHwqZsUxdNtQTO7V3nlE57I
Kf+l6FC0f4aveIPo9Qt6J6T6kkPIYNhIhEaVhFXX3LFU91/f8fcDpyzunPP1pbA0
TyCbDsc6boPfrCh0qYeOZdfsJpfomL1hTyGpPkQiGGqKzw+uO3Jv+lsTeq/Qd0Ud
RvOhcW+UrP/gBIb7Q1zQpK7vbJna00WpmqBjYClENGuLWUCdPOOpkcLHVXOKsy/0
bxKsGt9rWzz6LtLX4bWgwHdu0tn9iXNggbcshp5LjyoChVCXaTh36uQHRyIXk8Bw
VhbMFW1ZIMac9pHj47EkpSTRsGPHZYwCnK5utWqf8RRgV86DpUrR9rrX2wVUxGYF
pGOR8RzEoM5i7PRWJyJE
=gGt3
-----END PGP SIGNATURE-----
Merge tag 'sound-4.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"This time we got a few more fixes than the previous rc's, and most of
commits were about ASoC.
The only significant change in the core side is the regression fix wrt
the aux device list handling, and all the rest are driver-specific
small / trivial fixes"
* tag 'sound-4.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: usb-audio: Add a quirk for Plantronics BT600
ASoC: rt5645: set sel_i2s_pre_div1 to 2
ASoC: dpcm: Avoid putting stream state to STOP when FE stream is paused
ASoC: Intel: Skylake: Release FW ctx in cleanup
ASoC: Intel: bytcr-rt5640: fix settings in internal clock mode
ASoC: fsl_ssi: set fifo watermark to more reliable value
ASoC: nau8825: fix invalid configuration in Pre-Scalar of FLL
ASoC: nau8825: correct the function name of register
ASoC: Intel: Skylake: Fix to fail safely if module not available in path
ASoC: tlv320aic3x: Mark the RESET register as volatile
ASoC: Fix binding and probing of auxiliary components
ASoC: wm_adsp: Don't overrun firmware file buffer when reading region data
ASoC: Intel: bytcr_rt5640: fallback mechanism if MCLK is not enabled
ASoC: hdmi-codec: use unsigned type to structure members with bit-field
ASoC: topology: kfree kcontrol->private_value before freeing kcontrol
ASoC: rsnd: don't double free kctrl
ASoC: dwc: Fix PIO mode initialization
The inet6addr_chain is an atomic notifier chain, so we can't call
anything that might sleep (like lock_sock)... instead of closing the
socket from svc_age_temp_xprts_now (which is called by the notifier
function), just have the rpc service threads do it instead.
Cc: stable@vger.kernel.org
Fixes: c3d4879e01 "sunrpc: Add a function to close..."
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
It was only needed to protect the connector_list walking, see
commit 8c4ccc4ab6
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Jul 9 23:44:26 2015 +0200
drm/probe-helper: Grab mode_config.mutex in poll_init/enable
Unfortunately the commit message of that patch fails to mention that
the new locking check was for the connector_list.
But that requirement disappeared in
commit c36a3254f7
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Dec 15 16:58:43 2016 +0100
drm: Convert all helpers to drm_connector_list_iter
and so we can drop this again.
This fixes a locking inversion on nouveau, where the rpm code needs to
re-enable. But in other places the rpm_get() calls are nested within
the big modeset locks.
While at it, also improve the kerneldoc for these two functions a
notch.
v2: Update the kerneldoc even more to explain that these functions
can't be called concurrently, or bad things happen (Chris).
Cc: Dave Airlie <airlied@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Lyude <lyude@redhat.com>
Reviewed-by: Lyude <lyude@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170111090117.5134-1-daniel.vetter@ffwll.ch
Falling back unconditionally to HostNotify as primary client's interrupt
breaks some drivers which alter their functionality depending on whether
interrupt is present or not, so let's introduce a board flag telling I2C
core explicitly if we want wired interrupt or HostNotify-based one:
I2C_CLIENT_HOST_NOTIFY.
For DT-based systems we introduce "host-notify" property that we convert
to I2C_CLIENT_HOST_NOTIFY board flag.
Tested-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Pali Rohár <pali.rohar@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
This fixes two regressions that has been reported to be introduced in
v4.10-rc1.
* The first fix corrects an incorrect usage of the kref api.
* The second reverts the change to make the resource table read-only. As the
space each vdev resource is used as virtio device config space it must be
shared with the remote.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJYdqtvAAoJEAsfOT8Nma3F58IP/A88sWYN0Fp8Zv7tU1H1sk4B
YFRCbf+g+9fFUEA55infRbCrQxzPV5ZINeIV8hmsFgradSnMLntTrl0KUpgjYr7q
eH6Nt7AHkloPME+fGV2m3aVrQGQDYOMaJ5kgRoFMn4YFU5KekcCwnwuwgxiSJiq5
PBVhUZfFUoBEWfAcCh9palkUZ7V5FMtSu5ZujPu6rpEwpchZqla4hTCVCDJsZsQo
HpF+Df30MSML8PiFtvX8aEKz4KzKKAFENAs0FQL22YcyhMF2A0LRy/LfVoL2ANDc
c0CVAEQueVr5diACVYtUzYQKJPib/Lw97OqO2rlh3F6AxsLhOcM46agx28OtJEKv
NRJRKKX5xWpXofsO+MyZ0odpesGWOOecouTbnmL6BgHeJKW2BVbkntbv/VDFD0f4
BIxCioLNCY/zGZ/FCConmceqoXoMJwgL5ZHp2vcL4icwZj4M//B1zQGBmWmauCZG
x/SA7A6r8DZ/Y6GyUP4rpzA0m3GUhEvFU4N4nEaQjBe9tVsougD2NIlgoDcARoO8
e7CiKNuAqViSeMJzlNmDBARDoQFCDoGLZhhS4qQsyMCmd6ySmu2x5MVNEuMRUpp3
wafRU5CMHJw/EUyFlybNQmkReE7RDhWuTZ42UBK+Zx1orhu+fH6ulPgPTaxKgp+d
x4XcbiZU4iqAobZ/wWsP
=GUNc
-----END PGP SIGNATURE-----
Merge tag 'rproc-v4.10-fixes' of git://github.com/andersson/remoteproc
Pull remoteproc fixes from Bjorn Andersson:
"This fixes two regressions that have been reported to be introduced in
v4.10-rc1.
- correct an incorrect usage of the kref api
- revert the change to make the resource table read-only. As the
space each vdev resource is used as virtio device config space it
must be shared with the remote"
* tag 'rproc-v4.10-fixes' of git://github.com/andersson/remoteproc:
Revert "remoteproc: Merge table_ptr and cached_table pointers"
remoteproc: fix vdev reference management
Pull scsi target fixes from Bart Van Assche:
- a series of bug fixes for the XCOPY implementation from David
Disseldorp
- one bug fix for the ibmvscsis driver, a driver that is used for
communication between partitions on IBM POWER systems.
* 'scsi-target-for-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/bvanassche/linux:
ibmvscsis: Fix srp_transfer_data fail return code
target: support XCOPY requests without parameters
target: check for XCOPY parameter truncation
target: use XCOPY segment descriptor CSCD IDs
target: check XCOPY segment descriptor CSCD IDs
target: simplify XCOPY wwn->se_dev lookup helper
target: return UNSUPPORTED TARGET/SEGMENT DESC TYPE CODE sense
target: bounds check XCOPY total descriptor list length
target: bounds check XCOPY segment descriptor list
target: use XCOPY TOO MANY TARGET DESCRIPTORS sense
target: add XCOPY target/segment desc sense codes
All block device data fields and functions returning a number of 512B
sectors are by convention named xxx_sectors while names in the form
xxx_size are generally used for a number of bytes. The blk_queue_zone_size
and bdev_zone_size functions were not following this convention so rename
them.
No functional change is introduced by this patch.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Collapsed the two patches, they were nonsensically split and broke
bisection.
Signed-off-by: Jens Axboe <axboe@fb.com>
Modules that use static_key_deferred need a way to synchronize with
any delayed work that is still pending when the module is unloaded.
Introduce static_key_deferred_flush() which flushes any pending
jump label updates.
Signed-off-by: David Matlack <dmatlack@google.com>
Cc: stable@vger.kernel.org
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Merge fixes from Andrew Morton:
"27 fixes.
There are three patches that aren't actually fixes. They're simple
function renamings which are nice-to-have in mainline as ongoing net
development depends on them."
* akpm: (27 commits)
timerfd: export defines to userspace
mm/hugetlb.c: fix reservation race when freeing surplus pages
mm/slab.c: fix SLAB freelist randomization duplicate entries
zram: support BDI_CAP_STABLE_WRITES
zram: revalidate disk under init_lock
mm: support anonymous stable page
mm: add documentation for page fragment APIs
mm: rename __page_frag functions to __page_frag_cache, drop order from drain
mm: rename __alloc_page_frag to page_frag_alloc and __free_page_frag to page_frag_free
mm, memcg: fix the active list aging for lowmem requests when memcg is enabled
mm: don't dereference struct page fields of invalid pages
mailmap: add codeaurora.org names for nameless email commits
signal: protect SIGNAL_UNKILLABLE from unintentional clearing.
mm: pmd dirty emulation in page fault handler
ipc/sem.c: fix incorrect sem_lock pairing
lib/Kconfig.debug: fix frv build failure
mm: get rid of __GFP_OTHER_NODE
mm: fix remote numa hits statistics
mm: fix devm_memremap_pages crash, use mem_hotplug_{begin, done}
ocfs2: fix crash caused by stale lvb with fsdlm plugin
...
Currently, this attribute is only fetched on station addition, but
not on station change. Since this info is only present in the assoc
request, with full station state support in the driver it cannot be
present when the station is added.
Thus, add support for changing the VHT opmode on station update if
done before (or while) the station is marked as associated. After
this, ignore it, since it used to be ignored.
Signed-off-by: Beni Lev <beni.lev@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Since userspace is expected to call timerfd syscalls directly with these
flags/ioctls, make sure we export them so they don't have to duplicate
the values themselves.
Link: http://lkml.kernel.org/r/20161219064052.7196-1-vapier@gentoo.org
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
During developemnt for zram-swap asynchronous writeback, I found strange
corruption of compressed page, resulting in:
Modules linked in: zram(E)
CPU: 3 PID: 1520 Comm: zramd-1 Tainted: G E 4.8.0-mm1-00320-ge0d4894c9c38-dirty #3274
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
task: ffff88007620b840 task.stack: ffff880078090000
RIP: set_freeobj.part.43+0x1c/0x1f
RSP: 0018:ffff880078093ca8 EFLAGS: 00010246
RAX: 0000000000000018 RBX: ffff880076798d88 RCX: ffffffff81c408c8
RDX: 0000000000000018 RSI: 0000000000000000 RDI: 0000000000000246
RBP: ffff880078093cb0 R08: 0000000000000000 R09: 0000000000000000
R10: ffff88005bc43030 R11: 0000000000001df3 R12: ffff880076798d88
R13: 000000000005bc43 R14: ffff88007819d1b8 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff88007e380000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fc934048f20 CR3: 0000000077b01000 CR4: 00000000000406e0
Call Trace:
obj_malloc+0x22b/0x260
zs_malloc+0x1e4/0x580
zram_bvec_rw+0x4cd/0x830 [zram]
page_requests_rw+0x9c/0x130 [zram]
zram_thread+0xe6/0x173 [zram]
kthread+0xca/0xe0
ret_from_fork+0x25/0x30
With investigation, it reveals currently stable page doesn't support
anonymous page. IOW, reuse_swap_page can reuse the page without waiting
writeback completion so it can overwrite page zram is compressing.
Unfortunately, zram has used per-cpu stream feature from v4.7.
It aims for increasing cache hit ratio of scratch buffer for
compressing. Downside of that approach is that zram should ask
memory space for compressed page in per-cpu context which requires
stricted gfp flag which could be failed. If so, it retries to
allocate memory space out of per-cpu context so it could get memory
this time and compress the data again, copies it to the memory space.
In this scenario, zram assumes the data should never be changed
but it is not true unless stable page supports. So, If the data is
changed under us, zram can make buffer overrun because second
compression size could be bigger than one we got in previous trial
and blindly, copy bigger size object to smaller buffer which is
buffer overrun. The overrun breaks zsmalloc free object chaining
so system goes crash like above.
I think below is same problem.
https://bugzilla.suse.com/show_bug.cgi?id=997574
Unfortunately, reuse_swap_page should be atomic so that we cannot wait on
writeback in there so the approach in this patch is simply return false if
we found it needs stable page. Although it increases memory footprint
temporarily, it happens rarely and it should be reclaimed easily althoug
it happened. Also, It would be better than waiting of IO completion,
which is critial path for application latency.
Fixes: da9556a236 ("zram: user per-cpu compression streams")
Link: http://lkml.kernel.org/r/20161120233015.GA14113@bbox
Link: http://lkml.kernel.org/r/1482366980-3782-2-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Hyeoncheol Lee <cheol.lee@lge.com>
Cc: <yjay.kim@lge.com>
Cc: Sangseok Lee <sangseok.lee@lge.com>
Cc: <stable@vger.kernel.org> [4.7+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch does two things.
First it goes through and renames the __page_frag prefixed functions to
__page_frag_cache so that we can be clear that we are draining or
refilling the cache, not the frags themselves.
Second we drop the order parameter from __page_frag_cache_drain since we
don't actually need to pass it since all fragments are either order 0 or
must be a compound page.
Link: http://lkml.kernel.org/r/20170104023954.13451.5678.stgit@localhost.localdomain
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "Page fragment updates", v4.
This patch series takes care of a few cleanups for the page fragments
API.
First we do some renames so that things are much more consistent. First
we move the page_frag_ portion of the name to the front of the functions
names. Secondly we split out the cache specific functions from the
other page fragment functions by adding the word "cache" to the name.
Finally I added a bit of documentation that will hopefully help to
explain some of this. I plan to revisit this later as we get things
more ironed out in the near future with the changes planned for the DMA
setup to support eXpress Data Path.
This patch (of 3):
This patch renames the page frag functions to be more consistent with
other APIs. Specifically we place the name page_frag first in the name
and then have either an alloc or free call name that we append as the
suffix. This makes it a bit clearer in terms of naming.
In addition we drop the leading double underscores since we are
technically no longer a backing interface and instead the front end that
is called from the networking APIs.
Link: http://lkml.kernel.org/r/20170104023854.13451.67390.stgit@localhost.localdomain
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nils Holland and Klaus Ethgen have reported unexpected OOM killer
invocations with 32b kernel starting with 4.8 kernels
kworker/u4:5 invoked oom-killer: gfp_mask=0x2400840(GFP_NOFS|__GFP_NOFAIL), nodemask=0, order=0, oom_score_adj=0
kworker/u4:5 cpuset=/ mems_allowed=0
CPU: 1 PID: 2603 Comm: kworker/u4:5 Not tainted 4.9.0-gentoo #2
[...]
Mem-Info:
active_anon:58685 inactive_anon:90 isolated_anon:0
active_file:274324 inactive_file:281962 isolated_file:0
unevictable:0 dirty:649 writeback:0 unstable:0
slab_reclaimable:40662 slab_unreclaimable:17754
mapped:7382 shmem:202 pagetables:351 bounce:0
free:206736 free_pcp:332 free_cma:0
Node 0 active_anon:234740kB inactive_anon:360kB active_file:1097296kB inactive_file:1127848kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:29528kB dirty:2596kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 184320kB anon_thp: 808kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no
DMA free:3952kB min:788kB low:984kB high:1180kB active_anon:0kB inactive_anon:0kB active_file:7316kB inactive_file:0kB unevictable:0kB writepending:96kB present:15992kB managed:15916kB mlocked:0kB slab_reclaimable:3200kB slab_unreclaimable:1408kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
lowmem_reserve[]: 0 813 3474 3474
Normal free:41332kB min:41368kB low:51708kB high:62048kB active_anon:0kB inactive_anon:0kB active_file:532748kB inactive_file:44kB unevictable:0kB writepending:24kB present:897016kB managed:836248kB mlocked:0kB slab_reclaimable:159448kB slab_unreclaimable:69608kB kernel_stack:1112kB pagetables:1404kB bounce:0kB free_pcp:528kB local_pcp:340kB free_cma:0kB
lowmem_reserve[]: 0 0 21292 21292
HighMem free:781660kB min:512kB low:34356kB high:68200kB active_anon:234740kB inactive_anon:360kB active_file:557232kB inactive_file:1127804kB unevictable:0kB writepending:2592kB present:2725384kB managed:2725384kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:800kB local_pcp:608kB free_cma:0kB
the oom killer is clearly pre-mature because there there is still a lot
of page cache in the zone Normal which should satisfy this lowmem
request. Further debugging has shown that the reclaim cannot make any
forward progress because the page cache is hidden in the active list
which doesn't get rotated because inactive_list_is_low is not memcg
aware.
The code simply subtracts per-zone highmem counters from the respective
memcg's lru sizes which doesn't make any sense. We can simply end up
always seeing the resulting active and inactive counts 0 and return
false. This issue is not limited to 32b kernels but in practice the
effect on systems without CONFIG_HIGHMEM would be much harder to notice
because we do not invoke the OOM killer for allocations requests
targeting < ZONE_NORMAL.
Fix the issue by tracking per zone lru page counts in mem_cgroup_per_node
and subtract per-memcg highmem counts when memcg is enabled. Introduce
helper lruvec_zone_lru_size which redirects to either zone counters or
mem_cgroup_get_zone_lru_size when appropriate.
We are losing empty LRU but non-zero lru size detection introduced by
ca707239e8 ("mm: update_lru_size warn and reset bad lru_size") because
of the inherent zone vs. node discrepancy.
Fixes: f8d1a31163 ("mm: consider whether to decivate based on eligible zones inactive ratio")
Link: http://lkml.kernel.org/r/20170104100825.3729-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Nils Holland <nholland@tisys.org>
Tested-by: Nils Holland <nholland@tisys.org>
Reported-by: Klaus Ethgen <Klaus@Ethgen.de>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <stable@vger.kernel.org> [4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since commit 00cd5c37af ("ptrace: permit ptracing of /sbin/init") we
can now trace init processes. init is initially protected with
SIGNAL_UNKILLABLE which will prevent fatal signals such as SIGSTOP, but
there are a number of paths during tracing where SIGNAL_UNKILLABLE can
be implicitly cleared.
This can result in init becoming stoppable/killable after tracing. For
example, running:
while true; do kill -STOP 1; done &
strace -p 1
and then stopping strace and the kill loop will result in init being
left in state TASK_STOPPED. Sending SIGCONT to init will resume it, but
init will now respond to future SIGSTOP signals rather than ignoring
them.
Make sure that when setting SIGNAL_STOP_CONTINUED/SIGNAL_STOP_STOPPED
that we don't clear SIGNAL_UNKILLABLE.
Link: http://lkml.kernel.org/r/20170104122017.25047-1-jamie.iles@oracle.com
Signed-off-by: Jamie Iles <jamie.iles@oracle.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The flag was introduced by commit 78afd5612d ("mm: add
__GFP_OTHER_NODE flag") to allow proper accounting of remote node
allocations done by kernel daemons on behalf of a process - e.g.
khugepaged.
After "mm: fix remote numa hits statistics" we do not need and actually
use the flag so we can safely remove it because all allocations which
are satisfied from their "home" node are accounted properly.
[mhocko@suse.com: fix build]
Link: http://lkml.kernel.org/r/20170106122225.GK5556@dhcp22.suse.cz
Link: http://lkml.kernel.org/r/20170102153057.9451-3-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrey Konovalov has reported the following warning triggered by the
syzkaller fuzzer.
WARNING: CPU: 1 PID: 9935 at mm/page_alloc.c:3511 __alloc_pages_nodemask+0x159c/0x1e20
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 9935 Comm: syz-executor0 Not tainted 4.9.0-rc7+ #34
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__alloc_pages_slowpath mm/page_alloc.c:3511
__alloc_pages_nodemask+0x159c/0x1e20 mm/page_alloc.c:3781
alloc_pages_current+0x1c7/0x6b0 mm/mempolicy.c:2072
alloc_pages include/linux/gfp.h:469
kmalloc_order+0x1f/0x70 mm/slab_common.c:1015
kmalloc_order_trace+0x1f/0x160 mm/slab_common.c:1026
kmalloc_large include/linux/slab.h:422
__kmalloc+0x210/0x2d0 mm/slub.c:3723
kmalloc include/linux/slab.h:495
ep_write_iter+0x167/0xb50 drivers/usb/gadget/legacy/inode.c:664
new_sync_write fs/read_write.c:499
__vfs_write+0x483/0x760 fs/read_write.c:512
vfs_write+0x170/0x4e0 fs/read_write.c:560
SYSC_write fs/read_write.c:607
SyS_write+0xfb/0x230 fs/read_write.c:599
entry_SYSCALL_64_fastpath+0x1f/0xc2
The issue is caused by a lack of size check for the request size in
ep_write_iter which should be fixed. It, however, points to another
problem, that SLUB defines KMALLOC_MAX_SIZE too large because the its
KMALLOC_SHIFT_MAX is (MAX_ORDER + PAGE_SHIFT) which means that the
resulting page allocator request might be MAX_ORDER which is too large
(see __alloc_pages_slowpath).
The same applies to the SLOB allocator which allows even larger sizes.
Make sure that they are capped properly and never request more than
MAX_ORDER order.
Link: http://lkml.kernel.org/r/20161220130659.16461-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently dax_mapping_entry_mkclean() fails to clean and write protect
the pmd_t of a DAX PMD entry during an *sync operation. This can result
in data loss in the following sequence:
1) mmap write to DAX PMD, dirtying PMD radix tree entry and making the
pmd_t dirty and writeable
2) fsync, flushing out PMD data and cleaning the radix tree entry. We
currently fail to mark the pmd_t as clean and write protected.
3) more mmap writes to the PMD. These don't cause any page faults since
the pmd_t is dirty and writeable. The radix tree entry remains clean.
4) fsync, which fails to flush the dirty PMD data because the radix tree
entry was clean.
5) crash - dirty data that should have been fsync'd as part of 4) could
still have been in the processor cache, and is lost.
Fix this by marking the pmd_t clean and write protected in
dax_mapping_entry_mkclean(), which is called as part of the fsync
operation 2). This will cause the writes in step 3) above to generate
page faults where we'll re-dirty the PMD radix tree entry, resulting in
flushes in the fsync that happens in step 4).
Fixes: 4b4bb46d00 ("dax: clear dirty entry tags on cache flush")
Link: http://lkml.kernel.org/r/1482272586-21177-3-git-send-email-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "Write protect DAX PMDs in *sync path".
Currently dax_mapping_entry_mkclean() fails to clean and write protect
the pmd_t of a DAX PMD entry during an *sync operation. This can result
in data loss, as detailed in patch 2.
This series is based on Dan's "libnvdimm-pending" branch, which is the
current home for Jan's "dax: Page invalidation fixes" series. You can
find a working tree here:
https://git.kernel.org/cgit/linux/kernel/git/zwisler/linux.git/log/?h=dax_pmd_clean
This patch (of 2):
Similar to follow_pte(), follow_pte_pmd() allows either a PTE leaf or a
huge page PMD leaf to be found and returned.
Link: http://lkml.kernel.org/r/1482272586-21177-2-git-send-email-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Suggested-by: Dave Hansen <dave.hansen@intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The GRO fast path caches the frag0 address. This address becomes
invalid if frag0 is modified by pskb_may_pull or its variants.
So whenever that happens we must disable the frag0 optimization.
This is usually done through the combination of gro_header_hard
and gro_header_slow, however, the IPv6 extension header path did
the pulling directly and would continue to use the GRO fast path
incorrectly.
This patch fixes it by disabling the fast path when we enter the
IPv6 extension header path.
Fixes: 78a478d0ef ("gro: Inline skb_gro_header and cache frag0 virtual address")
Reported-by: Slava Shwartsman <slavash@mellanox.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As defined in http://www.t10.org/lists/asc-num.htm. To be used during
validation of XCOPY target and segment descriptor lists.
Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>