At first when sendpage gets called, if there is more data, 'more' in
tls_push_data() gets set which later sets pending_open_record_frags, but
when there is no more data in file left, and last time tls_push_data()
gets called, pending_open_record_frags doesn't get reset. And later when
2 bytes of encrypted alert comes as sendmsg, it first checks for
pending_open_record_frags, and since this is set, it creates a record with
0 data bytes to encrypt, meaning record length is prepend_size + tag_size
only, which causes problem.
We should set/reset pending_open_record_frags based on more bit.
Fixes: e8f6979981 ("net/tls: Add generic NIC offload infrastructure")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When a ICMPV6_PKT_TOOBIG report a next-hop MTU that is less than the IPv6
minimum link MTU, the estimated path MTU is reduced to the minimum link
MTU. This behaviour breaks TAHI IPv6 Core Conformance Test v6LC4.1.6:
Packet Too Big Less than IPv6 MTU.
Referring to RFC 8201 section 4: "If a node receives a Packet Too Big
message reporting a next-hop MTU that is less than the IPv6 minimum link
MTU, it must discard it. A node must not reduce its estimate of the Path
MTU below the IPv6 minimum link MTU on receipt of a Packet Too Big
message."
Drop the path MTU update if reported MTU is less than the minimum link MTU.
Signed-off-by: Georg Kohmann <geokohma@cisco.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When processing a system suspend request we suspend modem endpoints
if they are enabled, and call ipa_cmd_tag_process() (which issues
IPA commands) to ensure the IPA pipeline is cleared. It is an error
to attempt to issue an IPA command before setup is complete, so this
is clearly a bug. But we also shouldn't suspend or resume any
endpoints that have not been set up.
Have ipa_endpoint_suspend() and ipa_endpoint_resume() immediately
return if setup hasn't completed, to avoid any attempt to configure
endpoints or issue IPA commands in that case.
Fixes: 84f9bd12d4 ("soc: qcom: ipa: IPA endpoints")
Tested-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter selftests fixes from
Fabian Frederick:
1) Extend selftest nft_meta.sh to check for meta cpu.
2) Fix selftest nft_meta.sh error reporting.
3) Fix shellcheck warnings in selftest nft_meta.sh.
4) Extend selftest nft_meta.sh to check for meta time.
====================
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
using packetdrill it's possible to observe the same MPTCP DSN being acked
by different subflows with DACK4 and DACK8. This is in contrast with what
specified in RFC8684 §3.3.2: if an MPTCP endpoint transmits a 64-bit wide
DSN, it MUST be acknowledged with a 64-bit wide DACK. Fix 'use_64bit_ack'
variable to make it a property of MPTCP sockets, not TCP subflows.
Fixes: a0c1d0eafd ("mptcp: Use 32-bit DATA_ACK when possible")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The phy_reset_after_clk_enable() does a PHY reset, which means the PHY
loses its register settings. The fec_enet_mii_probe() starts the PHY
and does the necessary calls to configure the PHY via PHY framework,
and loads the correct register settings into the PHY. Therefore,
fec_enet_mii_probe() should be called only after the PHY has been
reset, not before as it is now.
Fixes: 1b0a83ac04 ("net: fec: add phy_reset_after_clk_enable() support")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Richard Leitner <richard.leitner@skidata.com>
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Christoph Niedermaier <cniedermaier@dh-electronics.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: NXP Linux Team <linux-imx@nxp.com>
Cc: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Some last minute fixes. The last two of them haven't been in next but
they do seem kind of obvious, very small and safe, fix bugs reported in
the field, and they are both in a new mlx5 vdpa driver, so it's not like
we can introduce regressions.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAl9/cWEPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpaPoH/2b+Hc0UvQPvAas1uWC022bV/VpfiW0+OZaT
IsP88s9IInjWUoBb7Rqkhi3jnZYs9p7W59AV9cNJ5g6vPxcxrJAfgeo9R4bq6rD9
LqIAxHRRwPEnYddtFAv/XnX4YuTS+cJFwFJLWGXXdySA5/pgFqc9qDXjNLFzxq7X
8pI7qbW04e9eS3i8vlZwXNHpHQ7DMcpgewR7XO1Lhqh4sHWfusKSEVDrOM7v/0ru
yMtwKA5X8vRZQTIoaLamRWIm/qLWIi/Wcor6APhRG0Hn9yzS21JRnGDs8iKRSjjx
ecBNatgGPrmbO7yCfyh4el0GgrYhxAk/w2H3p/aXQJu+sGXL8nk=
=y/Tb
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull vhost fixes from Michael Tsirkin:
"Some last minute vhost,vdpa fixes.
The last two of them haven't been in next but they do seem kind of
obvious, very small and safe, fix bugs reported in the field, and they
are both in a new mlx5 vdpa driver, so it's not like we can introduce
regressions"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vdpa/mlx5: Fix dependency on MLX5_CORE
vdpa/mlx5: should keep avail_index despite device status
vhost-vdpa: fix page pinning leakage in error path
vhost-vdpa: fix vhost_vdpa_map() on error condition
vhost: Don't call log_access_ok() when using IOTLB
vhost: Use vhost_get_used_size() in vhost_vring_set_addr()
vhost: Don't call access_ok() when using IOTLB
vhost vdpa: fix vhost_vdpa_open error handling
Pull networking fixes from Jakub Kicinski:
"One more set of fixes from the networking tree:
- add missing input validation in nl80211_del_key(), preventing
out-of-bounds access
- last minute fix / improvement of a MRP netlink (uAPI) interface
introduced in 5.9 (current) release
- fix "unresolved symbol" build error under CONFIG_NET w/o
CONFIG_INET due to missing tcp_timewait_sock and inet_timewait_sock
BTF.
- fix 32 bit sub-register bounds tracking in the bpf verifier for OR
case
- tcp: fix receive window update in tcp_add_backlog()
- openvswitch: handle DNAT tuple collision in conntrack-related code
- r8169: wait for potential PHY reset to finish after applying a FW
file, avoiding unexpected PHY behaviour and failures later on
- mscc: fix tail dropping watermarks for Ocelot switches
- avoid use-after-free in macsec code after a call to the GRO layer
- avoid use-after-free in sctp error paths
- add a device id for Cellient MPL200 WWAN card
- rxrpc fixes:
- fix the xdr encoding of the contents read from an rxrpc key
- fix a BUG() for a unsupported encoding type.
- fix missing _bh lock annotations.
- fix acceptance handling for an incoming call where the incoming
call is encrypted.
- the server token keyring isn't network namespaced - it belongs
to the server, so there's no need. Namespacing it means that
request_key() fails to find it.
- fix a leak of the server keyring"
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (21 commits)
net: usb: qmi_wwan: add Cellient MPL200 card
macsec: avoid use-after-free in macsec_handle_frame()
r8169: consider that PHY reset may still be in progress after applying firmware
openvswitch: handle DNAT tuple collision
sctp: fix sctp_auth_init_hmacs() error path
bridge: Netlink interface fix.
net: wireless: nl80211: fix out-of-bounds access in nl80211_del_key()
bpf: Fix scalar32_min_max_or bounds tracking
tcp: fix receive window update in tcp_add_backlog()
net: usb: rtl8150: set random MAC address when set_ethernet_addr() fails
mptcp: more DATA FIN fixes
net: mscc: ocelot: warn when encoding an out-of-bounds watermark value
net: mscc: ocelot: divide watermark value by 60 when writing to SYS_ATOP
net: qrtr: ns: Fix the incorrect usage of rcu_read_lock()
rxrpc: Fix server keyring leak
rxrpc: The server keyring isn't network-namespaced
rxrpc: Fix accept on a connection that need securing
rxrpc: Fix some missing _bh annotations on locking conn->state_lock
rxrpc: Downgrade the BUG() for unsupported token type in rxrpc_read()
rxrpc: Fix rxkad token xdr encoding
...
Remove propmt for selecting MLX5_VDPA by the user and modify
MLX5_VDPA_NET to select MLX5_VDPA. Also modify MLX5_VDPA_NET to depend
on mlx5_core.
This fixes an issue where configuration sets 'y' for MLX5_VDPA_NET while
MLX5_CORE is compiled as a module causing link errors.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 1a86b377aa ("vdpa/mlx5: Add VDPA driver for supported mlx5 device")s
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20201007064011.GA50074@mtl-vdi-166.wap.labs.mlnx
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
A VM with mlx5 vDPA has below warnings while being reset:
vhost VQ 0 ring restore failed: -1: Resource temporarily unavailable (11)
vhost VQ 1 ring restore failed: -1: Resource temporarily unavailable (11)
We should allow userspace emulating the virtio device be
able to get to vq's avail_index, regardless of vDPA device
status. Save the index that was last seen when virtq was
stopped, so that userspace doesn't complain.
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Link: https://lore.kernel.org/r/1601583511-15138-1-git-send-email-si-wei.liu@oracle.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eli Cohen <elic@nvidia.com>
Add usb ids of the Cellient MPL200 card.
Signed-off-by: Wilken Gottwalt <wilken.gottwalt@mailbox.org>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
De-referencing skb after call to gro_cells_receive() is not allowed.
We need to fetch skb->len earlier.
Fixes: 5491e7c6b1 ("macsec: enable GRO and RPS on macsec devices")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Some firmware files trigger a PHY soft reset and don't wait for it to
be finished. PHY register writes directly after applying the firmware
may fail or provide unexpected results therefore. Fix this by waiting
for bit BMCR_RESET to be cleared after applying firmware.
There's nothing wrong with the referenced change, it's just that the
fix will apply cleanly only after this change.
Fixes: 89fbd26cca ("r8169: fix firmware not resetting tp->ocp_base")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
With multiple DNAT rules it's possible that after destination
translation the resulting tuples collide.
For example, two openvswitch flows:
nw_dst=10.0.0.10,tp_dst=10, actions=ct(commit,table=2,nat(dst=20.0.0.1:20))
nw_dst=10.0.0.20,tp_dst=10, actions=ct(commit,table=2,nat(dst=20.0.0.1:20))
Assuming two TCP clients initiating the following connections:
10.0.0.10:5000->10.0.0.10:10
10.0.0.10:5000->10.0.0.20:10
Both tuples would translate to 10.0.0.10:5000->20.0.0.1:20 causing
nf_conntrack_confirm() to fail because of tuple collision.
Netfilter handles this case by allocating a null binding for SNAT at
egress by default. Perform the same operation in openvswitch for DNAT
if no explicit SNAT is requested by the user and allocate a null binding
for SNAT for packets in the "original" direction.
Reported-at: https://bugzilla.redhat.com/1877128
Suggested-by: Florian Westphal <fw@strlen.de>
Fixes: 05752523e5 ("openvswitch: Interface with NAT.")
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Daniel Borkmann says:
====================
pull-request: bpf 2020-10-08
The main changes are:
1) Fix "unresolved symbol" build error under CONFIG_NET w/o CONFIG_INET due
to missing tcp_timewait_sock and inet_timewait_sock BTF, from Yonghong Song.
2) Fix 32 bit sub-register bounds tracking for OR case, from Daniel Borkmann.
====================
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This commit is correcting NETLINK br_fill_ifinfo() to be able to
handle 'filter_mask' with multiple flags asserted.
Fixes: 36a8e8e265 ("bridge: Extend br_fill_ifinfo to return MPR status")
Signed-off-by: Henrik Bjoernlund <henrik.bjoernlund@microchip.com>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Suggested-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Tested-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
nouveau:
- fix crash in TTM alloc fail path
- return error earlier for unknown chipsets
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJffof8AAoJEAx081l5xIa+J1cP/Rw0awD0mLwtjfI+9btpxBkk
/p308idpUNGvF92HJ/f+V8gwznR85mwes5ls6/qtfI78c+ShuTDlWwIDF7xHfIyJ
f8Ai/NqGciRcRceeM0kjC7+EUGd6xpyzEg2YADMuRYoeTqC4VTDdFM+Bf+YWYfyo
vgCMvidap8Sdc2K+mEhSr1PwbeB+13ViflgyWTne8o5mZxmq66d2/ufoa7qBZecJ
FMpansUaa5PXFFjVI6bYt+AmUNi50JDa63GO4UNuBCOzLqfRLnFj9yCCrgaNrTTx
rKcOAYvHphSRfkKU2OQ8dEYnzwAlCfthOc6Ks1TGd9ve4Z5swb6X8mMQiTxKvTDR
+EFKXQCtO/6c7y7bWQw7pGzoBMA1Bpi0ky1VtG+llME+F0W5ePaUqbVBj6AC4iIR
sPlT6wtrqW99/AfgvcfZs5wq25onoPSMZplGbfqx8AErFWp/KmEE/+R5bR27SA3N
TlKPzyYCQ3EL1nQmrfPnDwF+H8GetaVngJZe/awnr31xwWcHLl3h+FfIArzd7gRl
H2umkUIO/Uk8lIcIr0Vk90V84BLy+de4ijng2b5bnXKbBx7+o+e/faisqVXx8ZR6
2hmGupAiuOmHOOf2PCLPnUyZTN/J+pzURN6UK4yjk6nlTfN01wXn4+2w3ifB8m6b
Vl54q++yIQBaZAADE/NL
=IkbJ
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-2020-10-08' of git://anongit.freedesktop.org/drm/drm
Pull drm nouveau fixes from Dave Airlie:
"Karol found two last minute nouveau fixes, they both fix crashes, the
TTM one follows what other drivers do already, and the other is for
bailing on load on unrecognised chipsets.
- fix crash in TTM alloc fail path
- return error earlier for unknown chipsets"
* tag 'drm-fixes-2020-10-08' of git://anongit.freedesktop.org/drm/drm:
drm/nouveau/mem: guard against NULL pointer access in mem_del
drm/nouveau/device: return error for unknown chipsets
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCX31hAAAKCRCAXGG7T9hj
vjneAQDTJofrC76bt5QcPcrz1BWBC41tOOb5jSVLEVxwsnTfDAD/STWrrT6ZLH2z
759txSf/ZCnpRCub7IXgaUek5oNlSAI=
=QWgj
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.9b-rc9-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fix from Juergen Gross:
"One fix for a regression when booting as a Xen guest on ARM64
introduced probably during the 5.9 cycle. It is very low risk as it is
modifying Xen specific code only.
The exact commit introducing the bug hasn't been identified yet, but
everything was fine in 5.8 and only in 5.9 some configurations started
to fail"
* tag 'for-linus-5.9b-rc9-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
arm/arm64: xen: Fix to convert percpu address to gfn correctly
The afs filesystem has a lock[*] that it uses to serialise I/O operations
going to the server (vnode->io_lock), as the server will only perform one
modification operation at a time on any given file or directory. This
prevents the the filesystem from filling up all the call slots to a server
with calls that aren't going to be executed in parallel anyway, thereby
allowing operations on other files to obtain slots.
[*] Note that is probably redundant for directories at least since
i_rwsem is used to serialise directory modifications and
lookup/reading vs modification. The server does allow parallel
non-modification ops, however.
When a file truncation op completes, we truncate the in-memory copy of the
file to match - but we do it whilst still holding the io_lock, the idea
being to prevent races with other operations.
However, if writeback starts in a worker thread simultaneously with
truncation (whilst notify_change() is called with i_rwsem locked, writeback
pays it no heed), it may manage to set PG_writeback bits on the pages that
will get truncated before afs_setattr_success() manages to call
truncate_pagecache(). Truncate will then wait for those pages - whilst
still inside io_lock:
# cat /proc/8837/stack
[<0>] wait_on_page_bit_common+0x184/0x1e7
[<0>] truncate_inode_pages_range+0x37f/0x3eb
[<0>] truncate_pagecache+0x3c/0x53
[<0>] afs_setattr_success+0x4d/0x6e
[<0>] afs_wait_for_operation+0xd8/0x169
[<0>] afs_do_sync_operation+0x16/0x1f
[<0>] afs_setattr+0x1fb/0x25d
[<0>] notify_change+0x2cf/0x3c4
[<0>] do_truncate+0x7f/0xb2
[<0>] do_sys_ftruncate+0xd1/0x104
[<0>] do_syscall_64+0x2d/0x3a
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
The writeback operation, however, stalls indefinitely because it needs to
get the io_lock to proceed:
# cat /proc/5940/stack
[<0>] afs_get_io_locks+0x58/0x1ae
[<0>] afs_begin_vnode_operation+0xc7/0xd1
[<0>] afs_store_data+0x1b2/0x2a3
[<0>] afs_write_back_from_locked_page+0x418/0x57c
[<0>] afs_writepages_region+0x196/0x224
[<0>] afs_writepages+0x74/0x156
[<0>] do_writepages+0x2d/0x56
[<0>] __writeback_single_inode+0x84/0x207
[<0>] writeback_sb_inodes+0x238/0x3cf
[<0>] __writeback_inodes_wb+0x68/0x9f
[<0>] wb_writeback+0x145/0x26c
[<0>] wb_do_writeback+0x16a/0x194
[<0>] wb_workfn+0x74/0x177
[<0>] process_one_work+0x174/0x264
[<0>] worker_thread+0x117/0x1b9
[<0>] kthread+0xec/0xf1
[<0>] ret_from_fork+0x1f/0x30
and thus deadlock has occurred.
Note that whilst afs_setattr() calls filemap_write_and_wait(), the fact
that the caller is holding i_rwsem doesn't preclude more pages being
dirtied through an mmap'd region.
Fix this by:
(1) Use the vnode validate_lock to mediate access between afs_setattr()
and afs_writepages():
(a) Exclusively lock validate_lock in afs_setattr() around the whole
RPC operation.
(b) If WB_SYNC_ALL isn't set on entry to afs_writepages(), trying to
shared-lock validate_lock and returning immediately if we couldn't
get it.
(c) If WB_SYNC_ALL is set, wait for the lock.
The validate_lock is also used to validate a file and to zap its cache
if the file was altered by a third party, so it's probably a good fit
for this.
(2) Move the truncation outside of the io_lock in setattr, using the same
hook as is used for local directory editing.
This requires the old i_size to be retained in the operation record as
we commit the revised status to the inode members inside the io_lock
still, but we still need to know if we reduced the file size.
Fixes: d2ddc776a4 ("afs: Overhaul volume and server record caching and fileserver rotation")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In commit 70e806e4e6 ("mm: Do early cow for pinned pages during fork()
for ptes") we write-protected the PTE before doing the page pinning
check, in order to avoid a race with concurrent fast-GUP pinning (which
doesn't take the mm semaphore or the page table lock).
That trick doesn't actually work - it doesn't handle memory ordering
properly, and doing so would be prohibitively expensive.
It also isn't really needed. While we're moving in the direction of
allowing and supporting page pinning without marking the pinned area
with MADV_DONTFORK, the fact is that we've never really supported this
kind of odd "concurrent fork() and page pinning", and doing the
serialization on a pte level is just wrong.
We can add serialization with a per-mm sequence counter, so we know how
to solve that race properly, but we'll do that at a more appropriate
time. Right now this just removes the write protect games.
It also turns out that the write protect games actually break on Power,
as reported by Aneesh Kumar:
"Architecture like ppc64 expects set_pte_at to be not used for updating
a valid pte. This is further explained in commit 56eecdb912 ("mm:
Use ptep/pmdp_set_numa() for updating _PAGE_NUMA bit")"
and the code triggered a warning there:
WARNING: CPU: 0 PID: 30613 at arch/powerpc/mm/pgtable.c:185 set_pte_at+0x2a8/0x3a0 arch/powerpc/mm/pgtable.c:185
Call Trace:
copy_present_page mm/memory.c:857 [inline]
copy_present_pte mm/memory.c:899 [inline]
copy_pte_range mm/memory.c:1014 [inline]
copy_pmd_range mm/memory.c:1092 [inline]
copy_pud_range mm/memory.c:1127 [inline]
copy_p4d_range mm/memory.c:1150 [inline]
copy_page_range+0x1f6c/0x2cc0 mm/memory.c:1212
dup_mmap kernel/fork.c:592 [inline]
dup_mm+0x77c/0xab0 kernel/fork.c:1355
copy_mm kernel/fork.c:1411 [inline]
copy_process+0x1f00/0x2740 kernel/fork.c:2070
_do_fork+0xc4/0x10b0 kernel/fork.c:2429
Link: https://lore.kernel.org/lkml/CAHk-=wiWr+gO0Ro4LvnJBMs90OiePNyrE3E+pJvc9PzdBShdmw@mail.gmail.com/
Link: https://lore.kernel.org/linuxppc-dev/20201008092541.398079-1-aneesh.kumar@linux.ibm.com/
Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Tested-by: Leon Romanovsky <leonro@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Kirill Shutemov <kirill@shutemov.name>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Simon reported an issue with the current scalar32_min_max_or() implementation.
That is, compared to the other 32 bit subreg tracking functions, the code in
scalar32_min_max_or() stands out that it's using the 64 bit registers instead
of 32 bit ones. This leads to bounds tracking issues, for example:
[...]
8: R0=map_value(id=0,off=0,ks=4,vs=48,imm=0) R10=fp0 fp-8=mmmmmmmm
8: (79) r1 = *(u64 *)(r0 +0)
R0=map_value(id=0,off=0,ks=4,vs=48,imm=0) R10=fp0 fp-8=mmmmmmmm
9: R0=map_value(id=0,off=0,ks=4,vs=48,imm=0) R1_w=inv(id=0) R10=fp0 fp-8=mmmmmmmm
9: (b7) r0 = 1
10: R0_w=inv1 R1_w=inv(id=0) R10=fp0 fp-8=mmmmmmmm
10: (18) r2 = 0x600000002
12: R0_w=inv1 R1_w=inv(id=0) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
12: (ad) if r1 < r2 goto pc+1
R0_w=inv1 R1_w=inv(id=0,umin_value=25769803778) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
13: R0_w=inv1 R1_w=inv(id=0,umin_value=25769803778) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
13: (95) exit
14: R0_w=inv1 R1_w=inv(id=0,umax_value=25769803777,var_off=(0x0; 0x7ffffffff)) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
14: (25) if r1 > 0x0 goto pc+1
R0_w=inv1 R1_w=inv(id=0,umax_value=0,var_off=(0x0; 0x7fffffff),u32_max_value=2147483647) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
15: R0_w=inv1 R1_w=inv(id=0,umax_value=0,var_off=(0x0; 0x7fffffff),u32_max_value=2147483647) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
15: (95) exit
16: R0_w=inv1 R1_w=inv(id=0,umin_value=1,umax_value=25769803777,var_off=(0x0; 0x77fffffff),u32_max_value=2147483647) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
16: (47) r1 |= 0
17: R0_w=inv1 R1_w=inv(id=0,umin_value=1,umax_value=32212254719,var_off=(0x1; 0x700000000),s32_max_value=1,u32_max_value=1) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
[...]
The bound tests on the map value force the upper unsigned bound to be 25769803777
in 64 bit (0b11000000000000000000000000000000001) and then lower one to be 1. By
using OR they are truncated and thus result in the range [1,1] for the 32 bit reg
tracker. This is incorrect given the only thing we know is that the value must be
positive and thus 2147483647 (0b1111111111111111111111111111111) at max for the
subregs. Fix it by using the {u,s}32_{min,max}_value vars instead. This also makes
sense, for example, for the case where we update dst_reg->s32_{min,max}_value in
the else branch we need to use the newly computed dst_reg->u32_{min,max}_value as
we know that these are positive. Previously, in the else branch the 64 bit values
of umin_value=1 and umax_value=32212254719 were used and latter got truncated to
be 1 as upper bound there. After the fix the subreg range is now correct:
[...]
8: R0=map_value(id=0,off=0,ks=4,vs=48,imm=0) R10=fp0 fp-8=mmmmmmmm
8: (79) r1 = *(u64 *)(r0 +0)
R0=map_value(id=0,off=0,ks=4,vs=48,imm=0) R10=fp0 fp-8=mmmmmmmm
9: R0=map_value(id=0,off=0,ks=4,vs=48,imm=0) R1_w=inv(id=0) R10=fp0 fp-8=mmmmmmmm
9: (b7) r0 = 1
10: R0_w=inv1 R1_w=inv(id=0) R10=fp0 fp-8=mmmmmmmm
10: (18) r2 = 0x600000002
12: R0_w=inv1 R1_w=inv(id=0) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
12: (ad) if r1 < r2 goto pc+1
R0_w=inv1 R1_w=inv(id=0,umin_value=25769803778) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
13: R0_w=inv1 R1_w=inv(id=0,umin_value=25769803778) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
13: (95) exit
14: R0_w=inv1 R1_w=inv(id=0,umax_value=25769803777,var_off=(0x0; 0x7ffffffff)) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
14: (25) if r1 > 0x0 goto pc+1
R0_w=inv1 R1_w=inv(id=0,umax_value=0,var_off=(0x0; 0x7fffffff),u32_max_value=2147483647) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
15: R0_w=inv1 R1_w=inv(id=0,umax_value=0,var_off=(0x0; 0x7fffffff),u32_max_value=2147483647) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
15: (95) exit
16: R0_w=inv1 R1_w=inv(id=0,umin_value=1,umax_value=25769803777,var_off=(0x0; 0x77fffffff),u32_max_value=2147483647) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
16: (47) r1 |= 0
17: R0_w=inv1 R1_w=inv(id=0,umin_value=1,umax_value=32212254719,var_off=(0x0; 0x77fffffff),u32_max_value=2147483647) R2_w=inv25769803778 R10=fp0 fp-8=mmmmmmmm
[...]
Fixes: 3f50f132d8 ("bpf: Verifier, do explicit ALU32 bounds tracking")
Reported-by: Simon Scannell <scannell.smn@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Previously the code relied on device->pri to be NULL and to fail probing
later. We really should just return an error inside nvkm_device_ctor for
unsupported GPUs.
Fixes: 24d5ff40a7 ("drm/nouveau/device: rework mmio mapping code to get rid of second map")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Cc: dann frazier <dann.frazier@canonical.com>
Cc: dri-devel <dri-devel@lists.freedesktop.org>
Cc: Dave Airlie <airlied@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Jeremy Cline <jcline@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201006220528.13925-1-kherbst@redhat.com
Fix missing result check of exfat_build_inode().
And use PTR_ERR_OR_ZERO instead of PTR_ERR.
Signed-off-by: Tetsuhiro Kohada <kohada.t2@gmail.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
matching the target BTI instruction (when branch target identification
is enabled).
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl98sfoACgkQa9axLQDI
XvEAKA//UjU7u2Gk28v+KyQ/xVusZWPhWh5+FRbGg5OHno73tljPNC0NfP3Kz/Xt
vJqoLDi1TAisx6uezU4yVe4Ah0x7cba0eZnQ+x8SrFccQMSJKuwu471xm94O4t6k
fXlTVobmeunKlAz4YSw2XVeinnvtRavSfVkvTUa5O1tLnvNyBeVHvvgkj54s6Ymk
uHmt+4U8Pnqt1IWffqrHhnPxc+ILkW2+mp7ixhOPVpRd9B1LGFZh31bNUMjUqqCj
Ku0c0RXAUgMuE2DKE4IMOLZRGWhDiaja2hvjoQN6bYocBySq3BjTWzd/wUPwI8qY
h68n/kcNskC4sGi53r1JBETf3anXvYP5akLi8/qBe7JCDvKWzue/zyFBcarnnaJ4
BEkjAapVzMSTNYkGMGiWIFZwYld05l9crmKOOlgpVAOSivCKNPVXox73LPAo0cOm
9iXWil05iRjH8P52XKn2JoSl5Ca7TyqAdJckdFjKqO7CGxLqRiHmydwWoxKXCrJ1
K1Eu4n4d2SabChZofdUN1JeRLC0Moo2hMiqDCwwNQBeJmQWIWcnhJYg8TrcFRyh/
NdqVH9cz47COuQ5VDL1ipg/1smJFi36CbU74+v4DZPVMa8//mGGQnCPfMu0WUe29
lEweC8G1DWO/TTCGvkT7gbK7b8chf70v5Epuu8HaLPda2dS2by4=
=LWZg
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
"Fix a kernel panic in the AES crypto code caused by a BR tail call not
matching the target BTI instruction (when branch target identification
is enabled)"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
crypto: arm64: Use x16 with indirect branch to bti_c
One final pdx86 fix for Tablet Mode reporting regressions (which
make the keyboard and touchpad unusable) on various Asus notebooks.
This fix has been tested as a downstream patch in Fedora kernels for
approx. 2 weeks with no problems being reported.
Shortlog:
Hans de Goede (1):
platform/x86: asus-wmi: Fix SW_TABLET_MODE always reporting 1 on many different models
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAl98OiQUHGhkZWdvZWRl
QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9wkQgf+NpE3N3HjIvivbTZmgysdVTRdohXv
vNzH4tGRLpDtM06FvEWrEt30w/wIHhC1GKwXqJmF4ZraDC53FoRKpK0mRstP4vQO
VLiJqkulnqJPq2hyO3d5n7dhPgGTb2ZzsFpta4YkyMqwkfhXzQWDhKN8WDQ/9hql
XOdxBRu9zHV0yKGftGzGRlk0gJ+q2IJewU0HaHqdTGkPiWkOoM3yL2y23+f3hrLH
QZBiKvJ88T5vM5HY6FTnt4aGD3AZrwZZegrBB+Hza9aaV3nFW+jOjuQKcQ4nBDFy
MKkXk8JSssojT87rBp3b9g2zSHbEXGlafyC8hxQVd9YrYk2zsjlikCrHsw==
=Decp
-----END PGP SIGNATURE-----
Merge tag 'platform-drivers-x86-v5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull another x86 platform driver fix from Hans de Goede:
"One final pdx86 fix for Tablet Mode reporting regressions (which make
the keyboard and touchpad unusable) on various Asus notebooks.
These regressions were caused by the asus-nb-wmi and the intel-vbtn
drivers both receiving recent patches to start reporting Tablet Mode /
to report it on more models.
Due to a miscommunication between Andy and me, Andy's earlier pull-req
only contained the fix for the intel-vbtn driver and not the fix for
the asus-nb-wmi code.
This fix has been tested as a downstream patch in Fedora kernels for
approx two weeks with no problems being reported"
* tag 'platform-drivers-x86-v5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: asus-wmi: Fix SW_TABLET_MODE always reporting 1 on many different models
fbdev:
- Re-add FB_ARMCLCD for android.
- Fix global-out-of-bounds read in fbcon_get_font().
core:
- Small doc fix.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJffA1qAAoJEAx081l5xIa+K6IQAIM6IB2G5a2g0tpyxFTkwyAm
vDPrfm1BgnIJraNfDfkKuIVo30eExJqT3dihacAdv1Hmr1jDZBOrFcziaE5HvdJC
vjkAVc24myUtyL4j2mxKOSuAHGdohPQkj8ZTHGcudZf25ATUSDWp0ACMVscCdi5F
RBb/8BwwC0EjQV6iGLuAUR9e+kABh4bZKdboHh35wq4JeEGd9QGZH/9OLZRPhqsR
1Zqvf4agDpF1ORS80DzrxcieTfUlijjVtK9fA8aELz5/k+G7Zutb3Ttt+9N01MEk
qGc+/7QBdzbCGb31+rPqWDz+HgIJ/JH/ojxXnINdeVQ9a8IzwNeofk51+pPNSdMl
J9PrA3gnvaDNPR/ztIK+HSkhjO0ek7r2WiDxYl9IBsq/Pu3VMRD7A8pSDKor3S8R
+RweHoRtdNnvahN+R8lwcNkihDOKxtkV/IJ3c6icEfvWa5D0EOPwNgtCWTAwAgEZ
EfL8VxY5i8Gnj9rv/i/tB7Rm0V6VcDvG0DMcY6DLFA24PX7bYui/Mm0O6ckKZbkz
K7PQRccMg4/1QgkHKeMYvO5OnlEG4kW3FIudHOMTcvVvPZ9/5KNhLjHHnqH1NpsE
cvTZUv1qUGg0mIr/SL1mw0hJnSxdIoE1HL/WE7L9UGm6xgmdtdXdNSfoj/IzssA4
lEPeMhOnUxGgdZ2xDWP3
=1fPQ
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-2020-10-06-1' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Daniel queued these up last week and I took a long weekend so didn't
get them out, but fixing the OOB access on get font seems like
something we should land and it's cc'ed stable as well.
The other big change is a partial revert for a regression on android
on the clcd fbdev driver, and one other docs fix.
fbdev:
- Re-add FB_ARMCLCD for android
- Fix global-out-of-bounds read in fbcon_get_font()
core:
- Small doc fix"
* tag 'drm-fixes-2020-10-06-1' of git://anongit.freedesktop.org/drm/drm:
drm: drm_dsc.h: fix a kernel-doc markup
Partially revert "video: fbdev: amba-clcd: Retire elder CLCD driver"
fbcon: Fix global-out-of-bounds read in fbcon_get_font()
Fonts: Support FONT_EXTRA_WORDS macros for built-in fonts
fbdev, newport_con: Move FONT_EXTRA_WORDS macros into linux/font.h
Kernel threads intentionally do CLONE_FS in order to follow any changes
that 'init' does to set up the root directory (or cwd).
It is admittedly a bit odd, but it avoids the situation where 'init'
does some extensive setup to initialize the system environment, and then
we execute a usermode helper program, and it uses the original FS setup
from boot time that may be very limited and incomplete.
[ Both Al Viro and Eric Biederman point out that 'pivot_root()' will
follow the root regardless, since it fixes up other users of root (see
chroot_fs_refs() for details), but overmounting root and doing a
chroot() would not. ]
However, Vegard Nossum noticed that the CLONE_FS not only means that we
follow the root and current working directories, it also means we share
umask with whatever init changed it to. That wasn't intentional.
Just reset umask to the original default (0022) before actually starting
the usermode helper program.
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tetsuo Handa reports that splice() can return 0 before the real EOF, if
the data in the splice source pipe is an empty pipe buffer. That empty
pipe buffer case doesn't happen in any normal situation, but you can
trigger it by doing a write to a pipe that fails due to a page fault.
Tetsuo has a test-case to show the behavior:
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
int main(int argc, char *argv[])
{
const int fd = open("/tmp/testfile", O_WRONLY | O_CREAT, 0600);
int pipe_fd[2] = { -1, -1 };
pipe(pipe_fd);
write(pipe_fd[1], NULL, 4096);
/* This splice() should wait unless interrupted. */
return !splice(pipe_fd[0], NULL, fd, NULL, 65536, 0);
}
which results in
write(5, NULL, 4096) = -1 EFAULT (Bad address)
splice(4, NULL, 3, NULL, 65536, 0) = 0
and this can confuse splice() users into believing they have hit EOF
prematurely.
The issue was introduced when the pipe write code started pre-allocating
the pipe buffers before copying data from user space.
This is modified verion of Tetsuo's original patch.
Fixes: a194dfe6e6 ("pipe: Rearrange sequence in pipe_write() to preallocate slot")
Link:https://lore.kernel.org/linux-fsdevel/20201005121339.4063-1-penguin-kernel@I-love.SAKURA.ne.jp/
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Acked-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The AES code uses a 'br x7' as part of a function called by
a macro. That branch needs a bti_j as a target. This results
in a panic as seen below. Using x16 (or x17) with an indirect
branch keeps the target bti_c.
Bad mode in Synchronous Abort handler detected on CPU1, code 0x34000003 -- BTI
CPU: 1 PID: 265 Comm: cryptomgr_test Not tainted 5.8.11-300.fc33.aarch64 #1
pstate: 20400c05 (nzCv daif +PAN -UAO BTYPE=j-)
pc : aesbs_encrypt8+0x0/0x5f0 [aes_neon_bs]
lr : aesbs_xts_encrypt+0x48/0xe0 [aes_neon_bs]
sp : ffff80001052b730
aesbs_encrypt8+0x0/0x5f0 [aes_neon_bs]
__xts_crypt+0xb0/0x2dc [aes_neon_bs]
xts_encrypt+0x28/0x3c [aes_neon_bs]
crypto_skcipher_encrypt+0x50/0x84
simd_skcipher_encrypt+0xc8/0xe0
crypto_skcipher_encrypt+0x50/0x84
test_skcipher_vec_cfg+0x224/0x5f0
test_skcipher+0xbc/0x120
alg_test_skcipher+0xa0/0x1b0
alg_test+0x3dc/0x47c
cryptomgr_test+0x38/0x60
Fixes: 0e89640b64 ("crypto: arm64 - Use modern annotations for assembly functions")
Cc: <stable@vger.kernel.org> # 5.6.x-
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Suggested-by: Dave P Martin <Dave.Martin@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20201006163326.2780619-1-jeremy.linton@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAl97RWEACgkQ+7dXa6fL
C2sxNBAAhr1dnVfGHAV7mUVAv8BtNwY6B+mczIo48k53oiy0+Ngh83yrcdt2EkmY
s3JdbWq1rVlCps6zOOefKYfXG8FS2guFVDjKl9SaC6nYmxdEPnRmbW9mlhiFg/Na
xLnYVcJnuHw2ymisaRkARQn4w6F4CfEYBI9pbRpiw2d7vfD+Rziu49JMqVbTc2mF
g8tY0KPt81TouPlc//5BrY0dFat06gRbBsYcLmL/x/9aNofWg6F8dse9Evixgl3y
sY+ZwQkIxipYVyfuS9Z2UVhFTcYSvbTKWgvE08f9AK7iO6Y35hI4HIkZckIepgU0
rRNZY5AAq6Qb/kbGwIN27GDD/Ef8SqrW5NFdyRQykr8h1DIxGi5BlWRpVcpH1d9x
JI4fAp9dAcySOtusETrOBMvczz9wxB1HSe0tmrUP3lx0DLA484zdR8M+rQNPcEOK
M/x83hmIkMnmd3dH/eVNx0OwA35KVQ/eW79QsfDhnG2JVms4jwzqe/QfGpwXl2q9
SYNrlJZe6HjypNdWwMPZLswKzKe+7v9zKxY69TvsdKmqycQf2hVwsIxRmAr1GHEc
dQX3ag+LzS8elgqWRZ/NC4y8ojUgO73BhgL1DCrSgvu1UIzMC9bNSxrsdN+d3VSt
ZKzaFGQ9E9GDGSvfVJt/yRAb7kjQdeXchowWSGg804fPEzlGmds=
=dmWc
-----END PGP SIGNATURE-----
Merge tag 'rxrpc-fixes-20201005' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc: Miscellaneous fixes
Here are some miscellaneous rxrpc fixes:
(1) Fix the xdr encoding of the contents read from an rxrpc key.
(2) Fix a BUG() for a unsupported encoding type.
(3) Fix missing _bh lock annotations.
(4) Fix acceptance handling for an incoming call where the incoming call
is encrypted.
(5) The server token keyring isn't network namespaced - it belongs to the
server, so there's no need. Namespacing it means that request_key()
fails to find it.
(6) Fix a leak of the server keyring.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We got reports from GKE customers flows being reset by netfilter
conntrack unless nf_conntrack_tcp_be_liberal is set to 1.
Traces seemed to suggest ACK packet being dropped by the
packet capture, or more likely that ACK were received in the
wrong order.
wscale=7, SYN and SYNACK not shown here.
This ACK allows the sender to send 1871*128 bytes from seq 51359321 :
New right edge of the window -> 51359321+1871*128=51598809
09:17:23.389210 IP A > B: Flags [.], ack 51359321, win 1871, options [nop,nop,TS val 10 ecr 999], length 0
09:17:23.389212 IP B > A: Flags [.], seq 51422681:51424089, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 1408
09:17:23.389214 IP A > B: Flags [.], ack 51422681, win 1376, options [nop,nop,TS val 10 ecr 999], length 0
09:17:23.389253 IP B > A: Flags [.], seq 51424089:51488857, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 64768
09:17:23.389272 IP A > B: Flags [.], ack 51488857, win 859, options [nop,nop,TS val 10 ecr 999], length 0
09:17:23.389275 IP B > A: Flags [.], seq 51488857:51521241, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 32384
Receiver now allows to send 606*128=77568 from seq 51521241 :
New right edge of the window -> 51521241+606*128=51598809
09:17:23.389296 IP A > B: Flags [.], ack 51521241, win 606, options [nop,nop,TS val 10 ecr 999], length 0
09:17:23.389308 IP B > A: Flags [.], seq 51521241:51553625, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 32384
It seems the sender exceeds RWIN allowance, since 51611353 > 51598809
09:17:23.389346 IP B > A: Flags [.], seq 51553625:51611353, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 57728
09:17:23.389356 IP B > A: Flags [.], seq 51611353:51618393, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 7040
09:17:23.389367 IP A > B: Flags [.], ack 51611353, win 0, options [nop,nop,TS val 10 ecr 999], length 0
netfilter conntrack is not happy and sends RST
09:17:23.389389 IP A > B: Flags [R], seq 92176528, win 0, length 0
09:17:23.389488 IP B > A: Flags [R], seq 174478967, win 0, length 0
Now imagine ACK were delivered out of order and tcp_add_backlog() sets window based on wrong packet.
New right edge of the window -> 51521241+859*128=51631193
Normally TCP stack handles OOO packets just fine, but it
turns out tcp_add_backlog() does not. It can update the window
field of the aggregated packet even if the ACK sequence
of the last received packet is too old.
Many thanks to Alexandre Ferrieux for independently reporting the issue
and suggesting a fix.
Fixes: 4f693b55c3 ("tcp: implement coalescing on backlog queue")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Alexandre Ferrieux <alexandre.ferrieux@orange.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When get_registers() fails in set_ethernet_addr(),the uninitialized
value of node_id gets copied over as the address.
So, check the return value of get_registers().
If get_registers() executed successfully (i.e., it returns
sizeof(node_id)), copy over the MAC address using ether_addr_copy()
(instead of using memcpy()).
Else, if get_registers() failed instead, a randomly generated MAC
address is set as the MAC address instead.
Reported-by: syzbot+abbc768b560c84d92fd3@syzkaller.appspotmail.com
Tested-by: syzbot+abbc768b560c84d92fd3@syzkaller.appspotmail.com
Acked-by: Petko Manolov <petkan@nucleusys.com>
Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently data fin on data packet are not handled properly:
the 'rcv_data_fin_seq' field is interpreted as the last
sequence number carrying a valid data, but for data fin
packet with valid maps we currently store map_seq + map_len,
that is, the next value.
The 'write_seq' fields carries instead the value subseguent
to the last valid byte, so in mptcp_write_data_fin() we
never detect correctly the last DSS map.
Fixes: 7279da6145 ("mptcp: Use MPTCP-level flag for sending DATA_FIN")
Fixes: 1a49b2c2a5 ("mptcp: Handle incoming 32-bit DATA_FIN values")
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean says:
====================
Fix tail dropping watermarks for Ocelot switches
This series adds a missing division by 60, and a warning to prevent that
in the future.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
There is an upper bound to the value that a watermark may hold. That
upper bound is not immediately obvious during configuration, and it
might be possible to have accidental truncation.
Actually this has happened already, add a warning to prevent it from
happening again.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tail dropping is enabled for a port when:
1. A source port consumes more packet buffers than the watermark encoded
in SYS:PORT:ATOP_CFG.ATOP.
AND
2. Total memory use exceeds the consumption watermark encoded in
SYS:PAUSE_CFG:ATOP_TOT_CFG.
The unit of these watermarks is a 60 byte memory cell. That unit is
programmed properly into ATOP_TOT_CFG, but not into ATOP. Actually when
written into ATOP, it would get truncated and wrap around.
Fixes: a556c76adc ("net: mscc: Add initial Ocelot switch support")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The rcu_read_lock() is not supposed to lock the kernel_sendmsg() API
since it has the lock_sock() in qrtr_sendmsg() which will sleep. Hence,
fix it by excluding the locking for kernel_sendmsg().
While at it, let's also use radix_tree_deref_retry() to confirm the
validity of the pointer returned by radix_tree_deref_slot() and use
radix_tree_iter_resume() to resume iterating the tree properly before
releasing the lock as suggested by Doug.
Fixes: a7809ff90c ("net: qrtr: ns: Protect radix_tree_deref_slot() using rcu read locks")
Reported-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Alex Elder <elder@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit b0dbd97de1 ("platform/x86: asus-wmi: Add support for
SW_TABLET_MODE") added support for reporting SW_TABLET_MODE using the
Asus 0x00120063 WMI-device-id to see if various transformer models were
docked into their keyboard-dock (SW_TABLET_MODE=0) or if they were
being used as a tablet.
The new SW_TABLET_MODE support (naively?) assumed that non Transformer
devices would either not support the 0x00120063 WMI-device-id at all,
or would NOT set ASUS_WMI_DSTS_PRESENCE_BIT in their reply when querying
the device-id.
Unfortunately this is not true and we have received many bug reports about
this change causing the asus-wmi driver to always report SW_TABLET_MODE=1
on non Transformer devices. This causes libinput to think that these are
360 degree hinges style 2-in-1s folded into tablet-mode. Making libinput
suppress keyboard and touchpad events from the builtin keyboard and
touchpad. So effectively this causes the keyboard and touchpad to not work
on many non Transformer Asus models.
This commit fixes this by using the existing DMI based quirk mechanism in
asus-nb-wmi.c to allow using the 0x00120063 device-id for reporting
SW_TABLET_MODE on Transformer models and ignoring it on all other models.
Fixes: b0dbd97de1 ("platform/x86: asus-wmi: Add support for SW_TABLET_MODE")
Link: https://patchwork.kernel.org/patch/11780901/
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=209011
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1876997
Reported-by: Samuel Čavoj <samuel@cavoj.net>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
* Attempt #3 of enabling Tablet Mode reporting w/o regressions
* Improve battery recognition code in ASUS WMI driver
* Fix Kconfig dependency warning for Fujitsu and LG laptop drivers
* Add fixes in Thinkpad ACPI driver for _BCL method and NVRAM polling
* Fix power supply extended topology in Mellanox driver
* Fix memory leak in OLPC EC driver
* Avoid static struct device in Intel PMC core driver
* Add support for the touchscreen found in MPMAN Converter9 2-in-1
* Update MAINTAINERS to reflect the real state of affairs
The following is an automated git shortlog grouped by driver:
asus-nb-wmi:
- Revert "Do not load on Asus T100TA and T200TA"
asus-wmi:
- Add BATC battery name to the list of supported
intel_pmc_core:
- do not create a static struct device
intel-vbtn:
- Switch to an allow-list for SW_TABLET_MODE reporting
- Revert "Fix SW_TABLET_MODE always reporting 1 on the HP Pavilion 11 x360"
- Fix SW_TABLET_MODE always reporting 1 on the HP Pavilion 11 x360
MAINTAINERS:
- Add Mark Gross and Hans de Goede as x86 platform drivers maintainers
mlx-platform:
- Fix extended topology configuration for power supply units
pcengines-apuv2:
- Fix typo on define of AMD_FCH_GPIO_REG_GPIO55_DEVSLP0
Platform:
- fix kconfig dependency warning for FUJITSU_LAPTOP
- fix kconfig dependency warning for LG_LAPTOP
- OLPC: Fix memleak in olpc_ec_probe
thinkpad_acpi:
- re-initialize ACPI buffer size when reuse
- initialize tp_nvram_state variable
- fix underline length build warning
touchscreen_dmi:
- Add info for the MPMAN Converter9 2-in-1
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEqaflIX74DDDzMJJtb7wzTHR8rCgFAl97IFAACgkQb7wzTHR8
rCjE8Q/9HvCzXq26X6HdcGgsV/+Ri+Kf0LyEFwTRgJJEuWt3RcBUARseryH6W1Gd
KOPnAHTt3EIEiFE2hXHIZOaLnNuuCA5Vlp78S/kLDk4Mn9p3dtvy6WK3ab07/Nok
0K+1eQsqNhgO8+ALcnZW6Q1iTl/aY0qzCWzd7GB5b1H0CXCqnJHBw6EchzOOX0qW
5eh8dDXfcgnh+PxHibFPBw6NLOG2ZyGv7R0dpGTzBtMyGA0+74A24nXjyG5oIANY
S7izmdcyTAvIxtx9tf92VdfiIJFk9l1qI6mUAfxtRQGgnBiwg/EiFd52Ot05b/j/
1Iih3c4U/fqbUrmKi31plPKS3QSn5qx1fEv2hsWQy+k4Eun9lGEVO/nYa5hScQ4u
2WQe32Wi+a4MxZFsQ70NWc3/d6LIRLJ3bt9VrZYN/hcTd3m0ixK4jzKlm2iE8rij
sRP5L7hzceg68nW0e83t7L02V7+h0dcHkKpAQ+7V2Q2Ku9l3opCWI/NXaDAioRiV
704zLVOHYlJXDs4j3HPqiQd2ePKpw2XC7hbz7sVq/lTaG83sRIwRs3C5XAyilxVf
/1x8v65s1N93f3/pXZKWjPbu8+RDMbJJlIt4zxidv0dHwaWt9QKIuCU+4Y85C1I3
jN7OGg1niM+3eyXF++WX4t79L+6CqjAHMdw29E/JeBv0ezB2Ol4=
=OmqG
-----END PGP SIGNATURE-----
Merge tag 'platform-drivers-x86-v5.9-2' of git://git.infradead.org/linux-platform-drivers-x86
Pull x86 platform driver fixes from Andy Shevchenko:
"We have some fixes for Tablet Mode reporting in particular, that users
are complaining a lot about.
Summary:
- Attempt #3 of enabling Tablet Mode reporting w/o regressions
- Improve battery recognition code in ASUS WMI driver
- Fix Kconfig dependency warning for Fujitsu and LG laptop drivers
- Add fixes in Thinkpad ACPI driver for _BCL method and NVRAM polling
- Fix power supply extended topology in Mellanox driver
- Fix memory leak in OLPC EC driver
- Avoid static struct device in Intel PMC core driver
- Add support for the touchscreen found in MPMAN Converter9 2-in-1
- Update MAINTAINERS to reflect the real state of affairs"
* tag 'platform-drivers-x86-v5.9-2' of git://git.infradead.org/linux-platform-drivers-x86:
platform/x86: thinkpad_acpi: re-initialize ACPI buffer size when reuse
MAINTAINERS: Add Mark Gross and Hans de Goede as x86 platform drivers maintainers
platform/x86: intel-vbtn: Switch to an allow-list for SW_TABLET_MODE reporting
platform/x86: intel-vbtn: Revert "Fix SW_TABLET_MODE always reporting 1 on the HP Pavilion 11 x360"
platform/x86: intel_pmc_core: do not create a static struct device
platform/x86: mlx-platform: Fix extended topology configuration for power supply units
platform/x86: pcengines-apuv2: Fix typo on define of AMD_FCH_GPIO_REG_GPIO55_DEVSLP0
platform/x86: fix kconfig dependency warning for FUJITSU_LAPTOP
platform/x86: fix kconfig dependency warning for LG_LAPTOP
platform/x86: thinkpad_acpi: initialize tp_nvram_state variable
platform/x86: intel-vbtn: Fix SW_TABLET_MODE always reporting 1 on the HP Pavilion 11 x360
platform/x86: asus-wmi: Add BATC battery name to the list of supported
platform/x86: asus-nb-wmi: Revert "Do not load on Asus T100TA and T200TA"
platform/x86: touchscreen_dmi: Add info for the MPMAN Converter9 2-in-1
Documentation: laptops: thinkpad-acpi: fix underline length build warning
Platform: OLPC: Fix memleak in olpc_ec_probe
Pull networking fixes from David Miller:
1) Make sure SKB control block is in the proper state during IPSEC
ESP-in-TCP encapsulation. From Sabrina Dubroca.
2) Various kinds of attributes were not being cloned properly when we
build new xfrm_state objects from existing ones. Fix from Antony
Antony.
3) Make sure to keep BTF sections, from Tony Ambardar.
4) TX DMA channels need proper locking in lantiq driver, from Hauke
Mehrtens.
5) Honour route MTU during forwarding, always. From Maciej
Żenczykowski.
6) Fix races in kTLS which can result in crashes, from Rohit
Maheshwari.
7) Skip TCP DSACKs with rediculous sequence ranges, from Priyaranjan
Jha.
8) Use correct address family in xfrm state lookups, from Herbert Xu.
9) A bridge FDB flush should not clear out user managed fdb entries
with the ext_learn flag set, from Nikolay Aleksandrov.
10) Fix nested locking of netdev address lists, from Taehee Yoo.
11) Fix handling of 32-bit DATA_FIN values in mptcp, from Mat Martineau.
12) Fix r8169 data corruptions on RTL8402 chips, from Heiner Kallweit.
13) Don't free command entries in mlx5 while comp handler could still be
running, from Eran Ben Elisha.
14) Error flow of request_irq() in mlx5 is busted, due to an off by one
we try to free and IRQ never allocated. From Maor Gottlieb.
15) Fix leak when dumping netlink policies, from Johannes Berg.
16) Sendpage cannot be performed when a page is a slab page, or the page
count is < 1. Some subsystems such as nvme were doing so. Create a
"sendpage_ok()" helper and use it as needed, from Coly Li.
17) Don't leak request socket when using syncookes with mptcp, from
Paolo Abeni.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (111 commits)
net/core: check length before updating Ethertype in skb_mpls_{push,pop}
net: mvneta: fix double free of txq->buf
net_sched: check error pointer in tcf_dump_walker()
net: team: fix memory leak in __team_options_register
net: typhoon: Fix a typo Typoon --> Typhoon
net: hinic: fix DEVLINK build errors
net: stmmac: Modify configuration method of EEE timers
tcp: fix syn cookied MPTCP request socket leak
libceph: use sendpage_ok() in ceph_tcp_sendpage()
scsi: libiscsi: use sendpage_ok() in iscsi_tcp_segment_map()
drbd: code cleanup by using sendpage_ok() to check page for kernel_sendpage()
tcp: use sendpage_ok() to detect misused .sendpage
nvme-tcp: check page by sendpage_ok() before calling kernel_sendpage()
net: add WARN_ONCE in kernel_sendpage() for improper zero-copy send
net: introduce helper sendpage_ok() in include/linux/net.h
net: usb: pegasus: Proper error handing when setting pegasus' MAC address
net: core: document two new elements of struct net_device
netlink: fix policy dump leak
net/mlx5e: Fix race condition on nhe->n pointer in neigh update
net/mlx5e: Fix VLAN create flow
...
If someone calls setsockopt() twice to set a server key keyring, the first
keyring is leaked.
Fix it to return an error instead if the server key keyring is already set.
Fixes: 17926a7932 ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Signed-off-by: David Howells <dhowells@redhat.com>