Commit Graph

917631 Commits

Author SHA1 Message Date
Jeff Layton 829ad4db95 ceph: ceph_kick_flushing_caps needs the s_mutex
The mdsc->cap_dirty_lock is not held while walking the list in
ceph_kick_flushing_caps, which is not safe.

ceph_early_kick_flushing_caps does something similar, but the
s_mutex is held while it's called and I think that guards against
changes to the list.

Ensure we hold the s_mutex when calling ceph_kick_flushing_caps,
and add some clarifying comments.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-06-01 13:22:53 +02:00
Jeff Layton d67c72e6cc ceph: request expedited service on session's last cap flush
When flushing a lot of caps to the MDS's at once (e.g. for syncfs),
we can end up waiting a substantial amount of time for MDS replies, due
to the fact that it may delay some of them so that it can batch them up
together in a single journal transaction. This can lead to stalls when
calling sync or syncfs.

What we'd really like to do is request expedited service on the _last_
cap we're flushing back to the server. If the CHECK_CAPS_FLUSH flag is
set on the request and the current inode was the last one on the
session->s_cap_dirty list, then mark the request with
CEPH_CLIENT_CAPS_SYNC.

Note that this heuristic is not perfect. New inodes can race onto the
list after we've started flushing, but it does seem to fix some common
use cases.

URL: https://tracker.ceph.com/issues/44744
Reported-by: Jan Fajerski <jfajerski@suse.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-06-01 13:22:52 +02:00
Jeff Layton 1cf03a68e7 ceph: convert mdsc->cap_dirty to a per-session list
This is a per-sb list now, but that makes it difficult to tell when
the cap is the last dirty one associated with the session. Switch
this to be a per-session list, but continue using the
mdsc->cap_dirty_lock to protect the lists.

This list is only ever walked in ceph_flush_dirty_caps, so change that
to walk the sessions array and then flush the caps for inodes on each
session's list.

If the auth cap ever changes while the inode has dirty caps, then
move the inode to the appropriate session for the new auth_cap. Also,
ensure that we never remove an auth cap while the inode is still on the
s_cap_dirty list.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-06-01 13:22:52 +02:00
Yan, Zheng 6f05b30ea0 ceph: reset i_requested_max_size if file write is not wanted
write can stuck at waiting for larger max_size in following sequence of
events:

- client opens a file and writes to position 'A' (larger than unit of
  max size increment)
- client closes the file handle and updates wanted caps (not wanting
  file write caps)
- client opens and truncates the file, writes to position 'A' again.

At the 1st event, client set inode's requested_max_size to 'A'. At the
2nd event, mds removes client's writable range, but client does not reset
requested_max_size. At the 3rd event, client does not request max size
because requested_max_size is already larger than 'A'.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-06-01 13:22:52 +02:00
Jeff Layton 88828190f0 ceph: throw a warning if we destroy session with mutex still locked
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-06-01 13:22:52 +02:00
Jeff Layton dc3da0461c ceph: fix potential race in ceph_check_caps
Nothing ensures that session will still be valid by the time we
dereference the pointer. Take and put a reference.

In principle, we should always be able to get a reference here, but
throw a warning if that's ever not the case.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-06-01 13:22:52 +02:00
Jeff Layton 4fb5dda39c ceph: document what protects i_dirty_item and i_flushing_item
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-06-01 13:22:52 +02:00
Jeff Layton 7833323363 ceph: don't take i_ceph_lock in handle_cap_import
Just take it before calling it. This means we have to do a couple of
minor in-memory operations under the spinlock now, but those shouldn't
be an issue.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-06-01 13:22:52 +02:00
Jeff Layton 7391fba267 ceph: don't release i_ceph_lock in handle_cap_trunc
There's no reason to do this here. Just have the caller handle it.
Also, add a lockdep assertion.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-06-01 13:22:52 +02:00
Jeff Layton d7dbfb4f2b ceph: add comments for handle_cap_flush_ack logic
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-06-01 13:22:52 +02:00
Jeff Layton 681ac63488 ceph: split up __finish_cap_flush
This function takes a mdsc argument or ci argument, but if both are
passed in, it ignores the ci arg. Fortunately, nothing does that, but
there's no good reason to have the same function handle both cases.

Also, get rid of some branches and just use |= to set the wake_* vals.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-06-01 13:22:52 +02:00
Jeff Layton 0a454bdd50 ceph: reorganize __send_cap for less spinlock abuse
Get rid of the __releases annotation by breaking it up into two
functions: __prep_cap which is done under the spinlock and __send_cap
that is done outside it. Add new fields to cap_msg_args for the wake
boolean and old_xattr_buf pointer.

Nothing checks the return value from __send_cap, so make it void
return.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-06-01 13:22:51 +02:00
Xiubo Li 70c948206f ceph: add metadata perf metric support
Add a new "r_ended" field to struct ceph_mds_request and use that to
maintain the average latency of MDS requests.

URL: https://tracker.ceph.com/issues/43215
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-06-01 13:22:51 +02:00
Xiubo Li 97e27aaa9a ceph: add read/write latency metric support
Calculate the latency for OSD read requests. Add a new r_end_stamp
field to struct ceph_osd_request that will hold the time of that
the reply was received. Use that to calculate the RTT for each call,
and divide the sum of those by number of calls to get averate RTT.

Keep a tally of RTT for OSD writes and number of calls to track average
latency of OSD writes.

URL: https://tracker.ceph.com/issues/43215
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-06-01 13:22:51 +02:00
Xiubo Li 1af16d547f ceph: add caps perf metric for each superblock
Count hits and misses in the caps cache. If the client has all of
the necessary caps when a task needs references, then it's counted
as a hit. Any other situation is a miss.

URL: https://tracker.ceph.com/issues/43215
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-06-01 13:22:51 +02:00
Xiubo Li f9009efac4 ceph: add dentry lease metric support
For dentry leases, only count the hit/miss info triggered from the vfs
calls. For the cases like request reply handling and ceph_trim_dentries,
ignore them.

For now, these are only viewable using debugfs. Future patches will
allow the client to send the stats to the MDS.

The output looks like:

item          total           miss            hit
-------------------------------------------------
d_lease       11              7               141

URL: https://tracker.ceph.com/issues/43215
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-06-01 13:22:51 +02:00
Linus Torvalds 3d77e6a880 Linux 5.7 2020-05-31 16:49:15 -07:00
Joe Perches bdc48fa11e checkpatch/coding-style: deprecate 80-column warning
Yes, staying withing 80 columns is certainly still _preferred_.  But
it's not the hard limit that the checkpatch warnings imply, and other
concerns can most certainly dominate.

Increase the default limit to 100 characters.  Not because 100
characters is some hard limit either, but that's certainly a "what are
you doing" kind of value and less likely to be about the occasional
slightly longer lines.

Miscellanea:

 - to avoid unnecessary whitespace changes in files, checkpatch will no
   longer emit a warning about line length when scanning files unless
   --strict is also used

 - Add a bit to coding-style about alignment to open parenthesis

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-31 11:00:42 -07:00
Linus Torvalds 8fc984aedc A pile of x86 fixes:
- Prevent a memory leak in ioperm which was caused by the stupid
     assumption that the exit cleanup is always called for current, which is
     not the case when fork fails after taking a reference on the ioperm
     bitmap.
 
   - Fix an arithmething overflow in the DMA code on 32bit systems
 
   - Fill gaps in the xstate copy with defaults instead of leaving them
     uninitialized
 
   - Revert: o"Make __X32_SYSCALL_BIT be unsigned long" as it turned out
     that existing user space fails to build.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl7Tt8YTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYobtTEACukhGsuivgiTwltWuHcATqrcNbgHSu
 nnhuQrjJ8KJiF5O60nDztPAVzxD+Ww2tzuDnD1BLFDI9cEA5oPhzXf7kUuJvrYUK
 INY+OALPPpw2iWjmygIsEyw3Pzmnm6peRA4h5UZSZdFxdROGGwBeGYNxowuVWFiH
 X7Fa1J4QxTI7e2X3psDVz94bOnVTPRPAR2bNpX8K8Qs+Wn1FFO92LFU04EvJTCHe
 JdN73VAS+0o0qPlPMewiuyfxaHexc8eJySMdOiysPnGRy+vagyyMPOV2Kg0DD6bp
 caDxCXNjIxXlRExV6F75s8hnl42DwXzLSzY/G7L/HVJ5r3voqcREYtXHgfenl7Jg
 8o6tEi+qFduPJ6SuRjfjPBDBF4wJvcjgmCwJaPJbMkrg8p5jH9Xg35egmEMo9cF8
 JQa2RzWJTR9XUjuPAuHJZR6f9jnle01PCznmw7Mavoed82udW1Lo32+QnvWsx6Qq
 4uuV38FqK3lsVCfFjyZir9OB9DGeuT/NETs3WJuGW5QUnC1mqfvIYipL3BkxNMKP
 IBB7n5X2iCJ545JkydepXF2I+b/i8XhNcIwYMVoSbZzBKccwCZ7zxHFNj6YAWG+M
 TN77x/+lw5zbnxhL3YzK+fgPNLio/By4Zcpmq6uppaf9Ip67SJGVq22Ef3S0w8vG
 X1inh1zqLX9hsQ==
 =DmSb
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2020-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A pile of x86 fixes:

   - Prevent a memory leak in ioperm which was caused by the stupid
     assumption that the exit cleanup is always called for current,
     which is not the case when fork fails after taking a reference on
     the ioperm bitmap.

   - Fix an arithmething overflow in the DMA code on 32bit systems

   - Fill gaps in the xstate copy with defaults instead of leaving them
     uninitialized

   - Revert: "Make __X32_SYSCALL_BIT be unsigned long" as it turned out
     that existing user space fails to build"

* tag 'x86-urgent-2020-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ioperm: Prevent a memory leak when fork fails
  x86/dma: Fix max PFN arithmetic overflow on 32 bit systems
  copy_xstate_to_kernel(): don't leave parts of destination uninitialized
  x86/syscalls: Revert "x86/syscalls: Make __X32_SYSCALL_BIT be unsigned long"
2020-05-31 10:45:11 -07:00
Linus Torvalds 3d04282329 A single scheduler fix preventing a crash in NUMA balancing. The
current->mm check is not reliable as the mm might be temporary
 due to use_mm() in a kthread. Check for PF_KTHREAD explictely.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl7TtiATHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoR52EACe2TGyJ2k5raj86CD7tWqdTXnSJu9+
 Q8njzbfwwJWc9SR9jxJBd7H6VQ5Kyd71Yeyi0RcIjpC0CtxdR2ZzYl/mHSwPSuku
 TKFo1WgwajRnny6daoNuJMmVZKxaZfOVTtf8ekJOjUrWKOWIJyUb5wvcgstdL7Uz
 ZMPIYmL5TpqreRI0gLYIpPoZQoE/Cja0eB45H3JocD0s+o0BZJhUql/DuXB+SUWO
 ANVz2RZ4dNM8LHBfXdQrWTq5G6Ckhr9pm0o0lQPeEcKmypOrG0l9p2qeQV59pFgD
 QS0jsAqjrmtnfHvIdATUThE7oBfimf2NX1Nqmf1l1BmMiQnL0u3+yuvibAQj2/aA
 J2WoQXKsJHTKEcZ2DGmjksOrzeognDMqXgXR7hZztKkAoBvEpmiebqxMgTv6XaoF
 ZhF5NAx+DUngg2uQd65UW8dbFSg9yQh+73+wItRiBSuNB8ePweqgop7OZVv/hOa6
 MiFBaWAzSTugR2E2LBbLuwMsB+4lLyUaXo4TeM2NLUVGurtKO3HL6xgo7bII+l+X
 QNn/PaNweT6OD2kEJb66wYWVWxN7JsryDIHaJq6e/j7u7JmKCYAxd7XD92sKkVoS
 JNsCDzcUNRao/UI0jsirntvWy9bS9J+HOUaSF5N0AcDOuJeZSffm8AVyr0nlUI0y
 HGYFGiSqS0I3EA==
 =GfbN
 -----END PGP SIGNATURE-----

Merge tag 'sched-urgent-2020-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fix from Thomas Gleixner:
 "A single scheduler fix preventing a crash in NUMA balancing.

  The current->mm check is not reliable as the mm might be temporary due
  to use_mm() in a kthread. Check for PF_KTHREAD explictly"

* tag 'sched-urgent-2020-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Don't NUMA balance for kthreads
2020-05-31 10:43:17 -07:00
Linus Torvalds 19835b1ba6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:
 "Another week, another set of bug fixes:

   1) Fix pskb_pull length in __xfrm_transport_prep(), from Xin Long.

   2) Fix double xfrm_state put in esp{4,6}_gro_receive(), also from Xin
      Long.

   3) Re-arm discovery timer properly in mac80211 mesh code, from Linus
      Lüssing.

   4) Prevent buffer overflows in nf_conntrack_pptp debug code, from
      Pablo Neira Ayuso.

   5) Fix race in ktls code between tls_sw_recvmsg() and
      tls_decrypt_done(), from Vinay Kumar Yadav.

   6) Fix crashes on TCP fallback in MPTCP code, from Paolo Abeni.

   7) More validation is necessary of untrusted GSO packets coming from
      virtualization devices, from Willem de Bruijn.

   8) Fix endianness of bnxt_en firmware message length accesses, from
      Edwin Peer.

   9) Fix infinite loop in sch_fq_pie, from Davide Caratti.

  10) Fix lockdep splat in DSA by setting lockless TX in netdev features
      for slave ports, from Vladimir Oltean.

  11) Fix suspend/resume crashes in mlx5, from Mark Bloch.

  12) Fix use after free in bpf fmod_ret, from Alexei Starovoitov.

  13) ARP retransmit timer guard uses wrong offset, from Hongbin Liu.

  14) Fix leak in inetdev_init(), from Yang Yingliang.

  15) Don't try to use inet hash and unhash in l2tp code, results in
      crashes. From Eric Dumazet"

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (77 commits)
  l2tp: add sk_family checks to l2tp_validate_socket
  l2tp: do not use inet_hash()/inet_unhash()
  net: qrtr: Allocate workqueue before kernel_bind
  mptcp: remove msk from the token container at destruction time.
  mptcp: fix race between MP_JOIN and close
  mptcp: fix unblocking connect()
  net/sched: act_ct: add nat mangle action only for NAT-conntrack
  devinet: fix memleak in inetdev_init()
  virtio_vsock: Fix race condition in virtio_transport_recv_pkt
  drivers/net/ibmvnic: Update VNIC protocol version reporting
  NFC: st21nfca: add missed kfree_skb() in an error path
  neigh: fix ARP retransmit timer guard
  bpf, selftests: Add a verifier test for assigning 32bit reg states to 64bit ones
  bpf, selftests: Verifier bounds tests need to be updated
  bpf: Fix a verifier issue when assigning 32bit reg states to 64bit ones
  bpf: Fix use-after-free in fmod_ret check
  net/mlx5e: replace EINVAL in mlx5e_flower_parse_meta()
  net/mlx5e: Fix MLX5_TC_CT dependencies
  net/mlx5e: Properly set default values when disabling adaptive moderation
  net/mlx5e: Fix arch depending casting issue in FEC
  ...
2020-05-31 10:16:53 -07:00
Eric Dumazet d9a81a2252 l2tp: add sk_family checks to l2tp_validate_socket
syzbot was able to trigger a crash after using an ISDN socket
and fool l2tp.

Fix this by making sure the UDP socket is of the proper family.

BUG: KASAN: slab-out-of-bounds in setup_udp_tunnel_sock+0x465/0x540 net/ipv4/udp_tunnel.c:78
Write of size 1 at addr ffff88808ed0c590 by task syz-executor.5/3018

CPU: 0 PID: 3018 Comm: syz-executor.5 Not tainted 5.7.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x188/0x20d lib/dump_stack.c:118
 print_address_description.constprop.0.cold+0xd3/0x413 mm/kasan/report.c:382
 __kasan_report.cold+0x20/0x38 mm/kasan/report.c:511
 kasan_report+0x33/0x50 mm/kasan/common.c:625
 setup_udp_tunnel_sock+0x465/0x540 net/ipv4/udp_tunnel.c:78
 l2tp_tunnel_register+0xb15/0xdd0 net/l2tp/l2tp_core.c:1523
 l2tp_nl_cmd_tunnel_create+0x4b2/0xa60 net/l2tp/l2tp_netlink.c:249
 genl_family_rcv_msg_doit net/netlink/genetlink.c:673 [inline]
 genl_family_rcv_msg net/netlink/genetlink.c:718 [inline]
 genl_rcv_msg+0x627/0xdf0 net/netlink/genetlink.c:735
 netlink_rcv_skb+0x15a/0x410 net/netlink/af_netlink.c:2469
 genl_rcv+0x24/0x40 net/netlink/genetlink.c:746
 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
 netlink_unicast+0x537/0x740 net/netlink/af_netlink.c:1329
 netlink_sendmsg+0x882/0xe10 net/netlink/af_netlink.c:1918
 sock_sendmsg_nosec net/socket.c:652 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:672
 ____sys_sendmsg+0x6e6/0x810 net/socket.c:2352
 ___sys_sendmsg+0x100/0x170 net/socket.c:2406
 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2439
 do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
 entry_SYSCALL_64_after_hwframe+0x49/0xb3
RIP: 0033:0x45ca29
Code: 0d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 db b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007effe76edc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000004fe1c0 RCX: 000000000045ca29
RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000005
RBP: 000000000078bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 000000000000094e R14: 00000000004d5d00 R15: 00007effe76ee6d4

Allocated by task 3018:
 save_stack+0x1b/0x40 mm/kasan/common.c:49
 set_track mm/kasan/common.c:57 [inline]
 __kasan_kmalloc mm/kasan/common.c:495 [inline]
 __kasan_kmalloc.constprop.0+0xbf/0xd0 mm/kasan/common.c:468
 __do_kmalloc mm/slab.c:3656 [inline]
 __kmalloc+0x161/0x7a0 mm/slab.c:3665
 kmalloc include/linux/slab.h:560 [inline]
 sk_prot_alloc+0x223/0x2f0 net/core/sock.c:1612
 sk_alloc+0x36/0x1100 net/core/sock.c:1666
 data_sock_create drivers/isdn/mISDN/socket.c:600 [inline]
 mISDN_sock_create+0x272/0x400 drivers/isdn/mISDN/socket.c:796
 __sock_create+0x3cb/0x730 net/socket.c:1428
 sock_create net/socket.c:1479 [inline]
 __sys_socket+0xef/0x200 net/socket.c:1521
 __do_sys_socket net/socket.c:1530 [inline]
 __se_sys_socket net/socket.c:1528 [inline]
 __x64_sys_socket+0x6f/0xb0 net/socket.c:1528
 do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
 entry_SYSCALL_64_after_hwframe+0x49/0xb3

Freed by task 2484:
 save_stack+0x1b/0x40 mm/kasan/common.c:49
 set_track mm/kasan/common.c:57 [inline]
 kasan_set_free_info mm/kasan/common.c:317 [inline]
 __kasan_slab_free+0xf7/0x140 mm/kasan/common.c:456
 __cache_free mm/slab.c:3426 [inline]
 kfree+0x109/0x2b0 mm/slab.c:3757
 kvfree+0x42/0x50 mm/util.c:603
 __free_fdtable+0x2d/0x70 fs/file.c:31
 put_files_struct fs/file.c:420 [inline]
 put_files_struct+0x248/0x2e0 fs/file.c:413
 exit_files+0x7e/0xa0 fs/file.c:445
 do_exit+0xb04/0x2dd0 kernel/exit.c:791
 do_group_exit+0x125/0x340 kernel/exit.c:894
 get_signal+0x47b/0x24e0 kernel/signal.c:2739
 do_signal+0x81/0x2240 arch/x86/kernel/signal.c:784
 exit_to_usermode_loop+0x26c/0x360 arch/x86/entry/common.c:161
 prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
 do_syscall_64+0x6b1/0x7d0 arch/x86/entry/common.c:305
 entry_SYSCALL_64_after_hwframe+0x49/0xb3

The buggy address belongs to the object at ffff88808ed0c000
 which belongs to the cache kmalloc-2k of size 2048
The buggy address is located 1424 bytes inside of
 2048-byte region [ffff88808ed0c000, ffff88808ed0c800)
The buggy address belongs to the page:
page:ffffea00023b4300 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0
flags: 0xfffe0000000200(slab)
raw: 00fffe0000000200 ffffea0002838208 ffffea00015ba288 ffff8880aa000e00
raw: 0000000000000000 ffff88808ed0c000 0000000100000001 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88808ed0c480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff88808ed0c500: 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88808ed0c580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                         ^
 ffff88808ed0c600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88808ed0c680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc

Fixes: 6b9f34239b ("l2tp: fix races in tunnel creation")
Fixes: fd558d186d ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Chapman <jchapman@katalix.com>
Cc: Guillaume Nault <gnault@redhat.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 21:56:55 -07:00
Eric Dumazet 02c71b144c l2tp: do not use inet_hash()/inet_unhash()
syzbot recently found a way to crash the kernel [1]

Issue here is that inet_hash() & inet_unhash() are currently
only meant to be used by TCP & DCCP, since only these protocols
provide the needed hashinfo pointer.

L2TP uses a single list (instead of a hash table)

This old bug became an issue after commit 6102365876
("bpf: Add new cgroup attach type to enable sock modifications")
since after this commit, sk_common_release() can be called
while the L2TP socket is still considered 'hashed'.

general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 0 PID: 7063 Comm: syz-executor654 Not tainted 5.7.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:inet_unhash+0x11f/0x770 net/ipv4/inet_hashtables.c:600
Code: 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e dd 04 00 00 48 8d 7d 08 44 8b 73 08 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 55 05 00 00 48 8d 7d 14 4c 8b 6d 08 48 b8 00 00
RSP: 0018:ffffc90001777d30 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff88809a6df940 RCX: ffffffff8697c242
RDX: 0000000000000001 RSI: ffffffff8697c251 RDI: 0000000000000008
RBP: 0000000000000000 R08: ffff88809f3ae1c0 R09: fffffbfff1514cc1
R10: ffffffff8a8a6607 R11: fffffbfff1514cc0 R12: ffff88809a6df9b0
R13: 0000000000000007 R14: 0000000000000000 R15: ffffffff873a4d00
FS:  0000000001d2b880(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000006cd090 CR3: 000000009403a000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 sk_common_release+0xba/0x370 net/core/sock.c:3210
 inet_create net/ipv4/af_inet.c:390 [inline]
 inet_create+0x966/0xe00 net/ipv4/af_inet.c:248
 __sock_create+0x3cb/0x730 net/socket.c:1428
 sock_create net/socket.c:1479 [inline]
 __sys_socket+0xef/0x200 net/socket.c:1521
 __do_sys_socket net/socket.c:1530 [inline]
 __se_sys_socket net/socket.c:1528 [inline]
 __x64_sys_socket+0x6f/0xb0 net/socket.c:1528
 do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
 entry_SYSCALL_64_after_hwframe+0x49/0xb3
RIP: 0033:0x441e29
Code: e8 fc b3 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffdce184148 EFLAGS: 00000246 ORIG_RAX: 0000000000000029
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000441e29
RDX: 0000000000000073 RSI: 0000000000000002 RDI: 0000000000000002
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000402c30 R14: 0000000000000000 R15: 0000000000000000
Modules linked in:
---[ end trace 23b6578228ce553e ]---
RIP: 0010:inet_unhash+0x11f/0x770 net/ipv4/inet_hashtables.c:600
Code: 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e dd 04 00 00 48 8d 7d 08 44 8b 73 08 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 55 05 00 00 48 8d 7d 14 4c 8b 6d 08 48 b8 00 00
RSP: 0018:ffffc90001777d30 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff88809a6df940 RCX: ffffffff8697c242
RDX: 0000000000000001 RSI: ffffffff8697c251 RDI: 0000000000000008
RBP: 0000000000000000 R08: ffff88809f3ae1c0 R09: fffffbfff1514cc1
R10: ffffffff8a8a6607 R11: fffffbfff1514cc0 R12: ffff88809a6df9b0
R13: 0000000000000007 R14: 0000000000000000 R15: ffffffff873a4d00
FS:  0000000001d2b880(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000006cd090 CR3: 000000009403a000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Fixes: 0d76751fad ("l2tp: Add L2TPv3 IP encapsulation (no UDP) support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Chapman <jchapman@katalix.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Reported-by: syzbot+3610d489778b57cc8031@syzkaller.appspotmail.com
2020-05-30 21:55:16 -07:00
Chris Lew c6e08d6251 net: qrtr: Allocate workqueue before kernel_bind
A null pointer dereference in qrtr_ns_data_ready() is seen if a client
opens a qrtr socket before qrtr_ns_init() can bind to the control port.
When the control port is bound, the ENETRESET error will be broadcasted
and clients will close their sockets. This results in DEL_CLIENT
packets being sent to the ns and qrtr_ns_data_ready() being called
without the workqueue being allocated.

Allocate the workqueue before setting sk_data_ready and binding to the
control port. This ensures that the work and workqueue structs are
allocated and initialized before qrtr_ns_data_ready can be called.

Fixes: 0c2204a4ad ("net: qrtr: Migrate nameservice to kernel from userspace")
Signed-off-by: Chris Lew <clew@codeaurora.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 21:43:13 -07:00
David S. Miller e237659cfe Merge branch 'mptcp-a-bunch-of-fixes'
Paolo Abeni says:

====================
mptcp: a bunch of fixes

This patch series pulls together a few bugfixes for MPTCP bug observed while
doing stress-test with apache bench - forced to use MPTCP and multiple
subflows.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 21:39:13 -07:00
Paolo Abeni c5c79763fa mptcp: remove msk from the token container at destruction time.
Currently we remote the msk from the token container only
via mptcp_close(). The MPTCP master socket can be destroyed
also via other paths (e.g. if not yet accepted, when shutting
down the listener socket). When we hit the latter scenario,
dangling msk references are left into the token container,
leading to memory corruption and/or UaF.

This change addresses the issue by moving the token removal
into the msk destructor.

Fixes: 79c0949e9a ("mptcp: Add key generation and token tree")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 21:39:13 -07:00
Paolo Abeni 10f6d46c94 mptcp: fix race between MP_JOIN and close
If a MP_JOIN subflow completes the 3whs while another
CPU is closing the master msk, we can hit the
following race:

CPU1                                    CPU2

close()
 mptcp_close
                                        subflow_syn_recv_sock
                                         mptcp_token_get_sock
                                         mptcp_finish_join
                                          inet_sk_state_load
  mptcp_token_destroy
  inet_sk_state_store(TCP_CLOSE)
  __mptcp_flush_join_list()
                                          mptcp_sock_graft
                                          list_add_tail
  sk_common_release
   sock_orphan()
 <socket free>

The MP_JOIN socket will be leaked. Additionally we can hit
UaF for the msk 'struct socket' referenced via the 'conn'
field.

This change try to address the issue introducing some
synchronization between the MP_JOIN 3whs and mptcp_close
via the join_list spinlock. If we detect the msk is closing
the MP_JOIN socket is closed, too.

Fixes: f296234c98 ("mptcp: Add handling of incoming MP_JOIN requests")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 21:39:13 -07:00
Paolo Abeni 41be81a8d3 mptcp: fix unblocking connect()
Currently unblocking connect() on MPTCP sockets fails frequently.
If mptcp_stream_connect() is invoked to complete a previously
attempted unblocking connection, it will still try to create
the first subflow via __mptcp_socket_create(). If the 3whs is
completed and the 'can_ack' flag is already set, the latter
will fail with -EINVAL.

This change addresses the issue checking for pending connect and
delegating the completion to the first subflow. Additionally
do msk addresses and sk_state changes only when needed.

Fixes: 2303f994b3 ("mptcp: Associate MPTCP context with TCP socket")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 21:39:13 -07:00
wenxu 05aa69e5cb net/sched: act_ct: add nat mangle action only for NAT-conntrack
Currently add nat mangle action with comparing invert and orig tuple.
It is better to check IPS_NAT_MASK flags first to avoid non necessary
memcmp for non-NAT conntrack.

Signed-off-by: wenxu <wenxu@ucloud.cn>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 17:57:58 -07:00
Yang Yingliang 1b49cd71b5 devinet: fix memleak in inetdev_init()
When devinet_sysctl_register() failed, the memory allocated
in neigh_parms_alloc() should be freed.

Fixes: 20e61da7ff ("ipv4: fail early when creating netdev named all or default")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 17:48:56 -07:00
Jia He 8692cefc43 virtio_vsock: Fix race condition in virtio_transport_recv_pkt
When client on the host tries to connect(SOCK_STREAM, O_NONBLOCK) to the
server on the guest, there will be a panic on a ThunderX2 (armv8a server):

[  463.718844] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[  463.718848] Mem abort info:
[  463.718849]   ESR = 0x96000044
[  463.718852]   EC = 0x25: DABT (current EL), IL = 32 bits
[  463.718853]   SET = 0, FnV = 0
[  463.718854]   EA = 0, S1PTW = 0
[  463.718855] Data abort info:
[  463.718856]   ISV = 0, ISS = 0x00000044
[  463.718857]   CM = 0, WnR = 1
[  463.718859] user pgtable: 4k pages, 48-bit VAs, pgdp=0000008f6f6e9000
[  463.718861] [0000000000000000] pgd=0000000000000000
[  463.718866] Internal error: Oops: 96000044 [#1] SMP
[...]
[  463.718977] CPU: 213 PID: 5040 Comm: vhost-5032 Tainted: G           O      5.7.0-rc7+ #139
[  463.718980] Hardware name: GIGABYTE R281-T91-00/MT91-FS1-00, BIOS F06 09/25/2018
[  463.718982] pstate: 60400009 (nZCv daif +PAN -UAO)
[  463.718995] pc : virtio_transport_recv_pkt+0x4c8/0xd40 [vmw_vsock_virtio_transport_common]
[  463.718999] lr : virtio_transport_recv_pkt+0x1fc/0xd40 [vmw_vsock_virtio_transport_common]
[  463.719000] sp : ffff80002dbe3c40
[...]
[  463.719025] Call trace:
[  463.719030]  virtio_transport_recv_pkt+0x4c8/0xd40 [vmw_vsock_virtio_transport_common]
[  463.719034]  vhost_vsock_handle_tx_kick+0x360/0x408 [vhost_vsock]
[  463.719041]  vhost_worker+0x100/0x1a0 [vhost]
[  463.719048]  kthread+0x128/0x130
[  463.719052]  ret_from_fork+0x10/0x18

The race condition is as follows:
Task1                                Task2
=====                                =====
__sock_release                       virtio_transport_recv_pkt
  __vsock_release                      vsock_find_bound_socket (found sk)
    lock_sock_nested
    vsock_remove_sock
    sock_orphan
      sk_set_socket(sk, NULL)
    sk->sk_shutdown = SHUTDOWN_MASK
    ...
    release_sock
                                    lock_sock
                                       virtio_transport_recv_connecting
                                         sk->sk_socket->state (panic!)

The root cause is that vsock_find_bound_socket can't hold the lock_sock,
so there is a small race window between vsock_find_bound_socket() and
lock_sock(). If __vsock_release() is running in another task,
sk->sk_socket will be set to NULL inadvertently.

This fixes it by checking sk->sk_shutdown(suggested by Stefano) after
lock_sock since sk->sk_shutdown is set to SHUTDOWN_MASK under the
protection of lock_sock_nested.

Signed-off-by: Jia He <justin.he@arm.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-30 17:44:01 -07:00
Linus Torvalds ffeb595d84 powerpc fixes for 5.7 #6
A fix for the recent change to how we restore non-volatile GPRs, which broke our
 emulation of reading from the DSCR (Data Stream Control Register).
 
 And a fix for the recent rewrite of interrupt/syscall exit in C, we need to
 exclude KCOV from that code, otherwise it can lead to unrecoverable faults.
 
 Thanks to:
   Daniel Axtens.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl7SY4gTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgLrdD/9E6AuIXrHcQvsPg9wIdSBHTgZnM50R
 GD/9N21qL/426jlpIA2hWhpyDtNevxk6TsVe67JJV6XgbYkxe0vgwV28zhefYVV0
 YmUiP/BfRktvyn5jR1HkOOj/9vXoz15mcUwfkTgQjQORSjuzIRhml7JZzeJL6YSa
 i3EWUbAlzPJ5BFKE0XzspPxkpaIhcAPqiP55rSrrYcex4/xoaReHUyKESa1sin4X
 YYuUBHI6Ze2OqhFhEVHSime+j8qUOxU4l4/3oJC8I00xDAX++S69cnGrlISGB+Pe
 sDmMIyi7O89QojMNS6z9vGQ8milUqLBvTNY2IKPam7APZeyGQm95oKAD3rRx+L9+
 lsiJfV2X23Lq6ZfZhQe0bHB8n0SxIjFYogC+SYmHEtiLO20+FNsdXSH6UaU3F8QU
 YSgYxda41dgAhMDInEIt5D1OjGRk705b9rtIPeHGDdw/vPrwuuDnlzqmgAFG9ahv
 x/Q0IvZAHgV+ZIiMNsQ2Qu9gJb7z9WQ7VB7j5KkWHS4q2Ja03uYhrRDjOOGe8sca
 j87Jgfu99vQhN/YQwmoTJZJlCd9guEcUdgQCGCLyiD7ywl0xCQ1OUxO2RQfu1H5V
 bmdOJFPam+sQhg4Clq8EmHzgOuaMpOqdJcDYm/7LV5w9g0qrsfZQ/EqTS3F/alrv
 9fHeX2gIHaKPSQ==
 =TXe7
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.7-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - a fix for the recent change to how we restore non-volatile GPRs,
   which broke our emulation of reading from the DSCR (Data Stream
   Control Register).

 - a fix for the recent rewrite of interrupt/syscall exit in C, we need
   to exclude KCOV from that code, otherwise it can lead to
   unrecoverable faults.

Thanks to Daniel Axtens.

* tag 'powerpc-5.7-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64s: Disable sanitisers for C syscall/interrupt entry/exit code
  powerpc/64s: Fix restore of NV GPRs after facility unavailable exception
2020-05-30 12:28:44 -07:00
Linus Torvalds 900db15047 GPIO fixes for the v5.7 series:
- Fix proving of mvebu chips without PWM
 - Fix errorpath on ida_get_simple() on the exar driver
 - Notify userspace properly about line status changes
   when flags are changed on lines.
 - Fix a sleeping while holding spinlock in the mellanox
   driver.
 - Fix return value of the PXA and Kona probe calls.
 - Fix IRQ locking of open drain lines, it is fine to
   have IRQs on open drain lines flagged for output.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAl7SPXIACgkQQRCzN7AZ
 XXNnZA//SEFQgIuzRXOVImqgPEa9Rb3v592UtHTWeDXkdm3pUbmW3FEDmL8f5oaB
 naBCoCpcUSq8TTi3Ja60jeHt6clbBHUGYc//CgQWDaf5AjShUWLdcUOi6N3rmcZv
 5h7csbHvuVBWVgPWhop6yOmhfYoTMbOnyOWGJDHn8f1oOmZnlfGJsCansHraZWBZ
 ExQtmTiHlz89r8HIMvnRU1TloqTENqOca123WvBvJ1rdPaN7yKQkQaHl0p/hQWZb
 sm3KgTI3PjBU8Ys5nitAUG0Dn85TPzHTnnOURmWRL4yQAg2pCwLUcDvMKeGHlUz8
 ygLes55qdOKyiR1wsb2OlbL+LrDOOuU2v13Q6j7aNszDWYq3mxRDRaMqPIfSRFG9
 JeYMqVdvkpk6cZsOQtvltjZlO8kD66OfffHrZfR1Amt7qqna+MoB09IXZ50X15d5
 Cg6Ek8A/WKMPYILVgWzgkmm0o+lbtNpmB0TIEEFLwMwatllQNN2wW6/4OzT1wFau
 kWY/TdlM29dgH/+Iqv6m1tJj8ONsMyTiY4Iogi6Irm8L8IVcYFaSAPVNJRZZOhso
 wt2Lvh3MYEtzWPQdSTA7HuFmYohaCR6DbDjAodhLbnZdhrOeiJPljTvx1FnkiU46
 G+bXSmzxkhzFgZnsRydnnutjugdlVYoU3ar/Sdh0WqyiAjc5D1g=
 =NXVY
 -----END PGP SIGNATURE-----

Merge tag 'gpio-v5.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
 "Here are some (very) late fixes for GPIO, none of them very serious
  except the one tagged for stable for enabling IRQ on open drain lines:

   - Fix probing of mvebu chips without PWM

   - Fix error path on ida_get_simple() on the exar driver

   - Notify userspace properly about line status changes when flags are
     changed on lines.

   - Fix a sleeping while holding spinlock in the mellanox driver.

   - Fix return value of the PXA and Kona probe calls.

   - Fix IRQ locking of open drain lines, it is fine to have IRQs on
     open drain lines flagged for output"

* tag 'gpio-v5.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: fix locking open drain IRQ lines
  gpio: bcm-kona: Fix return value of bcm_kona_gpio_probe()
  gpio: pxa: Fix return value of pxa_gpio_probe()
  gpio: mlxbf2: Fix sleeping while holding spinlock
  gpiolib: notify user-space about line status changes after flags are set
  gpio: exar: Fix bad handling for ida_simple_get error path
  gpio: mvebu: Fix probing for chips without PWM
2020-05-30 12:26:21 -07:00
Thomas Falcon 784688993e drivers/net/ibmvnic: Update VNIC protocol version reporting
VNIC protocol version is reported in big-endian format, but it
is not byteswapped before logging. Fix that, and remove version
comparison as only one protocol version exists at this time.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29 17:20:59 -07:00
Chuhong Yuan 3decabdc71 NFC: st21nfca: add missed kfree_skb() in an error path
st21nfca_tm_send_atr_res() misses to call kfree_skb() in an error path.
Add the missed function call to fix it.

Fixes: 1892bf844e ("NFC: st21nfca: Adding P2P support to st21nfca in Initiator & Target mode")
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29 17:04:14 -07:00
Hangbin Liu 96d10d5b19 neigh: fix ARP retransmit timer guard
In commit 19e16d220f ("neigh: support smaller retrans_time settting")
we add more accurate control for ARP and NS. But for ARP I forgot to
update the latest guard in neigh_timer_handler(), then the next
retransmit would be reset to jiffies + HZ/2 if we set the retrans_time
less than 500ms. Fix it by setting the time_before() check to HZ/100.

IPv6 does not have this issue.

Reported-by: Jianwen Ji <jiji@redhat.com>
Fixes: 19e16d220f ("neigh: support smaller retrans_time settting")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29 16:56:53 -07:00
David S. Miller f2b122d3d6 mlx5-fixes-2020-05-28
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl7Ra5oACgkQSD+KveBX
 +j6TVQgAr/LLADIfJJxxMWEU9hjEdBx1H0ojxkh/OZRqwkkKsffmeeQm2Ovei1aY
 VRhyj6kr2UGYu970dtDzEVTZePdk+Hgapl3v6cIXM4IMPKGsUkYrhcaopyub1vRV
 g/3uClIbVmgtxX7NBSJL1e6KGvrGQJ9OFEaNw8gHrN5+ba4qW9PIhMnoybQ0S2r4
 eE39I4ZDWU6SAWqbVNTmiwpycPb6mtTRu3zfeX04+4hprQTPpyJ5rKtvKABLVRoY
 upkOSPCxUVIzspci/FlfXNzXb08MelVl2o7LUzklQVIiQpnsnY+hBg+y8qTMQ4TA
 ErFI/P2vl+UwOwqKV7RwB78q3l9A4w==
 =hk6W
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-fixes-2020-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2020-05-28

This series introduces some fixes to mlx5 driver.

v1->v2:
 - Fix bad sha1, Jakub.
 - Added one more patch by Pablo.
   net/mlx5e: replace EINVAL in mlx5e_flower_parse_meta()

Nothing major, the only patch worth mentioning is the suspend/resume crash
fix by adding the missing pci device handlers, the fix is very straight
forward and as Dexuan already expressed, the patch is important for Azure
users to avoid crash on VM hibernation, patch is marked for -stable v4.6
below.

Conflict note:
('net/mlx5e: Fix MLX5_TC_CT dependencies') has a trivial one line conflict
with current net-next, which can be resolved by simply using the line from
net-next.

Please pull and let me know if there is any problem.

For -stable v4.6
 ('net/mlx5: Fix crash upon suspend/resume')

For -stable v5.6
 ('net/mlx5e: replace EINVAL in mlx5e_flower_parse_meta()')
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29 16:31:22 -07:00
Linus Torvalds 86852175b0 ARM: SoC fixes for v5.7
This time there is one fix for the error path in the mediatek cmdq driver
 (used by their video driver) and a couple of devicetree fixes, mostly
 for 32-bit ARM, and fairly harmless:
 
 - On OMAP2 there were a few regressions in the ethernet drivers,
   one of them leading to an external abort trap
 
 - One Raspberry Pi version had a misconfigured LED
 
 - Interrupts on Broadcom NSP were slightly misconfigured
 
 - One i.MX6q board had issues with graphics mode setting
 
 - On mmp3 there are some minor fixes that were submitted for
   v5.8 with a cc:stable tag, so I ended up picking them up
   here as well
 
 - The Mediatek Video Codec needs to run at a higher frequency
   than configured originally
 
 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAl7Rin0ACgkQmmx57+YA
 GNlD7A//ehGZi+AinBWiCupunm26EekqJE2hp8gZMCYAil5jm3Djk4oib6xHFSIB
 +z6H0gXfi/cqeMNkch3aPW0Rhqm6JhvOpkVmBEzwXSj1W9TQz6Xvhy58Aa3E1h/A
 2FyoC4pm9hKOlamaEUkaOLblMhLRla1fpgewmB6qGPNWZx6gNX45cxawqn9t4M7/
 tjDa9s+Df8YxeS08thK6PVwZ3zdVQLFfhuZR0/tCpBgikYSuK+e9W9R7IealPPs5
 iPTBeFpJVbtapaEaXHyjlyUn+0TdiEq5nrFwv6vMc08kIrZl9Ej9DSel0pq+mEDT
 b/v5J3FBgcq20kTBfeU2fYw5Sr6bZ7Ojm0ZMnFou60CPX7R18FDbgl8+ISOK8wmV
 1m+0V+tNYg7Mxu0IXBTHuZ98oMH3YeFmmYUOUIp5wmxHH/wsAePUxroRQXDZHtBA
 XEAN4qOztNtNrz1+iKwe03tls50baF0oey0N6AqH+yH6DvBbbIRYZxgAvErdh8YU
 67hbKnhYpK7gsGeCJ/f056TKUizsX923pU6e5XapdG9bNXLm5lgC2dSA2iPJjws/
 ppcbQjQTef1j2MPs97Al0Q3wbPTiq4dM0na1B//iyBZOAQf3niZPaUdbPngbNCra
 FZmVhdKkhI7EAIBAS+9haK/v6RKrMJytJd6z6o4GTRL0CElahdw=
 =E8Pg
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-fixes-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
 "This time there is one fix for the error path in the mediatek cmdq
  driver (used by their video driver) and a couple of devicetree fixes,
  mostly for 32-bit ARM, and fairly harmless:

   - On OMAP2 there were a few regressions in the ethernet drivers, one
     of them leading to an external abort trap

   - One Raspberry Pi version had a misconfigured LED

   - Interrupts on Broadcom NSP were slightly misconfigured

   - One i.MX6q board had issues with graphics mode setting

   - On mmp3 there are some minor fixes that were submitted for v5.8
     with a cc:stable tag, so I ended up picking them up here as well

   - The Mediatek Video Codec needs to run at a higher frequency than
     configured originally"

* tag 'armsoc-fixes-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  ARM: dts: mmp3: Drop usb-nop-xceiv from HSIC phy
  ARM: dts: mmp3-dell-ariel: Fix the SPI devices
  ARM: dts: mmp3: Use the MMP3 compatible string for /clocks
  ARM: dts: bcm: HR2: Fix PPI interrupt types
  ARM: dts: bcm2835-rpi-zero-w: Fix led polarity
  ARM: dts/imx6q-bx50v3: Set display interface clock parents
  soc: mediatek: cmdq: return send msg error code
  arm64: dts: mt8173: fix vcodec-enc clock
  ARM: dts: Fix wrong mdio clock for dm814x
  ARM: dts: am437x: fix networking on boards with ksz9031 phy
  ARM: dts: am57xx: fix networking on boards with ksz9031 phy
2020-05-29 16:10:07 -07:00
David S. Miller f9e0ce3ddc Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
pull-request: bpf 2020-05-29

The following pull-request contains BPF updates for your *net* tree.

We've added 6 non-merge commits during the last 7 day(s) which contain
a total of 4 files changed, 55 insertions(+), 34 deletions(-).

The main changes are:

1) minor verifier fix for fmod_ret progs, from Alexei.

2) af_xdp overflow check, from Bjorn.

3) minor verifier fix for 32bit assignment, from John.

4) powerpc has non-overlapping addr space, from Petr.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-29 15:59:08 -07:00
Linus Torvalds e2fce151d2 Cache tiering and cap handling fixups, both marked for stable.
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAl7RKm4THGlkcnlvbW92
 QGdtYWlsLmNvbQAKCRBKf944AhHzi6BJB/4pz7N1K3sqs3OXHsHHnMnpTmxV5lU3
 4pXDivwESypxJKBDZ96qgSNMGgL9XpxChfA/LCYVy92LvIbjr9vrUh9386Q2arqw
 nRe4kTiN7Y8HkLb47GmqzCQdxgGVC35OZJZQzdM5y9rVEH9nbEUHWhsvCHYUR8Cb
 Ndm7hT6QzLRTQzlUhu0lPfLc84R0Hl5aFJNkA7enbXL7s9yfTYRf9+zcl+8VOI09
 X01OOxsOVNoQUzhTn2Y+SDFLr5N7CNtW7UN17S6sCiiA0XgodxeWmnxl2aaVMG+z
 VbsXQPr9ma4gYaD7BjzqaPEQqpgoTrmNqPkrzSzZbFHRc+GC3S5PiLwU
 =TOVq
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-5.7-rc8' of git://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "Cache tiering and cap handling fixups, both marked for stable"

* tag 'ceph-for-5.7-rc8' of git://github.com/ceph/ceph-client:
  ceph: flush release queue when handling caps for unknown inode
  libceph: ignore pool overlay and cache logic on redirects
2020-05-29 13:59:54 -07:00
Linus Torvalds 835e36b119 Fix the previous, flawed gfs2_find_jhead commit
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEEJZs3krPW0xkhLMTc1b+f6wMTZToFAl7RI6UUHGFncnVlbmJh
 QHJlZGhhdC5jb20ACgkQ1b+f6wMTZTpV8A/8DxTfUVzH+S8fXS6nEfJ2Q8soLeGa
 JE8ZalUmcc8G6R/hPekZbcV4NVN03PlfSMh6Jnr5o5Zz6mDsksC2Lh+i0legsm2Y
 /QPj5N/vnNbEANBtz2BBMRl8VRWyqh9wBP0UuErv+bw39EyUNRVKvRkw04gxYMdw
 kpHl8EFICsIqWcXM+1dzWTTVFlz4dXRFgglMOoFYBdx45H1uUUNx5FiU2WRsS107
 WLsWQ3znEK6iqmYfG0KLkmIuEQKUodfQ4IJX5BkrNyck+1UbSQkWJFlBsMzLOiMX
 XmmnSyGmfk8FOvb1NXk7BzlZBPSF1xt55QIeLjd0sWIyEAnqx4lTz/CRA7WCBkXo
 qCLD2EaUi1RQUNItGjnq2hPmtv6hlA2zusvh5kC2I6ojTJaYcU5Sr0jFARzONbCE
 dKJLmh3RoVA63tt4lFF7DYqWI+AXt1j50aq4CbV0GzoGYaQ4UHMtlyTUFEiVtoGO
 A4tYI23UCvJe0ozduCbSkAv8o9zHEyboIBMDlPbASKLtLMkwTxOQgJ+vyjljHgS+
 vDWvRw7auosQoLxF31bhyJlYWirNUz0SNKVveGawpBQxfXNr3CKubiDORJwZIp5s
 vZLvQ0f4CqC0sx/25cnqDRFRJaNZcxvEXNBUMuN9v2713IzU5+WzRlmykxj17yfb
 B4gBml3MurCwPFA=
 =rKvo
 -----END PGP SIGNATURE-----

Merge tag 'gfs2-v5.7-rc7.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 fix from Andreas Gruenbacher:
 "Fix the previous, flawed gfs2_find_jhead commit"

* tag 'gfs2-v5.7-rc7.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  gfs2: Even more gfs2_find_jhead fixes
2020-05-29 13:58:13 -07:00
Linus Torvalds 4f23460cfa Ensure __cpu_up() returns an error if cpu_online() is false after
waiting for completion on cpu_running.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl7RNTMACgkQa9axLQDI
 XvFU+A//Q805+HR9CHmy9PvIVNgyfWmB5IReXY9LVC0q9Enk6izOoI0CYDI/3X19
 v7ASxnXfzx4TGUjra1uvEwKkBeODifZbgOfmwa6XQjr9R5gAF8gVWomswx3wqoUB
 qf1bA6VJUjlOKmEs1EASY8TXd15EoYA30Xw0E9UHGflWfVwMRoDCYrl7XCR32qGM
 xVR3AhkxvVrsDM2aaV4tT7sG2+ypfoV+j/AX6HaTmEjL8bKTweu8oQkeLXNGSV7c
 LHD07vM/RGUbHN6sgSZBPU7zzGCVyr0UqtPKZXaBg5hLkbYHiY9E4xUH2Z3+MosC
 MSGvqi3V/kSkkeDYfURhOo6Li4q8dvRW7fLcy3rFA/ZJVK25KJkm6D+nyOMkQJhJ
 IEHz1iavthdsVCL2qVVHKvE1J1S/kc7bQ5ebfcyHpU5TFokDzrux1j+Qvy2v7cgd
 EPRba7yTx+LxDDAuozataHoTjbgz3RRC20ficMnvZnSk9BcFxiKRF5KkkM/C+Hf9
 br1c4O3LqI1kCVBoqe8GXjfQheEKom60gUjqbR28PP1vSIjoea0zhP3bXSX5/lzV
 uv4u0KlIS1wN5qMaLeZD7bo+9mCrwOi/uu3lZDiderlkQdSqhjzBR7kvR9y1WhBy
 SMqkCIc99dALJRDZrlUA3ImC2dF+nI+jWhuHFQbdxcNhpuK1BGo=
 =468t
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fix from Catalin Marinas:
 "Ensure __cpu_up() returns an error if cpu_online() is false after
  waiting for completion on cpu_running"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64/kernel: Fix return value when cpu_online() fails in __cpu_up()
2020-05-29 13:51:52 -07:00
Linus Torvalds ef4531be68 Merge branch 'parisc-5.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fix from Helge Deller:
 "Fix a kernel panic at boot time for some HP-PARISC machines"

* 'parisc-5.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Fix kernel panic in mem_init()
2020-05-29 13:50:31 -07:00
Linus Torvalds b58f2140ea IOMMU Fixes for Linux v5.7-rc7
Including:
 
 	- Two compile test fixes for issues introduced during the
 	  5.7-rc1 merge window.
 
 	- A fix for a reference count leak in an error path of
 	  iommu_group_alloc().
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAl7RWq0ACgkQK/BELZcB
 GuOOtQ//cue8YWh3c3wpiSYLSqFETqojpNh8qCG1XZLjbqR67Ysfio0JWO0CvG68
 X/KLJ0fXQoHj6ifVLo693ylQsRiAYC6wABzltT5MLmEpgy9r3lIR4WJd6F8zjGYm
 wy6am2QulKR05+JtEZag7gIhQqtfc6UoLnceEJpfb+eqzIMma0V3l11vWBWbeaEH
 pARisxqLkooi8nosrScSjLnL8pzMYbekKKoEiB+U9wcDM/bOJThdRNnmYePXef+O
 oR+qGbIKyOpvBuLGcfvvRHn/p+GPizzHZW6O8VYBbG3s6fLr7jtB2Y0yUtD1rbwZ
 1PbxmVfMqxHM45I8bJLx2R7cOWtZN767e5jplk8sSdurg0CtQluh90Zt9w+u+MkG
 SVXCBUTBoUqbnKDed4VMfB7TcBIpBOp4XEt20E410EayQrhp10Pcs75ZA7ydFjFd
 Z3H7fVjTmxvBn1VNK/oTSOOHgNTG3V5c184zqjmxUnO+SB0z96Po4t2b3jlU4H84
 H/uHBoqTpH2CN3U/dCbxc0LyWkZUB6aRCtxdTwuc3tqCY3oHefU2GurEEDt24F5Z
 a1NZi6ezmP2Qilu10bYFcUq84S/ZkOy05dh7rAq5kh3KWTSc608vNx16vJAL1RyM
 9lF+GKm1q9gLa7JXgO43XgISnbMLeIo4PwqSNEMOkDRMj9R/Lm4=
 =JtbT
 -----END PGP SIGNATURE-----

Merge tag 'iommu-fixes-v5.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu fixes from Joerg Roedel:

 - Two build fixes for issues introduced during the merge window

 - A fix for a reference count leak in an error path of
   iommu_group_alloc()

* tag 'iommu-fixes-v5.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu: Fix reference count leak in iommu_group_alloc.
  x86: Hide the archdata.iommu field behind generic IOMMU_API
  ia64: Hide the archdata.iommu field behind generic IOMMU_API
2020-05-29 13:41:33 -07:00
Linus Torvalds 75574f1212 block-5.7-2020-05-29
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl7RbAcQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpjytD/9ptRsnVCTj2H7dFFcGzMaN6OUBGJqSpCHm
 5XnZEcdhKPCHl8iDxlPBKT++X8CQd0vW6fBHO1UAfZvbDsyRR3/oGJ6kO9k/4e9d
 UpG1N3lpebnvbonoXamriUs94J+r5FQ45wRRbO6eARMzpcg1Cf1EjGSz/7WAeYgV
 zW/wv9cDPwCK2sS/fAeLW7kQNkn65nxUehyNdwlYjv1FuYDrQG2etEgVMFb2VLSz
 QkYd5ftZHjNNLUCi/XTaO+j0zAcVRFPo2RZXkyk1iuYzcVc7E/FtMoQp1Bx5plbA
 0GKldQkoG6KOCh1D5WRX5vPVDmAR9Xs/XF8yDo/WohqpVppMVWj4cyvqe2ae/OhQ
 5xlz2JWhFRdUxUmADMHibeQMmmd6QNHvuzjR9HUnAzKW0FPJLEzETcWGRDTDfsN1
 XRT3QbvcZ0Ks8ewVhTXJrolkR9oAtzb/P1z0Xq4QHbqSUMBC2qaJWYVvjM5ZUC2U
 53YDXWa+SAb/G/loJVmr8qPcGfQtjWPOBc5/kz+PifjgmzR8eTgOEGvQFhsjJHCw
 HSM9X41boathhJt1nt/YLYkCA1zgvXjb/F+tTDsu4aau1Skhxd1foDQML5+oGlKA
 eS2hF4V3pyZ0r3X17Y7ykiKpBAHx5cqksFd44lfl4772WiQIM/rrEHgOFhMjh6Bi
 mG4Sl6vOvA==
 =GfOk
 -----END PGP SIGNATURE-----

Merge tag 'block-5.7-2020-05-29' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "Two small fixes:

   - Revert a block change that mixed up the return values for non-mq
     devices

   - NVMe poll race fix"

* tag 'block-5.7-2020-05-29' of git://git.kernel.dk/linux-block:
  Revert "block: end bio with BLK_STS_AGAIN in case of non-mq devs and REQ_NOWAIT"
  nvme-pci: avoid race between nvme_reap_pending_cqes() and nvme_poll()
2020-05-29 13:39:26 -07:00
Linus Torvalds 6ff64d2537 Third RDMA 5.7 rc pull request
A few bug fixes:
 
 - Incorrect error unwind in qib and pvrdma
 
 - User triggerable NULL pointer crash in mlx5 with ODP prefetch
 
 - syzkaller RCU race in uverbs
 
 - Rare double free crash in ipoib
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl7RDU8ACgkQOG33FX4g
 mxrhnQ/8DLbe0xnNpEoHjllcb2CfEIQQQoHgoXZU6NF0EaXwXiM4G2SwVVP4cftN
 7oSGcvrp0YnBrGHp10unu3UX0IUpaw867jNyNC6QKj5RPinHR6IUjJYC4PugAtoc
 RQbyb5Y6Y0PEIhn15vhCOTEwvRDgu9GlXEe6ks0W5143rDF0L96Pr8BJC34HRGUp
 sx+aAkNCT9guuWUNBn5u3XDsboo+XjbeyDwywe9dzCmLKatWEd+y2XdPORGu4OvQ
 +7jtYTj/nGe4wWB3q5uIrFbD2eml8TcGStLKtF4hcijVk2L09xKyIL9dKHWuqc/0
 0tzD2iYvIh31BEIloucC2UXAY2gBSp8t03yDxruuAtDWXsxoS5ic/4Lo2H/mKIoy
 AYIcFJSMr05ujukl6jAHALAUBKLCsYY4Emwk14V970r6ltQNhnC9eps3E4/vbwRf
 SrhlaVFaMUfHdClNwc5mVloo/XJ6AbkTMFJLHUmcxYBXLT/42/0iNfctW03sFZ+K
 0lq0oGbAA52DHy6kUZ122n0b4B8no/W7CP6cvSiQGgAszUcLG9q+o+U/9VPZmBtG
 nfXzMv6QMHxASKCbetqDJ8WqI6LgzYkWJxOVe7DvpeqHLAFxqNYBh3y6by2lm4Dh
 gfJt1CMH4qFtXbjP1HfMV9RFVj74PJgQZfI/ILmh9kPLeXKVy5s=
 =uF5R
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "Nothing profound here, just a last set of long standing bug fixes:

   - Incorrect error unwind in qib and pvrdma

   - User triggerable NULL pointer crash in mlx5 with ODP prefetch

   - syzkaller RCU race in uverbs

   - Rare double free crash in ipoib"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  IB/ipoib: Fix double free of skb in case of multicast traffic in CM mode
  RDMA/core: Fix double destruction of uobject
  RDMA/pvrdma: Fix missing pci disable in pvrdma_pci_probe()
  RDMA/mlx5: Fix NULL pointer dereference in destroy_prefetch_work
  IB/qib: Call kobject_put() when kobject_init_and_add() fails
2020-05-29 13:35:45 -07:00
John Fastabend cf66c29bd7 bpf, selftests: Add a verifier test for assigning 32bit reg states to 64bit ones
Added a verifier test for assigning 32bit reg states to
64bit where 32bit reg holds a constant value of 0.

Without previous kernel verifier.c fix, the test in
this patch will fail.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/159077335867.6014.2075350327073125374.stgit@john-Precision-5820-Tower
2020-05-29 13:34:06 -07:00
John Fastabend e3effcdfe0 bpf, selftests: Verifier bounds tests need to be updated
After previous fix for zero extension test_verifier tests #65 and #66 now
fail. Before the fix we can see the alu32 mov op at insn 10

10: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0)
    R1_w=invP(id=0,
              smin_value=4294967168,smax_value=4294967423,
              umin_value=4294967168,umax_value=4294967423,
              var_off=(0x0; 0x1ffffffff),
              s32_min_value=-2147483648,s32_max_value=2147483647,
              u32_min_value=0,u32_max_value=-1)
    R10=fp0 fp-8_w=mmmmmmmm
10: (bc) w1 = w1
11: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0)
    R1_w=invP(id=0,
              smin_value=0,smax_value=2147483647,
              umin_value=0,umax_value=4294967295,
              var_off=(0x0; 0xffffffff),
              s32_min_value=-2147483648,s32_max_value=2147483647,
              u32_min_value=0,u32_max_value=-1)
    R10=fp0 fp-8_w=mmmmmmmm

After the fix at insn 10 because we have 's32_min_value < 0' the following
step 11 now has 'smax_value=U32_MAX' where before we pulled the s32_max_value
bound into the smax_value as seen above in 11 with smax_value=2147483647.

10: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0)
    R1_w=inv(id=0,
             smin_value=4294967168,smax_value=4294967423,
             umin_value=4294967168,umax_value=4294967423,
             var_off=(0x0; 0x1ffffffff),
             s32_min_value=-2147483648, s32_max_value=2147483647,
             u32_min_value=0,u32_max_value=-1)
    R10=fp0 fp-8_w=mmmmmmmm
10: (bc) w1 = w1
11: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0)
    R1_w=inv(id=0,
             smin_value=0,smax_value=4294967295,
             umin_value=0,umax_value=4294967295,
             var_off=(0x0; 0xffffffff),
             s32_min_value=-2147483648, s32_max_value=2147483647,
             u32_min_value=0, u32_max_value=-1)
    R10=fp0 fp-8_w=mmmmmmmm

The fall out of this is by the time we get to the failing instruction at
step 14 where previously we had the following:

14: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0)
    R1_w=inv(id=0,
             smin_value=72057594021150720,smax_value=72057594029539328,
             umin_value=72057594021150720,umax_value=72057594029539328,
             var_off=(0xffffffff000000; 0xffffff),
             s32_min_value=-16777216,s32_max_value=-1,
             u32_min_value=-16777216,u32_max_value=-1)
    R10=fp0 fp-8_w=mmmmmmmm
14: (0f) r0 += r1

We now have,

14: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0)
    R1_w=inv(id=0,
             smin_value=0,smax_value=72057594037927935,
             umin_value=0,umax_value=72057594037927935,
             var_off=(0x0; 0xffffffffffffff),
             s32_min_value=-2147483648,s32_max_value=2147483647,
             u32_min_value=0,u32_max_value=-1)
    R10=fp0 fp-8_w=mmmmmmmm
14: (0f) r0 += r1

In the original step 14 'smin_value=72057594021150720' this trips the logic
in the verifier function check_reg_sane_offset(),

 if (smin >= BPF_MAX_VAR_OFF || smin <= -BPF_MAX_VAR_OFF) {
	verbose(env, "value %lld makes %s pointer be out of bounds\n",
		smin, reg_type_str[type]);
	return false;
 }

Specifically, the 'smin <= -BPF_MAX_VAR_OFF' check. But with the fix
at step 14 we have bounds 'smin_value=0' so the above check is not tripped
because BPF_MAX_VAR_OFF=1<<29.

We have a smin_value=0 here because at step 10 the smaller smin_value=0 means
the subtractions at steps 11 and 12 bring the smin_value negative.

11: (17) r1 -= 2147483584
12: (17) r1 -= 2147483584
13: (77) r1 >>= 8

Then the shift clears the top bit and smin_value is set to 0. Note we still
have the smax_value in the fixed code so any reads will fail. An alternative
would be to have reg_sane_check() do both smin and smax value tests.

To fix the test we can omit the 'r1 >>=8' at line 13. This will change the
err string, but keeps the intention of the test as suggseted by the title,
"check after truncation of boundary-crossing range". If the verifier logic
changes a different value is likely to be thrown in the error or the error
will no longer be thrown forcing this test to be examined. With this change
we see the new state at step 13.

13: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0)
    R1_w=invP(id=0,
              smin_value=-4294967168,smax_value=127,
              umin_value=0,umax_value=18446744073709551615,
              s32_min_value=-2147483648,s32_max_value=2147483647,
              u32_min_value=0,u32_max_value=-1)
    R10=fp0 fp-8_w=mmmmmmmm

Giving the expected out of bounds error, "value -4294967168 makes map_value
pointer be out of bounds" However, for unpriv case we see a different error
now because of the mixed signed bounds pointer arithmatic. This seems OK so
I've only added the unpriv_errstr for this. Another optino may have been to
do addition on r1 instead of subtraction but I favor the approach above
slightly.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/159077333942.6014.14004320043595756079.stgit@john-Precision-5820-Tower
2020-05-29 13:34:06 -07:00
John Fastabend 3a71dc366d bpf: Fix a verifier issue when assigning 32bit reg states to 64bit ones
With the latest trunk llvm (llvm 11), I hit a verifier issue for
test_prog subtest test_verif_scale1.

The following simplified example illustrate the issue:
    w9 = 0  /* R9_w=inv0 */
    r8 = *(u32 *)(r1 + 80)  /* __sk_buff->data_end */
    r7 = *(u32 *)(r1 + 76)  /* __sk_buff->data */
    ......
    w2 = w9 /* R2_w=inv0 */
    r6 = r7 /* R6_w=pkt(id=0,off=0,r=0,imm=0) */
    r6 += r2 /* R6_w=inv(id=0) */
    r3 = r6 /* R3_w=inv(id=0) */
    r3 += 14 /* R3_w=inv(id=0) */
    if r3 > r8 goto end
    r5 = *(u32 *)(r6 + 0) /* R6_w=inv(id=0) */
       <== error here: R6 invalid mem access 'inv'
    ...
  end:

In real test_verif_scale1 code, "w9 = 0" and "w2 = w9" are in
different basic blocks.

In the above, after "r6 += r2", r6 becomes a scalar, which eventually
caused the memory access error. The correct register state should be
a pkt pointer.

The inprecise register state starts at "w2 = w9".
The 32bit register w9 is 0, in __reg_assign_32_into_64(),
the 64bit reg->smax_value is assigned to be U32_MAX.
The 64bit reg->smin_value is 0 and the 64bit register
itself remains constant based on reg->var_off.

In adjust_ptr_min_max_vals(), the verifier checks for a known constant,
smin_val must be equal to smax_val. Since they are not equal,
the verifier decides r6 is a unknown scalar, which caused later failure.

The llvm10 does not have this issue as it generates different code:
    w9 = 0  /* R9_w=inv0 */
    r8 = *(u32 *)(r1 + 80)  /* __sk_buff->data_end */
    r7 = *(u32 *)(r1 + 76)  /* __sk_buff->data */
    ......
    r6 = r7 /* R6_w=pkt(id=0,off=0,r=0,imm=0) */
    r6 += r9 /* R6_w=pkt(id=0,off=0,r=0,imm=0) */
    r3 = r6 /* R3_w=pkt(id=0,off=0,r=0,imm=0) */
    r3 += 14 /* R3_w=pkt(id=0,off=14,r=0,imm=0) */
    if r3 > r8 goto end
    ...

To fix the above issue, we can include zero in the test condition for
assigning the s32_max_value and s32_min_value to their 64-bit equivalents
smax_value and smin_value.

Further, fix the condition to avoid doing zero extension bounds checks
when s32_min_value <= 0. This could allow for the case where bounds
32-bit bounds (-1,1) get incorrectly translated to (0,1) 64-bit bounds.
When in-fact the -1 min value needs to force U32_MAX bound.

Fixes: 3f50f132d8 ("bpf: Verifier, do explicit ALU32 bounds tracking")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/159077331983.6014.5758956193749002737.stgit@john-Precision-5820-Tower
2020-05-29 13:34:06 -07:00
Linus Torvalds 411ea6790e MMC core:
- Fix use-after-free issue for rpmb partition
 
 MMC host
  - Fix quirk for broken CQE support
 -----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCgA1FiEEugLDXPmKSktSkQsV/iaEJXNYjCkFAl7Q5BEXHHVsZi5oYW5z
 c29uQGxpbmFyby5vcmcACgkQ/iaEJXNYjCludxAA06hfX/T9D15zMR/OX9BExyh0
 Zyx6aEQN7zRZBBkC0Hl21Hz5g+wws7ezwJkxWH4tzk7XsTT1F1shnAaf2dyG0qdM
 Ntbsd69wM5amCus7l2TesvL3epC15qWXCQcnv1RWFWz/ZYG6z3/gvv9p2mTh0TIm
 X92sbNptHgsIHdHD02GWYBkWPf3kF4b0UMp0Nc35WAzh4hbhRbNQw/7ktzmMyUD7
 iYjt2liho0PlAwrIEWpgw9whUyUBgnhO7TuvZHxwthmXWTRlDY6i9OqOKoF9FjyW
 hiOho+A5etE6GG//GgHW70yi9oziYPbmVPa9onzKBmBYQY9pilTcNm3apggwE+tl
 22QKZEi8yohajeo8x0xNvHtNEtIovYUEQe9yqXK411KPe4mz4zQbdufE0JV9Uh0s
 RCh9J+y0ge2T0o9hALN/GD2dgJHOMORebKlq+hFt6wyyKEziYGVqmzkbu4Kvojnq
 UYDy0h/ZR0etiNev/OACuft7Hs+Q7VYne1xLKTSM1VHB0LZtvk8uvqNRpCvlZ9js
 qrcCsRa7A5PCEkTvLTKdmYWcOFzmF0srr74hz0JDYuCy7OX9nEE5m/WcYOfQhUQf
 yEmiwcIFTmhfxgPBv4l3pYsf2QuRL3DtrW+wPf4Jxl9RuQjTRAgBKA0XUmIK9v9N
 5YxgDGxN6SulAqBTiG8=
 =JSSa
 -----END PGP SIGNATURE-----

Merge tag 'mmc-v5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:
 "MMC core:
   - Fix use-after-free issue for rpmb partition

  MMC host:
   - Fix quirk for broken CQE support"

* tag 'mmc-v5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: block: Fix use-after-free issue for rpmb
  mmc: sdhci: Fix SDHCI_QUIRK_BROKEN_CQE
2020-05-29 13:34:01 -07:00