Commit Graph

901440 Commits

Author SHA1 Message Date
Brian Foster 211683b21d xfs: rework collapse range into an atomic operation
The collapse range operation uses a unique transaction and ilock
cycle for the hole punch and each extent shift iteration of the
overall operation. While the hole punch is safe as a separate
operation due to the iolock, cycling the ilock after each extent
shift is risky w.r.t. concurrent operations, similar to insert range.

To avoid this problem, make collapse range atomic with respect to
ilock. Hold the ilock across the entire operation, replace the
individual transactions with a single rolling transaction sequence
and finish dfops on each iteration to perform pending frees and roll
the transaction. Remove the unnecessary quota reservation as
collapse range can only ever merge extents (and thus remove extent
records and potentially free bmap blocks). The dfops call
automatically relogs the inode to keep it moving in the log. This
guarantees that nothing else can change the extent mapping of an
inode while a collapse range operation is in progress.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-03-02 20:55:51 -08:00
Brian Foster dd87f87d87 xfs: rework insert range into an atomic operation
The insert range operation uses a unique transaction and ilock cycle
for the extent split and each extent shift iteration of the overall
operation. While this works, it risks racing with other
operations in subtle ways such as COW writeback modifying an extent
tree in the middle of a shift operation.

To avoid this problem, make insert range atomic with respect to
ilock. Hold the ilock across the entire operation, replace the
individual transactions with a single rolling transaction sequence
and relog the inode to keep it moving in the log. This guarantees
that nothing else can change the extent mapping of an inode while
an insert range operation is in progress.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-03-02 20:55:51 -08:00
Brian Foster b73df17e4c xfs: open code insert range extent split helper
The insert range operation currently splits the extent at the target
offset in a separate transaction and lock cycle from the one that
shifts extents. In preparation for reworking insert range into an
atomic operation, lift the code into the caller so it can be easily
condensed to a single rolling transaction and lock cycle and
eliminate the helper. No functional changes.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Allison Collins <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-03-02 20:55:51 -08:00
Jules Irenge daebba1b36 xfs: Add missing annotation to xfs_ail_check()
Sparse reports a warning at xfs_ail_check()

warning: context imbalance in xfs_ail_check() - unexpected unlock

The root cause is the missing annotation at xfs_ail_check()

Add the missing __must_hold(&ailp->ail_lock) annotation

Signed-off-by: Jules Irenge <jbi.octave@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-03-02 20:55:51 -08:00
Qian Cai 4982bff1ac xfs: fix an undefined behaviour in _da3_path_shift
In xfs_da3_path_shift() "blk" can be assigned to state->path.blk[-1] if
state->path.active is 1 (which is a valid state) when it tries to add an
entry to a single dir leaf block and then to shift forward to see if
there's a sibling block that would be a better place to put the new
entry. This causes a UBSAN warning given negative array indices are
undefined behavior in C. In practice the warning is entirely harmless
given that "blk" is never dereferenced in this case, but it is still
better to fix up the warning and slightly improve the code.

 UBSAN: Undefined behaviour in fs/xfs/libxfs/xfs_da_btree.c:1989:14
 index -1 is out of range for type 'xfs_da_state_blk_t [5]'
 Call trace:
  dump_backtrace+0x0/0x2c8
  show_stack+0x20/0x2c
  dump_stack+0xe8/0x150
  __ubsan_handle_out_of_bounds+0xe4/0xfc
  xfs_da3_path_shift+0x860/0x86c [xfs]
  xfs_da3_node_lookup_int+0x7c8/0x934 [xfs]
  xfs_dir2_node_addname+0x2c8/0xcd0 [xfs]
  xfs_dir_createname+0x348/0x38c [xfs]
  xfs_create+0x6b0/0x8b4 [xfs]
  xfs_generic_create+0x12c/0x1f8 [xfs]
  xfs_vn_mknod+0x3c/0x4c [xfs]
  xfs_vn_create+0x34/0x44 [xfs]
  do_last+0xd4c/0x10c8
  path_openat+0xbc/0x2f4
  do_filp_open+0x74/0xf4
  do_sys_openat2+0x98/0x180
  __arm64_sys_openat+0xf8/0x170
  do_el0_svc+0x170/0x240
  el0_sync_handler+0x150/0x250
  el0_sync+0x164/0x180

Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Qian Cai <cai@lca.pw>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-03-02 20:55:51 -08:00
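
Note on 4982bff1ac: the message describes a pointer being formed from index -1 even though it is never dereferenced. The listing does not show the patch itself, so the following is only a minimal sketch of the general pattern, assuming a hypothetical `demo_path` structure and helper name: test the level before computing the element address, so no out-of-range index is ever formed.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_DEPTH 5   /* hypothetical; mirrors the '[5]' in the UBSAN report */

struct demo_blk { int blkno; };

struct demo_path {
    int active;                        /* number of valid entries in blk[] */
    struct demo_blk blk[DEMO_DEPTH];
};

/*
 * Walk upward from the level just above the bottom block and return the
 * first block with a non-zero blkno, or NULL if there is none.
 * With active == 1 the starting level is -1, so the loop body never runs
 * and &p->blk[-1] is never evaluated.
 */
const struct demo_blk *demo_first_above_leaf(const struct demo_path *p)
{
    assert(p->active >= 1 && p->active <= DEMO_DEPTH);

    for (int level = p->active - 2; level >= 0; level--) {
        /* The index is known to be valid before the address is taken. */
        const struct demo_blk *blk = &p->blk[level];
        if (blk->blkno != 0)
            return blk;
    }
    return NULL;
}
```

Initializing the pointer in the loop header instead (for example `blk = &p->blk[level]` before the `level >= 0` test) would reproduce the situation UBSAN complains about.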
Christoph Hellwig 4ab45e259f xfs: ratelimit xfs_discard_page messages
Use printk_ratelimit() to limit the number of messages printed from
xfs_discard_page.  Without that, a failing device causes a large
number of errors that don't really help in debugging the underlying
issue.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-03-02 20:55:51 -08:00
Christoph Hellwig 13b1f811b1 xfs: ratelimit xfs_buf_ioerror_alert messages
Use printk_ratelimit() to limit the number of messages printed from
xfs_buf_ioerror_alert.  Without that, a failing device causes a large
number of errors that don't really help in debugging the underlying
issue.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-03-02 20:55:51 -08:00
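
Note on 4ab45e259f and 13b1f811b1: both commits gate a message behind printk_ratelimit(). As a rough user-space illustration of the idea (not the kernel implementation), the sketch below lets a fixed burst of messages through per time window and counts the rest as suppressed. The structure, function name, and tick-based clock are assumptions made for the example.

```c
#include <assert.h>

/* Hypothetical rate limiter: at most `burst` messages per `interval` ticks. */
struct demo_ratelimit {
    long interval;   /* window length, in caller-defined ticks */
    int  burst;      /* messages allowed per window */
    long begin;      /* tick at which the current window started */
    int  printed;    /* messages let through in the current window */
    int  missed;     /* messages suppressed so far */
};

/* Return 1 if the caller may print now, 0 if the message should be dropped. */
int demo_ratelimit_allow(struct demo_ratelimit *rl, long now)
{
    assert(rl->interval > 0 && rl->burst > 0);

    /* Start a fresh window once the previous one has expired. */
    if (now - rl->begin >= rl->interval) {
        rl->begin = now;
        rl->printed = 0;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return 1;
    }
    rl->missed++;
    return 0;
}
```

A caller would wrap each log statement as `if (demo_ratelimit_allow(&rl, now)) print(...)`, so a failing device that produces thousands of errors only emits a handful per window.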
Christoph Hellwig ba8adad5d0 xfs: remove the kuid/kgid conversion wrappers
Remove the XFS wrappers for converting from and to the kuid/kgid types.
Mostly this means switching to VFS i_{u,g}id_{read,write} helpers, but
in a few spots the calls to the conversion functions are open coded.
To match the use of sb->s_user_ns in the helpers and other file systems,
sb->s_user_ns is also used in the quota code.  The ACL code already does
the conversion in a grotty layering violation in the VFS xattr code,
so it keeps using init_user_ns for the identity mapping.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-03-02 20:55:50 -08:00
Christoph Hellwig 542951592c xfs: remove the icdinode di_uid/di_gid members
Use the Linux inode i_uid/i_gid members everywhere and just convert
from/to the scalar value when reading or writing the on-disk inode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-03-02 20:55:50 -08:00
Christoph Hellwig 3d8f282150 xfs: ensure that the inode uid/gid match values match the icdinode ones
Instead of only synchronizing the uid/gid values in xfs_setup_inode,
ensure that they always match to prepare for removing the icdinode
fields.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-03-02 20:55:50 -08:00
Darrick J. Wong 93baa55af1 xfs: improve error message when we can't allocate memory for xfs_buf
If xfs_buf_get_map can't allocate enough memory for the buffer it's
trying to create, it'll cough up an error about not being able to
allocate "pagesn".  That's not particularly helpful (and if we're really
out of memory the message is very spammy) so change the message to tell
us how many pages were actually requested, and ratelimit it too.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-03-02 20:55:50 -08:00
Zheng Bin d0c7feaf87 xfs: add agf freeblocks verify in xfs_agf_verify
We recently used a fuzzer (hydra) to test XFS and automatically generated
tmp.img (XFS v5 format, but with some incorrect metadata).

xfs_repair output (just one AG):
agf_freeblks 0, counted 3224 in ag 0
agf_longest 536874136, counted 3224 in ag 0
sb_fdblocks 613, counted 3228

Test as follows:
mount tmp.img tmpdir
cp file1M tmpdir
sync

In 4.19-stable, sync gets stuck. The reason is:
xfs_mountfs
  xfs_check_summary_counts
    if ((!xfs_sb_version_haslazysbcount(&mp->m_sb) ||
       XFS_LAST_UNMOUNT_WAS_CLEAN(mp)) &&
       !xfs_fs_has_sickness(mp, XFS_SICK_FS_COUNTERS))
	return 0;  --> just returns, incore sb_fdblocks is still 613
    xfs_initialize_perag_data

cp file1M tmpdir -->ok(write file to pagecache)
sync -->stuck(write pagecache to disk)
xfs_map_blocks
  xfs_iomap_write_allocate
    while (count_fsb != 0) {
      nimaps = 0;
      while (nimaps == 0) { --> endless loop
         nimaps = 1;
         xfs_bmapi_write(..., &nimaps) --> nimaps becomes 0 again
xfs_bmapi_write
  xfs_bmap_alloc
    xfs_bmap_btalloc
      xfs_alloc_vextent
        xfs_alloc_fix_freelist
          xfs_alloc_space_available -->fail(agf_freeblks is 0)

In linux-next, sync does not get stuck, because commit c2b3164320 ("xfs:
use the latest extent at writeback delalloc conversion time") removed
the above while loop. dmesg shows:
[   55.250114] XFS (loop0): page discard on page ffffea0008bc7380, inode 0x1b0c, offset 0.

Users do not know why this page was discarded. Better solutions would be:
1. Like xfs_repair, make sure sb_fdblocks is equal to the counted value
(xfs_initialize_perag_data does this, but it is not called at this mount)
2. Add an AGF verifier check; if it fails, tell users to run repair

This patch uses the second solution.

Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Ren Xudong <renxudong1@huawei.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-03-02 20:55:50 -08:00
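
Note on d0c7feaf87: the listing does not include the patch, so the exact checks it adds are not shown here. The sketch below is a hypothetical consistency check in the same spirit, using made-up structure and function names: the free block count cannot exceed the AG length, and the longest free extent cannot exceed the free block count. The fuzzed values quoted above (agf_freeblks 0, agf_longest 536874136) fail the second test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, already byte-swapped view of the AGF counters. */
struct demo_agf {
    uint32_t length;    /* size of the allocation group, in blocks */
    uint32_t freeblks;  /* total free blocks in the AG */
    uint32_t longest;   /* length of the longest free extent */
};

/* Return 1 if the counters are mutually consistent, 0 otherwise. */
int demo_agf_counters_sane(const struct demo_agf *agf)
{
    assert(agf != NULL);

    if (agf->freeblks > agf->length)
        return 0;   /* more free blocks than the AG contains */
    if (agf->longest > agf->freeblks)
        return 0;   /* one extent longer than all free space combined */
    return 1;
}
```

Rejecting such a header at read time turns a silent hang or an unexplained page discard into an explicit corruption report that points the user at repair.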
Brian Foster 6b789c337a xfs: fix iclog release error check race with shutdown
Prior to commit df732b29c8 ("xfs: call xlog_state_release_iclog with
l_icloglock held"), xlog_state_release_iclog() always performed a
locked check of the iclog error state before proceeding into the
sync state processing code. As of this commit, part of
xlog_state_release_iclog() was open-coded into
xfs_log_release_iclog() and as a result the locked error state check
was lost.

The lockless check still exists, but this doesn't account for the
possibility of a race with a shutdown being performed by another
task causing the iclog state to change while the original task waits
on ->l_icloglock. This has reproduced very rarely via generic/475
and manifests as an assert failure in __xlog_state_release_iclog()
due to an unexpected iclog state.

Restore the locked error state check in xlog_state_release_iclog()
to ensure that an iclog state update via shutdown doesn't race with
the iclog release state processing code.

Fixes: df732b29c8 ("xfs: call xlog_state_release_iclog with l_icloglock held")
Reported-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-03-02 20:55:50 -08:00
Linus Torvalds 98d54f81e3 Linux 5.6-rc4 2020-03-01 16:38:46 -06:00
Linus Torvalds e70869821a Two more bug fixes (including a regression) for 5.6
Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 fixes from Ted Ts'o:
 "Two more bug fixes (including a regression) for 5.6"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: potential crash on allocation error in ext4_alloc_flex_bg_array()
  jbd2: fix data races at struct journal_head
2020-03-01 16:35:08 -06:00
Linus Torvalds f853ed90e2 More bugfixes, including a few remaining "make W=1" issues such
as too large frame sizes on some configurations.  On the
 ARM side, the compiler was messing up shadow stacks between
 EL1 and EL2 code, which is easily fixed with __always_inline.

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "More bugfixes, including a few remaining "make W=1" issues such as too
  large frame sizes on some configurations.

  On the ARM side, the compiler was messing up shadow stacks between EL1
  and EL2 code, which is easily fixed with __always_inline"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: VMX: check descriptor table exits on instruction emulation
  kvm: x86: Limit the number of "kvm: disabled by bios" messages
  KVM: x86: avoid useless copy of cpufreq policy
  KVM: allow disabling -Werror
  KVM: x86: allow compiling as non-module with W=1
  KVM: Pre-allocate 1 cpumask variable per cpu for both pv tlb and pv ipis
  KVM: Introduce pv check helpers
  KVM: let declaration of kvm_get_running_vcpus match implementation
  KVM: SVM: allocate AVIC data structures based on kvm_amd module parameter
  arm64: Ask the compiler to __always_inline functions used by KVM at HYP
  KVM: arm64: Define our own swab32() to avoid a uapi static inline
  KVM: arm64: Ask the compiler to __always_inline functions used at HYP
  kvm: arm/arm64: Fold VHE entry/exit work into kvm_vcpu_run_vhe()
  KVM: arm/arm64: Fix up includes for trace.h
2020-03-01 15:16:35 -06:00
Oliver Upton 86f7e90ce8 KVM: VMX: check descriptor table exits on instruction emulation
KVM emulates UMIP on hardware that doesn't support it by setting the
'descriptor table exiting' VM-execution control and performing
instruction emulation. When running nested, this emulation is broken as
KVM refuses to emulate L2 instructions by default.

Correct this regression by allowing the emulation of descriptor table
instructions if L1 hasn't requested 'descriptor table exiting'.

Fixes: 07721feee4 ("KVM: nVMX: Don't emulate instructions in guest mode")
Reported-by: Jan Kiszka <jan.kiszka@web.de>
Cc: stable@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-01 19:26:31 +01:00
Linus Torvalds fb279f4e23 Merge branch 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "I2C has three driver bugfixes for you. We agreed on the Mac regression
  to go in via I2C"

* 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  macintosh: therm_windtunnel: fix regression when instantiating devices
  i2c: altera: Fix potential integer overflow
  i2c: jz4780: silence log flood on txabrt
2020-02-29 19:16:46 -06:00
Dan Carpenter 37b0b6b8b9 ext4: potential crash on allocation error in ext4_alloc_flex_bg_array()
If sbi->s_flex_groups_allocated is zero and the first allocation fails
then this code will crash.  The problem is that "i--" will set "i" to
-1 but when we compare "i >= sbi->s_flex_groups_allocated" then the -1
is type promoted to unsigned and becomes UINT_MAX.  Since UINT_MAX
is more than zero, the condition is true so we call kvfree(new_groups[-1]).
The loop will carry on freeing invalid memory until it crashes.

Fixes: 7c990728b9 ("ext4: fix potential race between s_flex_groups online resizing and access")
Reviewed-by: Suraj Jitindar Singh <surajjs@amazon.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20200228092142.7irbc44yaz3by7nb@kili.mountain
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-02-29 17:48:08 -05:00
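
Note on 37b0b6b8b9: the crash comes from comparing a signed index with an unsigned count. The sketch below reproduces only that comparison in isolation, with hypothetical function names; it counts how many slots a cleanup loop of this shape would visit instead of freeing anything. The explicit `(unsigned)` cast spells out the conversion that C performs implicitly when an `int` is compared with an `unsigned`. The second function shows one possible guard and is not claimed to be the fix that was merged.

```c
#include <assert.h>

/*
 * Buggy shape: after "i--" the index can be -1, which converts to UINT_MAX
 * and therefore still satisfies ">= allocated". `limit` caps the demo loop
 * so it terminates; keep it small.
 */
unsigned demo_buggy_cleanup_visits(int i, unsigned allocated, unsigned limit)
{
    unsigned visits = 0;

    assert(i >= 0);
    for (i--; (unsigned)i >= allocated && visits < limit; i--)
        visits++;           /* stands in for kvfree(new_groups[i]) */
    return visits;
}

/* Guarded shape: stop as soon as the signed index drops below zero. */
unsigned demo_guarded_cleanup_visits(int i, unsigned allocated, unsigned limit)
{
    unsigned visits = 0;

    assert(i >= 0);
    for (i--; i >= 0 && (unsigned)i >= allocated && visits < limit; i--)
        visits++;
    return visits;
}
```

With `i == 0` and `allocated == 0`, the case from the commit message, the buggy loop runs until the cap is hit while the guarded loop does not run at all.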
Wolfram Sang 38b17afb0e macintosh: therm_windtunnel: fix regression when instantiating devices
Removing attach_adapter from this driver caused a regression for at
least some machines. Those machines had the sensors described in their
DT, too, so they didn't need manual creation of the sensor devices. The
old code worked, though, because manual creation came first. Creation of
DT devices then failed later and caused error logs, but the sensors
worked nonetheless because of the manually created devices.

When removing attach_adapter, manual creation now comes later and loses
the race. The sensor devices were already registered via DT, yet with
another binding, so the driver could not be bound to it.

This fix refactors the code to remove the race and only manually creates
devices if there are no DT nodes present. Also, the DT binding is updated
to match both the DT and the manually created devices. Because we don't
know which device creation will be used at runtime, the code to start
the kthread is moved to do_probe() which will be called by both methods.

Fixes: 3e7bed5271 ("macintosh: therm_windtunnel: drop using attach_adapter")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201723
Reported-by: Erhard Furtner <erhard_f@mailbox.org>
Tested-by: Erhard Furtner <erhard_f@mailbox.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org # v4.19+
2020-02-29 21:13:22 +01:00
Qian Cai 6c5d911249 jbd2: fix data races at struct journal_head
journal_head::b_transaction and journal_head::b_next_transaction could
be accessed concurrently, as noticed by KCSAN:

 LTP: starting fsync04
 /dev/zero: Can't open blockdev
 EXT4-fs (loop0): mounting ext3 file system using the ext4 subsystem
 EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
 ==================================================================
 BUG: KCSAN: data-race in __jbd2_journal_refile_buffer [jbd2] / jbd2_write_access_granted [jbd2]

 write to 0xffff99f9b1bd0e30 of 8 bytes by task 25721 on cpu 70:
  __jbd2_journal_refile_buffer+0xdd/0x210 [jbd2]
  __jbd2_journal_refile_buffer at fs/jbd2/transaction.c:2569
  jbd2_journal_commit_transaction+0x2d15/0x3f20 [jbd2]
  (inlined by) jbd2_journal_commit_transaction at fs/jbd2/commit.c:1034
  kjournald2+0x13b/0x450 [jbd2]
  kthread+0x1cd/0x1f0
  ret_from_fork+0x27/0x50

 read to 0xffff99f9b1bd0e30 of 8 bytes by task 25724 on cpu 68:
  jbd2_write_access_granted+0x1b2/0x250 [jbd2]
  jbd2_write_access_granted at fs/jbd2/transaction.c:1155
  jbd2_journal_get_write_access+0x2c/0x60 [jbd2]
  __ext4_journal_get_write_access+0x50/0x90 [ext4]
  ext4_mb_mark_diskspace_used+0x158/0x620 [ext4]
  ext4_mb_new_blocks+0x54f/0xca0 [ext4]
  ext4_ind_map_blocks+0xc79/0x1b40 [ext4]
  ext4_map_blocks+0x3b4/0x950 [ext4]
  _ext4_get_block+0xfc/0x270 [ext4]
  ext4_get_block+0x3b/0x50 [ext4]
  __block_write_begin_int+0x22e/0xae0
  __block_write_begin+0x39/0x50
  ext4_write_begin+0x388/0xb50 [ext4]
  generic_perform_write+0x15d/0x290
  ext4_buffered_write_iter+0x11f/0x210 [ext4]
  ext4_file_write_iter+0xce/0x9e0 [ext4]
  new_sync_write+0x29c/0x3b0
  __vfs_write+0x92/0xa0
  vfs_write+0x103/0x260
  ksys_write+0x9d/0x130
  __x64_sys_write+0x4c/0x60
  do_syscall_64+0x91/0xb05
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

 5 locks held by fsync04/25724:
  #0: ffff99f9911093f8 (sb_writers#13){.+.+}, at: vfs_write+0x21c/0x260
  #1: ffff99f9db4c0348 (&sb->s_type->i_mutex_key#15){+.+.}, at: ext4_buffered_write_iter+0x65/0x210 [ext4]
  #2: ffff99f5e7dfcf58 (jbd2_handle){++++}, at: start_this_handle+0x1c1/0x9d0 [jbd2]
  #3: ffff99f9db4c0168 (&ei->i_data_sem){++++}, at: ext4_map_blocks+0x176/0x950 [ext4]
  #4: ffffffff99086b40 (rcu_read_lock){....}, at: jbd2_write_access_granted+0x4e/0x250 [jbd2]
 irq event stamp: 1407125
 hardirqs last  enabled at (1407125): [<ffffffff980da9b7>] __find_get_block+0x107/0x790
 hardirqs last disabled at (1407124): [<ffffffff980da8f9>] __find_get_block+0x49/0x790
 softirqs last  enabled at (1405528): [<ffffffff98a0034c>] __do_softirq+0x34c/0x57c
 softirqs last disabled at (1405521): [<ffffffff97cc67a2>] irq_exit+0xa2/0xc0

 Reported by Kernel Concurrency Sanitizer on:
 CPU: 68 PID: 25724 Comm: fsync04 Tainted: G L 5.6.0-rc2-next-20200221+ #7
 Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019

The plain reads are outside of the jh->b_state_lock critical section, which
results in data races. Fix them by adding pairs of READ|WRITE_ONCE().

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Qian Cai <cai@lca.pw>
Link: https://lore.kernel.org/r/20200222043111.2227-1-cai@lca.pw
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-02-29 13:40:02 -05:00
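
Note on 6c5d911249: READ_ONCE() and WRITE_ONCE() are kernel macros whose real definitions are more involved than what follows. As a rough user-space analog only, the sketch below performs each access through a volatile-qualified pointer so the compiler emits exactly one load or store and does not tear, fuse, or re-read it. The structure and accessor names are hypothetical, and portable ISO C code that needs defined behavior under concurrency would use `<stdatomic.h>` instead.

```c
#include <assert.h>
#include <stddef.h>

struct demo_txn { int id; };

/* Hypothetical stand-in for the field that is read outside the lock. */
struct demo_jhead {
    struct demo_txn *b_transaction;
};

/* Single, non-cached load of the pointer (analog of READ_ONCE). */
struct demo_txn *demo_jh_read_transaction(struct demo_jhead *jh)
{
    assert(jh != NULL);
    return *(struct demo_txn * volatile *)&jh->b_transaction;
}

/* Single store of the pointer (analog of WRITE_ONCE). */
void demo_jh_write_transaction(struct demo_jhead *jh, struct demo_txn *t)
{
    assert(jh != NULL);
    *(struct demo_txn * volatile *)&jh->b_transaction = t;
}
```

The accessors only make sense in pairs: every lockless reader and every writer of the field goes through them, which matches the commit's wording about adding pairs of READ|WRITE_ONCE().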
Linus Torvalds 7557c1b3f7 SCSI fixes on 20200229
Four small fixes.  Three are in drivers for fairly obvious bugs.  The
 fourth is a set of regressions introduced by the compat_ioctl changes
 because some of the compat updates wrongly replaced .ioctl instead of
 .compat_ioctl.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Four small fixes.

  Three are in drivers for fairly obvious bugs. The fourth is a set of
  regressions introduced by the compat_ioctl changes because some of the
  compat updates wrongly replaced .ioctl instead of .compat_ioctl"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: compat_ioctl: cdrom: Replace .ioctl with .compat_ioctl in four appropriate places
  scsi: zfcp: fix wrong data and display format of SFP+ temperature
  scsi: sd_sbc: Fix sd_zbc_report_zones()
  scsi: libfc: free response frame from GPN_ID
2020-02-29 09:58:47 -06:00
Linus Torvalds 29795de0d2 pci-v5.6-fixes-2
Merge tag 'pci-v5.6-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:

 - Fix build issue on 32-bit ARM with old compilers (Marek Szyprowski)

 - Update MAINTAINERS for recent Cadence driver file move (Lukas
   Bulwahn)

* tag 'pci-v5.6-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  MAINTAINERS: Correct Cadence PCI driver path
  PCI: brcmstb: Fix build on 32bit ARM platforms with older compilers
2020-02-28 11:51:53 -08:00
Linus Torvalds 2edc78b9a4 block-5.6-2020-02-28
Merge tag 'block-5.6-2020-02-28' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - Passthrough insertion fix (Ming)

 - Kill off some unused arguments (John)

 - blktrace RCU fix (Jan)

 - Dead fields removal for null_blk (Dongli)

 - NVMe polled IO fix (Bijan)

* tag 'block-5.6-2020-02-28' of git://git.kernel.dk/linux-block:
  nvme-pci: Hold cq_poll_lock while completing CQEs
  blk-mq: Remove some unused function arguments
  null_blk: remove unused fields in 'nullb_cmd'
  blktrace: Protect q->blk_trace with RCU
  blk-mq: insert passthrough request into hctx->dispatch directly
2020-02-28 11:43:30 -08:00
Linus Torvalds 74dea5d99d io_uring-5.6-2020-02-28
Merge tag 'io_uring-5.6-2020-02-28' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:

 - Fix for a race with IOPOLL used with SQPOLL (Xiaoguang)

 - Only show ->fdinfo if procfs is enabled (Tobias)

 - Fix for a chain with multiple personalities in the SQEs

 - Fix for a missing free of personality idr on exit

 - Removal of the spin-for-work optimization

 - Fix for next work lookup on request completion

 - Fix for non-vec read/write result propagation in case of links

 - Fix for fileset references on switch

 - Fix for recvmsg/sendmsg 32-bit compatibility mode

* tag 'io_uring-5.6-2020-02-28' of git://git.kernel.dk/linux-block:
  io_uring: fix 32-bit compatability with sendmsg/recvmsg
  io_uring: define and set show_fdinfo only if procfs is enabled
  io_uring: drop file set ref put/get on switch
  io_uring: import_single_range() returns 0/-ERROR
  io_uring: pick up link work on submit reference drop
  io-wq: ensure work->task_pid is cleared on init
  io-wq: remove spin-for-work optimization
  io_uring: fix poll_list race for SETUP_IOPOLL|SETUP_SQPOLL
  io_uring: fix personality idr leak
  io_uring: handle multiple personalities in link chains
2020-02-28 11:39:14 -08:00
Jens Axboe 5b8ea58b6a Merge branch 'nvme-5.6-rc4' of git://git.infradead.org/nvme into block-5.6
Pull NVMe fix from Keith.

* 'nvme-5.6-rc4' of git://git.infradead.org/nvme:
  nvme-pci: Hold cq_poll_lock while completing CQEs
2020-02-28 10:02:36 -07:00
Linus Torvalds c60c040213 ACPI fixes for 5.6-rc4
Fix a couple of configuration issues in the ACPI watchdog (WDAT)
 driver (Mika Westerberg) and make it possible to disable that
 driver at boot time in case it still does not work as
 expected (Jean Delvare).

Merge tag 'acpi-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "Fix a couple of configuration issues in the ACPI watchdog (WDAT)
  driver (Mika Westerberg) and make it possible to disable that driver
  at boot time in case it still does not work as expected (Jean
  Delvare)"

* tag 'acpi-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: watchdog: Set default timeout in probe
  ACPI: watchdog: Fix gas->access_width usage
  ACPICA: Introduce ACPI_ACCESS_BYTE_WIDTH() macro
  ACPI: watchdog: Allow disabling WDAT at boot
2020-02-28 09:02:18 -08:00
Linus Torvalds 3642859812 Power management fixes for 5.6-rc4
Fix a recent cpufreq initialization regression (Rafael Wysocki),
 revert a devfreq commit that made incompatible changes and broke
 user land on some systems (Orson Zhai), drop a stale reference to
 a document that has gone away recently (Jonathan Neuschäfer) and
 fix a typo in a hibernation code comment (Alexandre Belloni).

Merge tag 'pm-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "Fix a recent cpufreq initialization regression (Rafael Wysocki),
  revert a devfreq commit that made incompatible changes and broke user
  land on some systems (Orson Zhai), drop a stale reference to a
  document that has gone away recently (Jonathan Neuschäfer), and fix a
  typo in a hibernation code comment (Alexandre Belloni)"

* tag 'pm-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: Fix policy initialization for internal governor drivers
  Revert "PM / devfreq: Modify the device name as devfreq(X) for sysfs"
  PM / hibernate: fix typo "reserverd_size" -> "reserved_size"
  Documentation: power: Drop reference to interface.rst
2020-02-28 08:49:52 -08:00
Linus Torvalds bfeb4f9977 zonefs fixes for 5.6-rc4
Two fixes in this pull request:
 * Revert the initial decision to silently ignore IOCB_NOWAIT for
   asynchronous direct IOs to sequential zone files. Instead, return an
   error to the user to signal that the feature is not supported (from
   Christoph)
 * A fix to zonefs Kconfig to select FS_IOMAP to avoid build failures if
   no other file system already selected this option (from Johannes).
 
 Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>

Merge tag 'zonefs-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs

Pull zonefs fixes from Damien Le Moal:
 "Two fixes in here:

   - Revert the initial decision to silently ignore IOCB_NOWAIT for
     asynchronous direct IOs to sequential zone files. Instead, return
     an error to the user to signal that the feature is not supported
     (from Christoph)

   - A fix to zonefs Kconfig to select FS_IOMAP to avoid build failures
     if no other file system already selected this option (from
     Johannes)"

* tag 'zonefs-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
  zonefs: select FS_IOMAP
  zonefs: fix IOCB_NOWAIT handling
2020-02-28 08:34:47 -08:00
Paolo Bonzini e951445f4d KVM/arm fixes for 5.6, take #1
- Fix compilation on 32bit
 - Move VHE guest entry/exit into the VHE-specific entry code
 - Make sure all functions called by the non-VHE HYP code are tagged as __always_inline

Merge tag 'kvmarm-fixes-5.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm fixes for 5.6, take #1

- Fix compilation on 32bit
- Move VHE guest entry/exit into the VHE-specific entry code
- Make sure all functions called by the non-VHE HYP code are tagged as __always_inline
2020-02-28 11:50:06 +01:00
Erwan Velu ef935c25fd kvm: x86: Limit the number of "kvm: disabled by bios" messages
In older versions of systemd (219), udevadm is called at boot time with:
	/usr/bin/udevadm trigger --type=devices --action=add

This program writes "add" to /sys/devices/system/cpu/cpu<x>/uevent,
leading to the "kvm: disabled by bios" message if the BIOS has disabled
the virtualization extensions.

On a modern system running up to 256 CPU threads, this pollutes the kernel logs.

This patch ratelimits the message so that a userspace program triggering
this uevent cannot print it too often.

This patch is only a workaround, but it greatly reduces the pollution without
breaking the current behavior of printing a message if someone tries to instantiate
KVM on a system that doesn't support it.

Note that recent versions of systemd (>239) do not trigger this behavior.

This patch will be useful at least for those using an older systemd with recent kernels.
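
The ratelimiting idea can be sketched in user space as an interval/burst
counter. This is only an illustration, not the kernel's implementation:
struct ratelimit, ratelimit_allow() and the numbers used in the note below
are hypothetical names and values.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-in for an interval/burst rate limiter. */
struct ratelimit {
	long interval;	/* window length, in caller-defined ticks */
	int burst;	/* messages allowed per window */
	long begin;	/* start of the current window */
	int printed;	/* messages emitted in the current window */
};

/* Return true if a message may be emitted at time `now`. */
static bool ratelimit_allow(struct ratelimit *rl, long now)
{
	/* Open a new window once the current one has expired. */
	if (now - rl->begin >= rl->interval) {
		rl->begin = now;
		rl->printed = 0;
	}
	/* Inside a window, only the first `burst` messages get through. */
	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}
	return false;
}
```

With a burst of 10 per 5-tick window, 256 back-to-back calls at the same
tick let only 10 messages through; the next message is allowed once the
window has expired.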

Signed-off-by: Erwan Velu <e.velu@criteo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28 11:37:20 +01:00
Rafael J. Wysocki 189c6967fe Merge branches 'pm-sleep' and 'pm-devfreq'
* pm-sleep:
  PM / hibernate: fix typo "reserverd_size" -> "reserved_size"
  Documentation: power: Drop reference to interface.rst

* pm-devfreq:
  Revert "PM / devfreq: Modify the device name as devfreq(X) for sysfs"
2020-02-28 11:00:50 +01:00
Paolo Bonzini aaec7c03de KVM: x86: avoid useless copy of cpufreq policy
struct cpufreq_policy is quite big and it is not a good idea
to allocate one on the stack.  Just use cpufreq_cpu_get and
cpufreq_cpu_put which is even simpler.

Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28 10:54:50 +01:00
Paolo Bonzini 4f337faf1c KVM: allow disabling -Werror
Restrict -Werror to well-tested configurations and allow disabling it
via Kconfig.

Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28 10:45:28 +01:00
Valdis Klētnieks 575b255c16 KVM: x86: allow compiling as non-module with W=1
Compile error with CONFIG_KVM_INTEL=y and W=1:

  CC      arch/x86/kvm/vmx/vmx.o
arch/x86/kvm/vmx/vmx.c:68:32: error: 'vmx_cpu_id' defined but not used [-Werror=unused-const-variable=]
   68 | static const struct x86_cpu_id vmx_cpu_id[] = {
      |                                ^~~~~~~~~~
cc1: all warnings being treated as errors

When building with =y, the MODULE_DEVICE_TABLE macro doesn't generate a
reference to the structure (or any code at all).  This makes W=1 compiles
unhappy.

Wrap both in a #ifdef to avoid the issue.

Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
[Do the same for CONFIG_KVM_AMD. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28 10:35:37 +01:00
Wanpeng Li 8a9442f49c KVM: Pre-allocate 1 cpumask variable per cpu for both pv tlb and pv ipis
Nick Desaulniers reported:

  When building with:
  $ make CC=clang arch/x86/ CFLAGS=-Wframe-larger-than=1000
  The following warning is observed:
  arch/x86/kernel/kvm.c:494:13: warning: stack frame size of 1064 bytes in
  function 'kvm_send_ipi_mask_allbutself' [-Wframe-larger-than=]
  static void kvm_send_ipi_mask_allbutself(const struct cpumask *mask, int
  vector)
              ^
  Debugging with:
  https://github.com/ClangBuiltLinux/frame-larger-than
  via:
  $ python3 frame_larger_than.py arch/x86/kernel/kvm.o \
    kvm_send_ipi_mask_allbutself
  points to the stack allocated `struct cpumask newmask` in
  `kvm_send_ipi_mask_allbutself`. The size of a `struct cpumask` is
  potentially large, as it's CONFIG_NR_CPUS divided by BITS_PER_LONG for
  the target architecture. CONFIG_NR_CPUS for X86_64 can be as high as
  8192, making a single instance of a `struct cpumask` 1024 B.

This patch fixes it by pre-allocating one cpumask variable per CPU and using
it for both pv tlb and pv ipis.
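
The 1024 B figure in the report can be checked with a small worked
calculation. This is a sketch under the assumption that a cpumask is a
bitmap of one bit per possible CPU, stored in an array of longs;
cpumask_bytes() and SKETCH_BITS_PER_LONG are hypothetical names, not
kernel symbols.

```c
#include <limits.h>
#include <stddef.h>

/* Bits in one long on the build machine (stand-in for BITS_PER_LONG). */
#define SKETCH_BITS_PER_LONG (sizeof(long) * CHAR_BIT)

/* Size in bytes of a bitmap holding one bit per CPU, rounded up to longs. */
static size_t cpumask_bytes(size_t nr_cpus)
{
	size_t longs = (nr_cpus + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG;

	return longs * sizeof(long);
}
```

For nr_cpus = 8192 this gives 1024 bytes, which on its own is already close
to the 1064-byte stack frame reported by the compiler.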

Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28 10:34:25 +01:00
Wanpeng Li a262bca3ab KVM: Introduce pv check helpers
Introduce some pv check helpers for consistency.

Suggested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28 10:34:19 +01:00
Christian Borntraeger fcd07f9adc KVM: let declaration of kvm_get_running_vcpus match implementation
Sparse notices that declaration and implementation do not match:
arch/s390/kvm/../../../virt/kvm/kvm_main.c:4435:17: warning: incorrect type in return expression (different address spaces)
arch/s390/kvm/../../../virt/kvm/kvm_main.c:4435:17:    expected struct kvm_vcpu [noderef] <asn:3> **
arch/s390/kvm/../../../virt/kvm/kvm_main.c:4435:17:    got struct kvm_vcpu *[noderef] <asn:3> *

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28 10:33:57 +01:00
Paolo Bonzini 7943f4acea KVM: SVM: allocate AVIC data structures based on kvm_amd module parameter
Even if APICv is disabled at startup, the backing page and ir_list need
to be initialized in case they are needed later.  The only case in
which this can be skipped is for userspace irqchip, and that must be
done because avic_init_backing_page dereferences vcpu->arch.apic
(which is NULL for userspace irqchip).

Tested-by: rmuncrief@humanavance.com
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=206579
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28 10:33:17 +01:00
Linus Torvalds 45d0b75b98 drm fixes for 5.6.0-rc4
amdgpu:
 - Drop DRIVER_USE_AGP
 - Fix memory leak in GPU reset
 - Resume fix for raven
 
 radeon:
 - Drop DRIVER_USE_AGP
 
 i915:
 - downgrade gen7 back to aliasing-ppgtt to avoid GPU hangs
 - shrinker fix
 - pmu leak and double free fixes
 - gvt use after free and virtual display reset fixes
 - randconfig build fix

Merge tag 'drm-fixes-2020-02-28' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Just some fixes for this week: amdgpu, radeon and i915.

  The main i915 one fixes a regression on Gen7 (Ivybridge/Haswell): it
  moves them back from the full-ppgtt support to the aliasing version
  they used to use, because of GPU hangs. Otherwise it's pretty quiet.

  amdgpu:
   - Drop DRIVER_USE_AGP
   - Fix memory leak in GPU reset
   - Resume fix for raven

  radeon:
   - Drop DRIVER_USE_AGP

  i915:
   - downgrade gen7 back to aliasing-ppgtt to avoid GPU hangs
   - shrinker fix
   - pmu leak and double free fixes
   - gvt use after free and virtual display reset fixes
   - randconfig build fix"

* tag 'drm-fixes-2020-02-28' of git://anongit.freedesktop.org/drm/drm:
  drm/radeon: Inline drm_get_pci_dev
  drm/amdgpu: Drop DRIVER_USE_AGP
  drm/i915: Avoid recursing onto active vma from the shrinker
  drm/i915/pmu: Avoid using globals for PMU events
  drm/i915/pmu: Avoid using globals for CPU hotplug state
  drm/i915/gtt: Downgrade gen7 (ivb, byt, hsw) back to aliasing-ppgtt
  drm/i915: fix header test with GCOV
  amdgpu/gmc_v9: save/restore sdpif regs during S3
  drm/amdgpu: fix memory leak during TDR test(v2)
  drm/i915/gvt: Fix orphan vgpu dmabuf_objs' lifetime
  drm/i915/gvt: Separate display reset from ALL_ENGINES reset
2020-02-27 21:52:18 -08:00
Dave Airlie f091bf3970 drm/i915 fixes for v5.6-rc4:
- downgrade gen7 back to aliasing-ppgtt to avoid GPU hangs
 - shrinker fix
 - pmu leak and double free fixes
 - gvt use after free and virtual display reset fixes
 - randconfig build fix

Merge tag 'drm-intel-fixes-2020-02-27' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

drm/i915 fixes for v5.6-rc4:
- downgrade gen7 back to aliasing-ppgtt to avoid GPU hangs
- shrinker fix
- pmu leak and double free fixes
- gvt use after free and virtual display reset fixes
- randconfig build fix

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/874kvcsh00.fsf@intel.com
2020-02-28 12:40:49 +10:00
Dave Airlie e180af1970 Merge tag 'amd-drm-fixes-5.6-2020-02-26' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
amd-drm-fixes-5.6-2020-02-26:

amdgpu:
- Drop DRIVER_USE_AGP
- Fix memory leak in GPU reset
- Resume fix for raven

radeon:
- Drop DRIVER_USE_AGP

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200227034106.3912-1-alexander.deucher@amd.com
2020-02-28 12:30:20 +10:00
Linus Torvalds 7058b83789 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Fix leak in nl80211 AP start where we leak the ACL memory, from
    Johannes Berg.

 2) Fix double mutex unlock in mac80211, from Andrei Otcheretianski.

 3) Fix RCU stall in ipset, from Jozsef Kadlecsik.

 4) Fix devlink locking in devlink_dpipe_table_register, from Madhuparna
    Bhowmik.

 5) Fix race causing TX hang in ll_temac, from Esben Haabendal.

 6) Stale eth hdr pointer in br_dev_xmit(), from Nikolay Aleksandrov.

 7) Fix TX hash calculation bounds checking wrt. tc rules, from Amritha
    Nambiar.

 8) Size netlink responses properly in schedule action code to take into
    consideration TCA_ACT_FLAGS. From Jiri Pirko.

 9) Fix firmware paths for mscc PHY driver, from Antoine Tenart.

10) Don't register stmmac notifier multiple times, from Aaro Koskinen.

11) Various rmnet bug fixes, from Taehee Yoo.

12) Fix vsock deadlock in vsock transport release, from Stefano
    Garzarella.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (61 commits)
  net: dsa: mv88e6xxx: Fix masking of egress port
  mlxsw: pci: Wait longer before accessing the device after reset
  sfc: fix timestamp reconstruction at 16-bit rollover points
  vsock: fix potential deadlock in transport->release()
  unix: It's CONFIG_PROC_FS not CONFIG_PROCFS
  net: rmnet: fix packet forwarding in rmnet bridge mode
  net: rmnet: fix bridge mode bugs
  net: rmnet: use upper/lower device infrastructure
  net: rmnet: do not allow to change mux id if mux id is duplicated
  net: rmnet: remove rcu_read_lock in rmnet_force_unassociate_device()
  net: rmnet: fix suspicious RCU usage
  net: rmnet: fix NULL pointer dereference in rmnet_changelink()
  net: rmnet: fix NULL pointer dereference in rmnet_newlink()
  net: phy: marvell: don't interpret PHY status unless resolved
  mlx5: register lag notifier for init network namespace only
  unix: define and set show_fdinfo only if procfs is enabled
  hinic: fix a bug of rss configuration
  hinic: fix a bug of setting hw_ioctxt
  hinic: fix a irq affinity bug
  net/smc: check for valid ib_client_data
  ...
2020-02-27 16:34:41 -08:00
Lukas Bulwahn 5901b51f3e MAINTAINERS: Correct Cadence PCI driver path
de80f95ccb ("PCI: cadence: Move all files to per-device cadence
directory") moved files of the PCI cadence drivers, but did not update the
MAINTAINERS entry.

Since then, ./scripts/get_maintainer.pl --self-test complains:

  warning: no file matches F: drivers/pci/controller/pcie-cadence*

Repair the MAINTAINERS entry.

Link: https://lore.kernel.org/r/20200221185402.4703-1-lukas.bulwahn@gmail.com
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2020-02-27 15:49:28 -06:00
Jens Axboe d876836204 io_uring: fix 32-bit compatibility with sendmsg/recvmsg
We must set MSG_CMSG_COMPAT if we're in compatibility mode, otherwise
the iovec import for these commands will not do the right thing and fail
the command with -EINVAL.

Found by running the test suite compiled as 32-bit.

Cc: stable@vger.kernel.org
Fixes: aa1fa28fc7 ("io_uring: add support for recvmsg()")
Fixes: 0fa03c624d ("io_uring: add support for sendmsg()")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-02-27 14:17:49 -07:00
Andrew Lunn 3ee339eb28 net: dsa: mv88e6xxx: Fix masking of egress port
Add missing ~ to the usage of the mask.
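
The class of bug can be shown with a generic read-modify-write of a
register field. This is a sketch only: SKETCH_EGRESS_MASK, the 4-bit field
position and both function names are hypothetical and are not taken from
the driver.

```c
#include <stdint.h>

/* Hypothetical 4-bit field at bits 7:4 of a 16-bit register. */
#define SKETCH_EGRESS_MASK	0x00f0u
#define SKETCH_EGRESS_SHIFT	4

/* Buggy form: without the ~, every bit outside the field is cleared. */
static uint16_t set_egress_port_buggy(uint16_t reg, uint16_t port)
{
	reg &= SKETCH_EGRESS_MASK;
	reg |= (port << SKETCH_EGRESS_SHIFT) & SKETCH_EGRESS_MASK;
	return reg;
}

/* Fixed form: ~mask clears only the field and keeps the other bits. */
static uint16_t set_egress_port(uint16_t reg, uint16_t port)
{
	reg &= ~SKETCH_EGRESS_MASK;
	reg |= (port << SKETCH_EGRESS_SHIFT) & SKETCH_EGRESS_MASK;
	return reg;
}
```

Starting from 0x1234 and writing port 5, the fixed form yields 0x1254,
while the buggy form yields 0x0070: the old field value leaks into the
result and all unrelated bits are lost.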

Reported-by: Kevin Benson <Kevin.Benson@zii.aero>
Reported-by: Chris Healy <Chris.Healy@zii.aero>
Fixes: 5c74c54ce6 ("net: dsa: mv88e6xxx: Split monitor port configuration")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27 12:29:09 -08:00
Amit Cohen ac004e8416 mlxsw: pci: Wait longer before accessing the device after reset
During initialization the driver issues a reset to the device and waits
for 100ms before checking if the firmware is ready. The waiting is
necessary because before that the device is unresponsive and the first
read can result in a completion timeout.

While 100ms is sufficient for Spectrum-1 and Spectrum-2, it is
insufficient for Spectrum-3.

Fix this by increasing the timeout to 200ms.

Fixes: da382875c6 ("mlxsw: spectrum: Extend to support Spectrum-3 ASIC")
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27 12:09:22 -08:00
Alex Maftei (amaftei) 23797b9890 sfc: fix timestamp reconstruction at 16-bit rollover points
We can't just use the top bits of the last sync event as they could be
off-by-one every 65,536 seconds, giving an error in reconstruction of
65,536 seconds.

This patch uses the difference in the bottom 16 bits (mod 2^16) to
calculate an offset that needs to be applied to the last sync event to
get to the current time.
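
A minimal sketch of the mod-2^16 offset idea, assuming the hardware
supplies only the low 16 bits of the seconds counter and that the true time
lies within half a rollover period of the last sync event. The function
names and the signed interpretation of the offset are assumptions made for
illustration, not the driver's code.

```c
#include <stdint.h>

/* Naive form: splice the new low 16 bits onto the old top bits. */
static int64_t reconstruct_naive(int64_t last_sync_sec, uint16_t low16)
{
	return (last_sync_sec & ~(int64_t)0xffff) | low16;
}

/* Offset form: apply the difference of the low 16 bits, taken mod 2^16. */
static int64_t reconstruct_seconds(int64_t last_sync_sec, uint16_t low16)
{
	int32_t delta = (int32_t)((low16 - (uint32_t)last_sync_sec) & 0xffff);

	/* Treat the upper half of the range as a negative offset. */
	if (delta >= 0x8000)
		delta -= 0x10000;
	return last_sync_sec + delta;
}
```

With the last sync event at 65535 seconds and a new low field of 1, the
naive form returns 1 (off by 65,536 seconds), while the offset form
returns 65537.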

Signed-off-by: Alexandru-Mihai Maftei <amaftei@solarflare.com>
Acked-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27 12:04:49 -08:00
Stefano Garzarella 3f74957fcb vsock: fix potential deadlock in transport->release()
Some transports (hyperv, virtio) acquire the sock lock during the
.release() callback.

In the vsock_stream_connect() we call vsock_assign_transport(); if
the socket was previously assigned to another transport, the
vsk->transport->release() is called, but the sock lock is already
held in the vsock_stream_connect(), causing a deadlock reported by
syzbot:

    INFO: task syz-executor280:9768 blocked for more than 143 seconds.
      Not tainted 5.6.0-rc1-syzkaller #0
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    syz-executor280 D27912  9768   9766 0x00000000
    Call Trace:
     context_switch kernel/sched/core.c:3386 [inline]
     __schedule+0x934/0x1f90 kernel/sched/core.c:4082
     schedule+0xdc/0x2b0 kernel/sched/core.c:4156
     __lock_sock+0x165/0x290 net/core/sock.c:2413
     lock_sock_nested+0xfe/0x120 net/core/sock.c:2938
     virtio_transport_release+0xc4/0xd60 net/vmw_vsock/virtio_transport_common.c:832
     vsock_assign_transport+0xf3/0x3b0 net/vmw_vsock/af_vsock.c:454
     vsock_stream_connect+0x2b3/0xc70 net/vmw_vsock/af_vsock.c:1288
     __sys_connect_file+0x161/0x1c0 net/socket.c:1857
     __sys_connect+0x174/0x1b0 net/socket.c:1874
     __do_sys_connect net/socket.c:1885 [inline]
     __se_sys_connect net/socket.c:1882 [inline]
     __x64_sys_connect+0x73/0xb0 net/socket.c:1882
     do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
     entry_SYSCALL_64_after_hwframe+0x49/0xbe

To avoid this issue, this patch removes the lock acquisition from the
.release() callback of the hyperv and virtio transports, and holds
the lock when we call vsk->transport->release() in the vsock core.

Reported-by: syzbot+731710996d79d0d58fbc@syzkaller.appspotmail.com
Fixes: 408624af4c ("vsock: use local transport when it is loaded")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27 12:03:56 -08:00
David S. Miller 5c05a164d4 unix: It's CONFIG_PROC_FS not CONFIG_PROCFS
Fixes: 3a12500ed5 ("unix: define and set show_fdinfo only if procfs is enabled")
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27 11:52:35 -08:00