Commit Graph

1045409 Commits

Author SHA1 Message Date
Linus Torvalds 8d1f01775f for-5.16/io_uring-2021-10-29
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmF8KHcQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgphvVEADHMsZP3fOGyJNqnIibIrDL5ZdUGtr5iH3c
 0UIi9It0jo9xOyPX/aY2n1pInXK4vvND9ULC+XGYttSJZXWuYEbMGYQ34du2EP0r
 dypN4JPwO6X+mFkJND6x8IeDCzj/fy6LCFbWbRlDNsndTZ/gavVTOybMpOLdCJx9
 IyXE1iHismaIaD7I3Q77zvN0ei87cEwBfg9R0vRAXKBKUh5raSiLWsOYOiXQkZH4
 8iUeDmOLlaWghgXwweODxARXuWq+gWZgiBMd0tp0QCECXMv+NIpfJYauvLHJDa/u
 QScr9uRMrJS3KgRgt61o+Z2fcpzJF/bL0e0s5Ul9CgflRWucARbgodUMl4rZCi9D
 WOwxPxv8Oab8IT7Qc/ZHdY3ULJsULRgbtmc/9OqPL5Y/Ww9/9E63Is8O4q/QFc7T
 xJ1p5yZKw3G+G7oG0YBYE0U+x3RUzi4b/Ob+ECeLcAAAcp+XFg6epK6Aj8HDWd8K
 kGYlEBKEq1hILM44K59YTwAT/Cp+fkwe+x7pNQ3JjqtPpVpqGT7RoMUuCduofT1J
 ROtB+S8/AwhdABL6KKUYSVF8zlfoXbQpQs3SUKjaBtPVjwXLZwXERy7ttD/4STtT
 QjC+5/qAWnMR8CYADE0E3rlicUkHJm1+AHukYLz0REphDcNO8GuB9PCDzX4SX/ol
 SGJ6hoprYQ==
 =5U4u
 -----END PGP SIGNATURE-----

Merge tag 'for-5.16/io_uring-2021-10-29' of git://git.kernel.dk/linux-block

Pull io_uring updates from Jens Axboe:
 "Light on new features - basically just the hybrid mode support.

  Outside of that it's just fixes, cleanups, and performance
  improvements.

  In detail:

   - Add ring related information to the fdinfo output (Hao)

   - Hybrid async mode (Hao)

   - Support for batched issue on block (me)

   - sqe error trace improvement (me)

   - IOPOLL efficiency improvements (Pavel)

   - submit state cleanups and improvements (Pavel)

   - Completion side improvements (Pavel)

   - Drain improvements (Pavel)

   - Buffer selection cleanups (Pavel)

   - Fixed file node improvements (Pavel)

   - io-wq setup cancelation fix (Pavel)

   - Various other performance improvements and cleanups (Pavel)

   - Misc fixes (Arnd, Bixuan, Changcheng, Hao, me, Noah)"

* tag 'for-5.16/io_uring-2021-10-29' of git://git.kernel.dk/linux-block: (97 commits)
  io-wq: remove worker to owner tw dependency
  io_uring: harder fdinfo sq/cq ring iterating
  io_uring: don't assign write hint in the read path
  io_uring: clusterise ki_flags access in rw_prep
  io_uring: kill unused param from io_file_supports_nowait
  io_uring: clean up timeout async_data allocation
  io_uring: don't try io-wq polling if not supported
  io_uring: check if opcode needs poll first on arming
  io_uring: clean iowq submit work cancellation
  io_uring: clean io_wq_submit_work()'s main loop
  io-wq: use helper for worker refcounting
  io_uring: implement async hybrid mode for pollable requests
  io_uring: Use ERR_CAST() instead of ERR_PTR(PTR_ERR())
  io_uring: split logic of force_nonblock
  io_uring: warning about unused-but-set parameter
  io_uring: inform block layer of how many requests we are submitting
  io_uring: simplify io_file_supports_nowait()
  io_uring: combine REQ_F_NOWAIT_{READ,WRITE} flags
  io_uring: arm poll for non-nowait files
  fs/io_uring: Prioritise checking faster conditions first in io_write
  ...
2021-11-01 09:41:33 -07:00
Linus Torvalds 643a7234e0 for-5.16/drivers-2021-10-29
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmF8KFsQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgph1ZEACwNuHkAZcIgNzKhzuLP9OjMhv9vV+q254G
 /EcM31e+qgRioMd0ihbVsgW76jOwLEmb3ldKGcN+0Wo5+Sv9Im8+wAWYY1REOZO5
 ZTUBfAzhEh63/EtqTFiU8U+7dmXqy4z7NaICnhlynjwkd3IT+I561os6kcqwJMMr
 G+Q1Cnk9rgCMIoLOCoVThIpjmjyZzF33qJb2VEIkHfkot62iNdpABWaSASF+CCba
 z8LfbvLAYz3YLl4thXlLJFU282T5y7gzgSomGvX4F0rMJSbqFbgoNEPxaYw9CvzC
 uC6MnYCYdCdvVkWVm1b8I8LYzPd5GrpVOSh3JQGvuA4Ppv2IyJCDSruYGgVUlhao
 cVPzuHCqNCfKk0ykYVRZy9oKiBk5wmFeKM/lSHu408y8VNraPNIAEpB6sA9qGr22
 AYr8lNh3JDr0g8dtFsDOq+7u3MANW0KQozfzwTPZo6NjzEE1D2jIg39Ljiijo9+Y
 3pU8pitIAhsKd2KhW1H6LmtJbF4dX756VKYDXOhzgORU0NZYgvGhBIj9tAdpQR0S
 xeae5Kj0/wBGcqR/owf/n1EY/q7rWgNDETnsBhbmzMZyhwH3L6zhT+bfD8YoQCHY
 ueyqhyIUe4YBxTrIpICqwDlqaMYAmQ0jRaci+bK9ovVlQ89FQ9o/BE2COPlI/DGX
 w+rUmmoX4g==
 =HiWU
 -----END PGP SIGNATURE-----

Merge tag 'for-5.16/drivers-2021-10-29' of git://git.kernel.dk/linux-block

Pull block driver updates from Jens Axboe:

 - paride driver cleanups (Christoph)

 - Remove cryptoloop support (Christoph)

 - null_blk poll support (me)

 - Now that add_disk() supports proper error handling, add it to various
   drivers (Luis)

 - Make ataflop actually work again (Michael)

 - s390 dasd fixes (Stefan, Heiko)

 - nbd fixes (Yu, Ye)

 - Remove redundant wq flush in mtip32xx (Christophe)

 - NVMe updates
      - fix a multipath partition scanning deadlock (Hannes Reinecke)
      - generate uevent once a multipath namespace is operational again
        (Hannes Reinecke)
      - support unique discovery controller NQNs (Hannes Reinecke)
      - fix use-after-free when a port is removed (Israel Rukshin)
      - clear shadow doorbell memory on resets (Keith Busch)
      - use struct_size (Len Baker)
      - add error handling support for add_disk (Luis Chamberlain)
      - limit the maximal queue size for RDMA controllers (Max Gurtovoy)
      - use a few more symbolic names (Max Gurtovoy)
      - fix error code in nvme_rdma_setup_ctrl (Max Gurtovoy)
      - add support for ->map_queues on FC (Saurav Kashyap)
      - support the current discovery subsystem entry (Hannes Reinecke)
      - use flex_array_size and struct_size (Len Baker)

 - bcache fixes (Christoph, Coly, Chao, Lin, Qing)

 - MD updates (Christoph, Guoqing, Xiao)

 - Misc fixes (Dan, Ding, Jiapeng, Shin'ichiro, Ye)

* tag 'for-5.16/drivers-2021-10-29' of git://git.kernel.dk/linux-block: (117 commits)
  null_blk: Fix handling of submit_queues and poll_queues attributes
  block: ataflop: Fix warning comparing pointer to 0
  bcache: replace snprintf in show functions with sysfs_emit
  bcache: move uapi header bcache.h to bcache code directory
  nvmet: use flex_array_size and struct_size
  nvmet: register discovery subsystem as 'current'
  nvmet: switch check for subsystem type
  nvme: add new discovery log page entry definitions
  block: ataflop: more blk-mq refactoring fixes
  block: remove support for cryptoloop and the xor transfer
  mtd: add add_disk() error handling
  rnbd: add error handling support for add_disk()
  um/drivers/ubd_kern: add error handling support for add_disk()
  m68k/emu/nfblock: add error handling support for add_disk()
  xen-blkfront: add error handling support for add_disk()
  bcache: add error handling support for add_disk()
  dm: add add_disk() error handling
  block: aoe: fixup coccinelle warnings
  nvmet: use struct_size over open coded arithmetic
  nvme: drop scan_lock and always kick requeue list when removing namespaces
  ...
2021-11-01 09:27:38 -07:00
Linus Torvalds 33c8846c81 for-5.16/block-2021-10-29
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmF8KDgQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpmQ2D/wO0nH3U+3+OZChi3XUwYck9Dev3o6BANCF
 ClATiK/kivZY0xY1r8J4ixirZo2gcjIMpWSC3JGYZ5LdspfmYGLUbMjfZsaeU23i
 lAKaX1IqfArmHN76k3IU1bKCg7B0/LFwC0q9QTFWTSwNSs8RK/EZLJ61U1hEXUb3
 OfIpaMmvPiMaU7yuPqhcZK14m1cg1srrLM4rFB/PqsWWStF07pHq32WeArGDAU0e
 Fe0YSnYD7qqA5Qc37KwqjCTmmxKX5YZf7etIcA6p3DNmwcuQrVNzKoCH/ZEDijaD
 E2bS/BWbN1x96+rtoEZfBYEaNIrkmJzmW6+fJ53OITbJF3KqP6V66erhqNcFYCzC
 mhFlRe7voXb/8AP7zQqSIhK529BUBM36sQ6nF7EiQcDrfLc1z39mq6eblUxbknIA
 DDPISD5Tseik9N9x0bc7vINseKyHI1E90VAU/XKADcuGbzLvehPx+2p+Iq5ch5Ah
 oa1G3RdlWWQOZxphJHWJhu1qMfo5+FP9dFZj1aoo7b8Kbc/CedyoQe71cpIE5wNh
 Jj/EpWJnuyKXwuTic2VYGC+6ezM9O5DSdqCfP3YuZky95VESyvRCKJYMMgBYRVdC
 /LuxhnBXIY2G8An7ZTnX0kLCCvLbapIwa0NyA98/xeOngO843coJ6wn8ZmE9LJNH
 kMmpCygUrA==
 =QWC+
 -----END PGP SIGNATURE-----

Merge tag 'for-5.16/block-2021-10-29' of git://git.kernel.dk/linux-block

Pull block updates from Jens Axboe:

 - mq-deadline accounting improvements (Bart)

 - blk-wbt timer fix (Andrea)

 - Untangle the block layer includes (Christoph)

 - Rework the poll support to be bio based, which will enable adding
   support for polling for bio based drivers (Christoph)

 - Block layer core support for multi-actuator drives (Damien)

 - blk-crypto improvements (Eric)

 - Batched tag allocation support (me)

 - Request completion batching support (me)

 - Plugging improvements (me)

 - Shared tag set improvements (John)

 - Concurrent queue quiesce support (Ming)

 - Cache bdev in ->private_data for block devices (Pavel)

 - bdev dio improvements (Pavel)

 - Block device invalidation and block size improvements (Xie)

 - Various cleanups, fixes, and improvements (Christoph, Jackie,
   Masahira, Tejun, Yu, Pavel, Zheng, me)

* tag 'for-5.16/block-2021-10-29' of git://git.kernel.dk/linux-block: (174 commits)
  blk-mq-debugfs: Show active requests per queue for shared tags
  block: improve readability of blk_mq_end_request_batch()
  virtio-blk: Use blk_validate_block_size() to validate block size
  loop: Use blk_validate_block_size() to validate block size
  nbd: Use blk_validate_block_size() to validate block size
  block: Add a helper to validate the block size
  block: re-flow blk_mq_rq_ctx_init()
  block: prefetch request to be initialized
  block: pass in blk_mq_tags to blk_mq_rq_ctx_init()
  block: add rq_flags to struct blk_mq_alloc_data
  block: add async version of bio_set_polled
  block: kill DIO_MULTI_BIO
  block: kill unused polling bits in __blkdev_direct_IO()
  block: avoid extra iter advance with async iocb
  block: Add independent access ranges support
  blk-mq: don't issue request directly in case that current is to be blocked
  sbitmap: silence data race warning
  blk-cgroup: synchronize blkg creation against policy deactivation
  block: refactor bio_iov_bvec_set()
  block: add single bio async direct IO helper
  ...
2021-11-01 09:19:50 -07:00
Linus Torvalds 9ac211426f File locking changes for v5.16
-----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEES8DXskRxsqGE6vXTAA5oQRlWghUFAmF77NQTHGpsYXl0b25A
 a2VybmVsLm9yZwAKCRAADmhBGVaCFfP9D/wN8rCAPA2J6SpBjdJXSpSQS4PoAOqC
 Z002bOc7sq/zg2cWk+pX1aOR/+wUpk+PvaQdyvfO+o4TVCpsTOklRh/yfYuOkJdP
 PoINUR7vb43/CGqd04YI45+pxOFMJk9JoLkNha0uIY4ukXdt9mA6u/+qBEDboyDQ
 Jbn1JGitRc9WYaE7BV26ba0l+Deb5h2/4c1DiDlsgmLkDPhpowkOznovUCkBnH7H
 bfwlssjZ5P0K5ttZDw6VlkC2xE+Yr56BsEco2bXO42LwUHOx6r6ZNp04rh9zh1Zp
 hFPybgU+ur17EOOmBbCq8aHZqRRcjQQDH/rZ1heHSOfTrEWWth1xNjmeewSgRZHL
 0oMi3yIJPwvuDBQPEQg0VD87k5Z8xbRPql6eM6GeGxDZvzXWqqYKW2OYXtNxG91m
 bGvu2OOGkdF/4WGYBixdjUQb5KjcOqdIFkq3/oHfLQ+cS2uc6eOfnCdxa7cTnTdd
 BcFDgZmWQDLFs6/DIbwUI0KWMAiLFMZnZACE937JvlE74EGiHu47JMzwcU15J6zO
 VD0Oq0XsWQN+TgY2RnjuTFqm6DTpbrkgw88sNDr5g3jZbgJZiZ/r/3M55lcBVWvk
 PFT4fjKhD1mT6/SscAAmOxUKUeDbN7EktiRsZmH9C2sUCERufDb/cmY/RYZ00C4t
 01ovPjs7VukS/w==
 =bcaq
 -----END PGP SIGNATURE-----

Merge tag 'locks-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux

Pull file locking updates from Jeff Layton:
 "Most of this is just follow-on cleanup work of documentation and
  comments from the mandatory locking removal in v5.15.

  The only real functional change is that LOCK_MAND flock() support is
  also being removed, as it has basically been non-functional since the
  v2.5 days"

* tag 'locks-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
  fs: remove leftover comments from mandatory locking removal
  locks: remove changelog comments
  docs: fs: locks.rst: update comment about mandatory file locking
  Documentation: remove reference to now removed mandatory-locking doc
  locks: remove LOCK_MAND flock lock support
2021-11-01 09:06:53 -07:00
Linus Torvalds ad98a92466 tpmdd updates for Linux v5.16
-----BEGIN PGP SIGNATURE-----
 
 iIgEABYIADAWIQRE6pSOnaBC00OEHEIaerohdGur0gUCYXdi4xIcamFya2tvQGtl
 cm5lbC5vcmcACgkQGnq6IXRrq9LSLgD/cnxDX1yhRPve9udhWlvP4n1e/GRSM5Jy
 69mOWRP2MqUBANxayKOcFCWKOFWT4xgepA9/rrf9CjK/Du/Hk5OnpuYJ
 =NzZ4
 -----END PGP SIGNATURE-----

Merge tag 'tpmdd-next-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd

Pull tpm updates from Jarkko Sakkinen:
 "Only bug fixes"

* tag 'tpmdd-next-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  tpm_tis_spi: Add missing SPI ID
  tpm: fix Atmel TPM crash caused by too frequent queries
  tpm: Check for integer overflow in tpm2_map_response_body()
  tpm: tis: Kconfig: Add helper dependency on COMPILE_TEST
2021-11-01 09:02:15 -07:00
Linus Torvalds 49f8275c7d Memory folios
Add memory folios, a new type to represent either order-0 pages or
 the head page of a compound page.  This should be enough infrastructure
 to support filesystems converting from pages to folios.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEejHryeLBw/spnjHrDpNsjXcpgj4FAmF9uI0ACgkQDpNsjXcp
 gj7MUAf/R7LCZ+xFiIedw7SAgb/DGK0C9uVjuBEIZgAw21ZUw/GuPI6cuKBMFGGf
 rRcdtlvMpwi7yZJcoNXxaqU/xPaaJMjf2XxscIvYJP1mjlZVuwmP9dOx0neNvWOc
 T+8lqR6c1TLl82lpqIjGFLwvj2eVowq2d3J5jsaIJFd4odmmYVInrhJXOzC/LQ54
 Niloj5ksehf+KUIRLDz7ycppvIHhlVsoAl0eM2dWBAtL0mvT7Nyn/3y+vnMfV2v3
 Flb4opwJUgTJleYc16oxTn9svT2yS8q2uuUemRDLW8ABghoAtH3fUUk43RN+5Krd
 LYCtbeawtkikPVXZMfWybsx5vn0c3Q==
 =7SBe
 -----END PGP SIGNATURE-----

Merge tag 'folio-5.16' of git://git.infradead.org/users/willy/pagecache

Pull memory folios from Matthew Wilcox:
 "Add memory folios, a new type to represent either order-0 pages or the
  head page of a compound page. This should be enough infrastructure to
  support filesystems converting from pages to folios.

  The point of all this churn is to allow filesystems and the page cache
  to manage memory in larger chunks than PAGE_SIZE. The original plan
  was to use compound pages like THP does, but I ran into problems with
  some functions expecting only a head page while others expect the
  precise page containing a particular byte.

  The folio type allows a function to declare that it's expecting only a
  head page. Almost incidentally, this allows us to remove various calls
  to VM_BUG_ON(PageTail(page)) and compound_head().

  This converts just parts of the core MM and the page cache. For 5.17,
  we intend to convert various filesystems (XFS and AFS are ready; other
  filesystems may make it) and also convert more of the MM and page
  cache to folios. For 5.18, multi-page folios should be ready.

  The multi-page folios offer some improvement to some workloads. The
  80% win is real, but appears to be an artificial benchmark (postgres
  startup, which isn't a serious workload). Real workloads (eg building
  the kernel, running postgres in a steady state, etc) seem to benefit
  between 0-10%. I haven't heard of any performance losses as a result
  of this series. Nobody has done any serious performance tuning; I
  imagine that tweaking the readahead algorithm could provide some more
  interesting wins. There are also other places where we could choose to
  create large folios and currently do not, such as writes that are
  larger than PAGE_SIZE.

  I'd like to thank all my reviewers who've offered review/ack tags:
  Christoph Hellwig, David Howells, Jan Kara, Jeff Layton, Johannes
  Weiner, Kirill A. Shutemov, Michal Hocko, Mike Rapoport, Vlastimil
  Babka, William Kucharski, Yu Zhao and Zi Yan.

  I'd also like to thank those who gave feedback I incorporated but
  haven't offered up review tags for this part of the series: Nick
  Piggin, Mel Gorman, Ming Lei, Darrick Wong, Ted Ts'o, John Hubbard,
  Hugh Dickins, and probably a few others who I forget"

* tag 'folio-5.16' of git://git.infradead.org/users/willy/pagecache: (90 commits)
  mm/writeback: Add folio_write_one
  mm/filemap: Add FGP_STABLE
  mm/filemap: Add filemap_get_folio
  mm/filemap: Convert mapping_get_entry to return a folio
  mm/filemap: Add filemap_add_folio()
  mm/filemap: Add filemap_alloc_folio
  mm/page_alloc: Add folio allocation functions
  mm/lru: Add folio_add_lru()
  mm/lru: Convert __pagevec_lru_add_fn to take a folio
  mm: Add folio_evictable()
  mm/workingset: Convert workingset_refault() to take a folio
  mm/filemap: Add readahead_folio()
  mm/filemap: Add folio_mkwrite_check_truncate()
  mm/filemap: Add i_blocks_per_folio()
  mm/writeback: Add folio_redirty_for_writepage()
  mm/writeback: Add folio_account_redirty()
  mm/writeback: Add folio_clear_dirty_for_io()
  mm/writeback: Add folio_cancel_dirty()
  mm/writeback: Add folio_account_cleaned()
  mm/writeback: Add filemap_dirty_folio()
  ...
2021-11-01 08:47:59 -07:00
Linus Torvalds 8bb7eca972 Linux 5.15 2021-10-31 13:53:10 -07:00
Linus Torvalds 75fcbd3860 perf tools fixes for v5.15: 5th batch
- Fix compilation of callchain related code on powerpc with gcc11+.
 
 - Fix PERF_SAMPLE_WEIGHT_STRUCT support in 'perf script'
 
 - Check session->header.env.arch before using it, fixing a segmentation fault.
 
 - Suppress 'rm dlfilter' build messages.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYX7NvgAKCRCyPKLppCJ+
 J0kmAQCAuh2Pt2eL+KUat/RgHXG+on5EkFfqlHVzebjmTv3/nwD/eyKF5AcgvTfb
 DGIntW5QskCeAal05g9Po1+xPdU2aQU=
 =3oio
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-fixes-for-v5.15-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix compilation of callchain related code on powerpc with gcc11+

 - Fix PERF_SAMPLE_WEIGHT_STRUCT support in 'perf script'

 - Check session->header.env.arch before using it, fixing a segmentation
   fault

 - Suppress 'rm dlfilter' build messages

* tag 'perf-tools-fixes-for-v5.15-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf script: Fix PERF_SAMPLE_WEIGHT_STRUCT support
  perf callchain: Fix compilation on powerpc with gcc11+
  perf script: Check session->header.env.arch before using it
  perf build: Suppress 'rm dlfilter' build message
2021-10-31 11:24:06 -07:00
Linus Torvalds ca5e83eddc * Fixes for s390 interrupt delivery
* Fixes for Xen emulator bugs showing up as debug kernel WARNs
 * Fix another issue with SEV/ES string I/O VMGEXITs
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmF6uGIUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNRagf/Srvk9lNcRh4cEzsczErKMyr3xOqA
 jgsTSqgl1ExJI9sBLMpVYBOFGILMaMSrhLPIltKPy0Bj/E+hw8WOQwPa44QjWlSD
 MAUxO1Nryt9Luc2L8uSd1c//g4fr4V1BhOaumk1lM14Q8EDfQBcDIMI2ZKueMU1+
 2Q+n8/AsG63jQIINwKNidof0dzRtbfcE30Wq/8QHttIPo5wt6l0YClOlOikqNY8N
 5+WSQFmuutHIXftq5Jb/Ldn/+HVukWZyZOEVwLnBpM9uBvIubNgcEakqvxsaVtAn
 FHdvnA+Bk99/Xuhl+wRLQo8ofzQIQ13RQv3HPArJAJv34oAJZx2rNObVlA==
 =6ofB
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:

 - Fixes for s390 interrupt delivery

 - Fixes for Xen emulator bugs showing up as debug kernel WARNs

 - Fix another issue with SEV/ES string I/O VMGEXITs

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Take srcu lock in post_kvm_run_save()
  KVM: SEV-ES: fix another issue with string I/O VMGEXITs
  KVM: x86/xen: Fix kvm_xen_has_interrupt() sleeping in kvm_vcpu_block()
  KVM: x86: switch pvclock_gtod_sync_lock to a raw spinlock
  KVM: s390: preserve deliverable_mask in __airqs_kick_single_vcpu
  KVM: s390: clear kicked_mask before sleeping again
2021-10-31 11:19:02 -07:00
Kan Liang 27730c8cd6 perf script: Fix PERF_SAMPLE_WEIGHT_STRUCT support
-F weight in perf script is broken.

  # ./perf mem record
  # ./perf script -F weight
  Samples for 'dummy:HG' event do not have WEIGHT attribute set. Cannot
print 'weight' field.

The sample type, PERF_SAMPLE_WEIGHT_STRUCT, is an alternative of the
PERF_SAMPLE_WEIGHT sample type. They share the same space, weight. The
lower 32 bits are exactly the same for both sample type. The higher 32
bits may be different for different architecture. For a new kernel on
x86, the PERF_SAMPLE_WEIGHT_STRUCT is used. For an old kernel or other
ARCHs, the PERF_SAMPLE_WEIGHT is used.

With -F weight, current perf script will only check the input string
"weight" with the PERF_SAMPLE_WEIGHT sample type. Because the commit
ea8d0ed6ea ("perf tools: Support PERF_SAMPLE_WEIGHT_STRUCT") didn't
update the PERF_SAMPLE_WEIGHT_STRUCT sample type for perf script. For a
new kernel on x86, the check fails.

Use PERF_SAMPLE_WEIGHT_TYPE, which supports both sample types, to
replace PERF_SAMPLE_WEIGHT

Fixes: ea8d0ed6ea ("perf tools: Support PERF_SAMPLE_WEIGHT_STRUCT")
Reported-by: Joe Mario <jmario@redhat.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Joe Mario <jmario@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Joe Mario <jmario@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/1632929894-102778-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-31 12:51:41 -03:00
Jiri Olsa 89ac61ff05 perf callchain: Fix compilation on powerpc with gcc11+
Got following build fail on powerpc:

    CC      arch/powerpc/util/skip-callchain-idx.o
  In function ‘check_return_reg’,
      inlined from ‘check_return_addr’ at arch/powerpc/util/skip-callchain-idx.c:213:7,
      inlined from ‘arch_skip_callchain_idx’ at arch/powerpc/util/skip-callchain-idx.c:265:7:
  arch/powerpc/util/skip-callchain-idx.c:54:18: error: ‘dwarf_frame_register’ accessing 96 bytes \
  in a region of size 64 [-Werror=stringop-overflow=]
     54 |         result = dwarf_frame_register(frame, ra_regno, ops_mem, &ops, &nops);
        |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  arch/powerpc/util/skip-callchain-idx.c: In function ‘arch_skip_callchain_idx’:
  arch/powerpc/util/skip-callchain-idx.c:54:18: note: referencing argument 3 of type ‘Dwarf_Op *’
  In file included from /usr/include/elfutils/libdwfl.h:32,
                   from arch/powerpc/util/skip-callchain-idx.c:10:
  /usr/include/elfutils/libdw.h:1069:12: note: in a call to function ‘dwarf_frame_register’
   1069 | extern int dwarf_frame_register (Dwarf_Frame *frame, int regno,
        |            ^~~~~~~~~~~~~~~~~~~~
  cc1: all warnings being treated as errors

The dwarf_frame_register args changed with [1],
Updating ops_mem accordingly.

[1] https://sourceware.org/git/?p=elfutils.git;a=commit;h=5621fe5443da23112170235dd5cac161e5c75e65

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Mark Wieelard <mjw@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20210928195253.1267023-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-31 12:51:41 -03:00
Song Liu 29c77550ee perf script: Check session->header.env.arch before using it
When perf.data is not written cleanly, we would like to process existing
data as much as possible (please see f_header.data.size == 0 condition
in perf_session__read_header). However, perf.data with partial data may
crash perf. Specifically, we see crash in 'perf script' for NULL
session->header.env.arch.

Fix this by checking session->header.env.arch before using it to determine
native_arch. Also split the if condition so it is easier to read.

Committer notes:

If it is a pipe, we already assume is a native arch, so no need to check
session->header.env.arch.

Signed-off-by: Song Liu <songliubraving@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team@fb.com
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20211004053238.514936-1-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-31 12:51:41 -03:00
Adrian Hunter 095729484e perf build: Suppress 'rm dlfilter' build message
The following build message:

	rm dlfilters/dlfilter-test-api-v0.o

is unwanted.

The object file is being treated as an intermediate file and being
automatically removed. Mark the object file as .SECONDARY to prevent
removal and hence the message.

Requested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210930062849.110416-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-31 12:51:41 -03:00
Linus Torvalds 180eca540a SCSI fixes on 20211030
Three small fixes, all in drivers, and one sizeable update to the UFS
 driver to remove the HPB 2.0 feature that has been objected to by Jens
 and Christoph.  Although the UFS patch is large and last minute, it's
 essentially the least intrusive way of resolving the objections in
 time for the 5.15 release.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYX25RSYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishQSWAP48whPB
 LUQx2tffOVmf18q5HA22mFXz/KTjgmvTVvNJtAEAn8OPN3OAnndPgBYDXAKhChI9
 iMvhn3UwYGM+6DQ9bHo=
 =qS2s
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Three small fixes, all in drivers, and one sizeable update to the UFS
  driver to remove the HPB 2.0 feature that has been objected to by Jens
  and Christoph.

  Although the UFS patch is large and last minute, it's essentially the
  least intrusive way of resolving the objections in time for the 5.15
  release"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: ufshpb: Remove HPB2.0 flows
  scsi: mpt3sas: Fix reference tag handling for WRITE_INSERT
  scsi: ufs: ufs-exynos: Correct timeout value setting registers
  scsi: ibmvfc: Fix up duplicate response detection
2021-10-30 15:56:38 -07:00
Linus Torvalds 3a4347d82e One fix for the composite clk that broke when we changed this clk type
to use the determine_rate instead of round_rate clk op by default. This
 caused lots of problems on Rockchip SoCs because they heavily use
 the composite clk code to model the clk tree.
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAmF8sUARHHNib3lkQGtl
 cm5lbC5vcmcACgkQrQKIl8bklSVDDhAAn4VSkWRS2mlwSWHCjrNc5XBVqTuzQX4k
 pH8k8rwLYJgEMnzN0a62I5kPUYcx47ILX/5+64vHfUyxYQsuDWu3O+Uvn8N5q8iD
 pXwH+Thac30DdCYoi7NRHmfKg0wFuMf4ExlLzfSP64OvPR4OeIz45BLo2KcnASy5
 9KW/epH89fwUDpE01p9wI+nS6ItJ3qTvm9PUIOWqUOJz0ZkBjsIid2I6y7k/HgEk
 sfjBq2swvxaF7cIuCA11vesJ8pFVSSRowHgHdGsHa3okrB/x0nsi2JW2ie2zt/VQ
 EFkuhuiWfs2O5brjqZiQLgkeggEKkO+9V4o30s3mVFLx9hcEsKR7a3XaPOtXdlHh
 XrnDKnspWV1HKRRceRMbNw2oxeaDcJ2RC5kXpAVJraGeMDjn1NyYMxCFOyuwY0q6
 8Mb7M1dRuMCdjIrhBGss+u5wg/wEG/vUZF6M4LBZAPzgRmX6lKVqzZrLH+9M0xSJ
 ey77No79nhntzPg57FZHED/ghL+5TNVyp9pO2rDcVcn1QjPGcNw3M9Fm/yfRPAJF
 C+5mvW5aoc+B6DFuQrpKSfUaa2KgqFKBj2CcEST/fcAH+FNKe9EZ8fkkM0jCdqZo
 nNK5CbsuW5HSCF+L5EALosru4krAFK/VNzXI7hzfraKOLymkw6h9j78F+Gi0qRzS
 0IHipNdFIhA=
 =5b/O
 -----END PGP SIGNATURE-----

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fix from Stephen Boyd:
 "One fix for the composite clk that broke when we changed this clk type
  to use the determine_rate instead of round_rate clk op by default.
  This caused lots of problems on Rockchip SoCs because they heavily use
  the composite clk code to model the clk tree"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: composite: Also consider .determine_rate for rate + mux composites
2021-10-30 09:55:46 -07:00
Linus Torvalds bf85ba018f RISC-V Fixes for 5.15 (or -rc8)
* A fix to ensure the trap vector's address is aligned.
 * A fix to avoid re-populating the KASAN shadow memory.
 * A fix to allow kasan to build without warnings, which have recently
   become errors.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmF8OSoTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiVQvD/9u7Dgx473QjQWhP4mgHe6QWhBHVOB3
 3b4eJ39aSvpo7ODbw8ZpJ/SfHLM4UzB3/jjuIZW1tsM0TfwryOVPho5rqRugP5ho
 8kHX4TS6eRz3wVxpBWaiJP8sSoeDkq3+SyVVA+vK4siFZ67RdzRbn+vt3JsF0B9M
 jvyJJQrNxRvnY/yR6CWXSvyrzHPxfd0OPOHF5cYBFRnPK/bp3hSVhyTFUt17MCWE
 SBOKkUU5zuWFqZxYxq8i+g+wd5HvMGOkm4NhpVtU88sgq3EVh6yCMDeFSTn54n2r
 NIw356/nNLOeX+nRJP7vQZrKeU4lHrSSc8Kma1tuekgdV6mxbZbueJWrOzDAE7XM
 0CAMwwF2tn+jjR0Q9LmtonjYggGzQuOF0djLyL6HFTbp3VD7I6J8HXtmD5K9vA+I
 zspx7kHT6zkF/6XzTkDImkXSAJTlzxgeFKVYhHfSzd4oKviYrCTf92t3XZnO55Ug
 settBaGeMjzi9uYaHjdeiOBJnZXbL1Yxvy4zDcK91Gwhsp4f3Q45Qc/90WtUOxIu
 R91xrrEH4cmQIbPMH1UvuQ710wRfypv+AuzeYgufj1HAihNV2cs90m34y1uFzD6y
 Ozgdu+x46ac1oC+iOrMygzli+OCqLKHrqBkDLI2Zd1pzBgJWXAuT/oxarOkv5p6P
 Ot8yxTZ72qJxWA==
 =BArU
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:
 "These are pretty late, but they do fix concrete issues.

   - ensure the trap vector's address is aligned.

   - avoid re-populating the KASAN shadow memory.

   - allow kasan to build without warnings, which have recently become
     errors"

* tag 'riscv-for-linus-5.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Fix asan-stack clang build
  riscv: Do not re-populate shadow memory with kasan_populate_early_shadow
  riscv: fix misalgned trap vector base address
2021-10-30 09:28:24 -07:00
Avri Altman 09d9e4d041 scsi: ufs: ufshpb: Remove HPB2.0 flows
The Host Performance Buffer feature allows UFS read commands to carry the
physical media addresses along with the LBAs, thus allowing less internal
L2P-table switches in the device.  HPB1.0 allowed a single LBA, while
HPB2.0 increases this capacity up to 255 blocks.

Carrying more than a single record, the read operation is no longer purely
of type "read" but a "hybrid" command: Writing the physical address to the
device in one operation and reading back the required payload in another.

The JEDEC HPB spec defines two commands for this operation:
HPB-WRITE-BUFFER (0x2) to write the physical addresses to device, and
HPB-READ to read the payload.

With the current HPB design the UFS driver has no alternative but to divide
the READ request into 2 separate commands: HPB-WRITE-BUFFER and HPB-READ.
This causes a great deal of aggravation to the block layer guys who
demanded that we completely revert the entire HPB driver regardless of the
huge amount of corporate effort already invested in it.

As a compromise, remove only the pieces that implement the 2.0
specification. This is done as a matter of urgency for the final 5.15
release.

Link: https://lore.kernel.org/r/20211030062301.248-1-avri.altman@wdc.com
Tested-by: Avri Altman <avri.altman@wdc.com>
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Co-developed-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-30 10:01:01 -04:00
Linus Torvalds 119c85055d powerpc fixes for 5.15 #6
Three commits fixing some issues introduced with the recent IOMMU changes we merged.
 
 Thanks to: Alexey Kardashevskiy
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmF8e20THG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgG4GEAC6SAo82kdWOR5MEwZ+U9ht91KqzK/O
 UYm5WLapGaWzKtS5zMZUzc3dTXFsbeSdPghfGp2eLpnkQkfDnltyGA4ERVunEQUg
 soGXt0OXBfmt++D+yogMkURr9tYzZ+ssrdCEC6Vmuv35Tf/dsHA9upWCVVw4UOOv
 w/RRR1uku6kup7NCX6TJZoUQSeAXISDhwk3LXF5jh/hqSyV3BV9yulHYs3J4WtFQ
 zr2dsxXL5DvgM3cOmLBZbnmTYSmU1f0jtGhqEf/6Ar3ljLHZgIvLLWmZK1UyyMwv
 4P7yFIBVObW0sBqGz/4K9p73l8MnVHORrBUe73OavZgrxaM2u0k/7JRj72txcbMw
 KXNKvDoA9nambWV98OXt4HE2bInsporn4DSDSAwJZmuTcqltbFOCf47t0kpe66fB
 ZC2IJCOfrol5ELZvmdAsuVWEucHkdPMPqz0ZB46E/givQf45RmfYnjvVKm8HTQls
 aOivKNuP2VLXhHdeocly1adaDMaeMEouYw1p00VKhOW4GbfHCZs4IczDtqacHRMR
 NB+C0awi6DDQ7WXCEYURte9iURn8owjJhMuIsQj2/SUHVfuQEeJL1llSzo9sa/f2
 ONT0esDhKSuq74T1nmimUMOvAyBg/TKuwY87TJ9BKcJzqSj+oPuuw28d8HBi61qk
 S9rG2jHto84HVw==
 =R9Yq
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.15-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "Three commits fixing some issues introduced with the recent IOMMU
  changes we merged.

  Thanks to Alexey Kardashevskiy"

* tag 'powerpc-5.15-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/pseries/iommu: Create huge DMA window if no MMIO32 is present
  powerpc/pseries/iommu: Check if the default window in use before removing it
  powerpc/pseries/iommu: Use correct vfree for it_map
2021-10-29 17:35:56 -07:00
Linus Torvalds db2398a56a gpio fixes for v5.15
- fix the return value check when parsing the ngpios property in gpio-xgs-iproc
 - check the return value of bgpio_init() in gpio-mlxbf2
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAmF8XHAACgkQEacuoBRx
 13ImBg/+NT/XE6gU+vuXNrRhNRqYI1s3HRn/mviCXMnjS2OJGD5Pd6KOt4gWX+Sm
 z8/+/4W9hwLRmd4oFF+4+Q073YHmNNBXYdQ/JZM37KDF88hHDcX675pBX19qJjKT
 xWgAzF3LMDHgTtXUrvsf36e2tezkZcnN6Rda2xzEi+j9Tq+ZrePfD7V44YlY3ka6
 Vu/+NODR2umYc6rcrvIb2hHqUBIQUDCmIkIAGFK4CwVdC+VxtWya3cB4L8JqtF4l
 KMPIC9mO2jyAho0LufF/uBAlgNyeg/rfYpSYJo9pw882aKupxAMjI9OUT2CkwuKU
 tCi4wsYZUas5CqULe7bNNoLT4kZNVbiauc/3pqew0P84T9Uus2I4CyxHSgU9sfLx
 3767JdXwJPZPyJFVSkdvQTidVRVZEgRKTkV/mc8f3FBw6sAa3OGOQKoSeuhS8lii
 BYdZ+9dulLHI3wlhfNSaY/mBdO0flLwsUZppOo5nDXm36Wz1jbbk1sApDFP6Ywsd
 YHY5W5gyb84Qt7alQ2MbzhkFrr9dFKdfIGmLDK0dtPna8DM7Wy0GyLGiJKkUf/Kw
 sP2SYGrNMP0s/pH1j4O3ULiVX1Fo0L7FgyycnWfUmxinE8Z/yzjbdAcxdO6SwYMy
 iSxvuuT08wvnJ8tZih3+y3gnL+gAOhlu0xndfcXdSYDs2f31ZeQ=
 =Sxrc
 -----END PGP SIGNATURE-----

Merge tag 'gpio-fixes-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

 - fix the return value check when parsing the ngpios property in
   gpio-xgs-iproc

 - check the return value of bgpio_init() in gpio-mlxbf2

* tag 'gpio-fixes-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: mlxbf2.c: Add check for bgpio_init failure
  gpio: xgs-iproc: fix parsing of ngpios property
2021-10-29 17:04:38 -07:00
Linus Torvalds a379fbbcb8 block-5.15-2021-10-29
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmF8DN8QHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgptM8D/44F9YcY8qRhrZmsUFr0QFlvFHHUCVCWtDR
 JW3JQN3hV0zBEVIvc0P3NSKAih/1+rJ3WZmVZA0lczm5OHv4C+ESZSmcl3Muv4Tk
 skOWxwDTIoSCvC+DzDw8k5UluOucLU9V7uLHQYqDOsqHngLwUerGDMwGfkMXKkNb
 zRvVaqQMUJufY0tN5QaEjl+GsaXiZJ0pid0MOtXo8NeU+K0BDyoBUF5Gco3/8ZYa
 NtD4hwM48kYoCNJDAeJmRNo3vArPpZdiJ77jeVXHrj42Mp20LK/jD7PdguEbUzq5
 3uXhn0boZCKFGhWQntkL18WwaZbFRZzTBpBqpFIjQKIvicNRoGArIguwwmwdt53P
 lbsgGgyMqQ3KvuOIEgrAFieA/mQ8iw8Pf/QWiQRk2aYA5n+miex1XfmVX7dVipdm
 OcV5HtLrKPR1newr0/eZIvN31C3tgaViYxxQOunfW57fXPthCazal+ON6K5w9ZZ8
 y79P+K1czCS/edKLTB+idvmWWijoF4GRguUMoCKsD4uXOZO0tk/f/U/ds8/+LBxm
 KQv8T9wBd+r5h225cB+boM+zslkB4vqnCT+MyiIp070ZAbi/ohirYcFC2fM5j57B
 UZ58C+WjGjtC0nhb0xS8EGM43ow6hkS+8LPub3fb01AKKvfaw22grc9vHBaFgXgW
 5rqLDaAIcg==
 =73x5
 -----END PGP SIGNATURE-----

Merge tag 'block-5.15-2021-10-29' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - NVMe pull request:
      - fix nvmet-tcp header digest verification (Amit Engel)
      - fix a memory leak in nvmet-tcp when releasing a queue (Maurizio
        Lombardi)
      - fix nvme-tcp H2CData PDU send accounting again (Sagi Grimberg)
      - fix digest pointer calculation in nvme-tcp and nvmet-tcp (Varun
        Prakash)
      - fix possible nvme-tcp req->offset corruption (Varun Prakash)

 - Queue drain ordering fix (Ming)

 - Partition check regression for zoned devices (Shin'ichiro)

 - Zone queue restart fix (Naohiro)

* tag 'block-5.15-2021-10-29' of git://git.kernel.dk/linux-block:
  block: Fix partition check for host-aware zoned block devices
  nvmet-tcp: fix header digest verification
  nvmet-tcp: fix data digest pointer calculation
  nvme-tcp: fix data digest pointer calculation
  nvme-tcp: fix possible req->offset corruption
  block: schedule queue restart after BLK_STS_ZONE_RESOURCE
  block: drain queue after disk is removed from sysfs
  nvme-tcp: fix H2CData PDU send accounting (again)
  nvmet-tcp: fix a memory leak when releasing a queue
2021-10-29 11:10:29 -07:00
Martin K. Petersen 61a9f252c1 scsi: mpt3sas: Fix reference tag handling for WRITE_INSERT
Testing revealed a problem with how the reference tag was handled for
a WRITE_INSERT operation. The SCSI_PROT_REF_CHECK flag is not set when
the controller is asked to generate the protection information
(i.e. not DIX). And as a result the initial reference tag would not be
set in the WRITE_INSERT case.

Separate handling of the REF_CHECK and REF_INCREMENT flags to align
with both the DIX spec and the MPI implementation.

Link: https://lore.kernel.org/r/20211028034202.24225-1-martin.petersen@oracle.com
Fixes: b3e2c72af1 ("scsi: mpt3sas: Use the proper SCSI midlayer interfaces for PI")
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-29 14:03:58 -04:00
Linus Torvalds 17d50f8941 MMC host:
- tmio: Re-enable card irqs after a reset
  - mtk-sd: Fixup probing of cqhci for crypto
  - cqhci: Fix support for suspend/resume
  - vub300: Fix control-message timeouts
  - dw_mmc-exynos: Fix support for tuning
  - winbond: Silences build errors on M68K
  - sdhci-esdhc-imx: Fix support for tuning
  - sdhci-pci: Read card detect from ACPI for Intel Merrifield
  - sdhci: Fix eMMC support for Thundercomm TurboX CM2290
 -----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCgA1FiEEugLDXPmKSktSkQsV/iaEJXNYjCkFAmF74zoXHHVsZi5oYW5z
 c29uQGxpbmFyby5vcmcACgkQ/iaEJXNYjCmQ2Q//fF/GEc69vN2pP9lps/jN6pdT
 Njd29UHonDf89krYYEoUd0MmwijndNC9HKIICM3XKRwzywweVDcgDGn0yJ/DPtma
 vgZRsZG4KM1h6RPs8TbGS17aMO1ZGNvZZAiP4f6CQzQqJUGS2Iu8kTx+XeZmqkuF
 IShAQB2X4CcThI7nZXPsaQVYFHgXY6L0MMxgZmvO36WDCHRij22xaEG5S4gmv825
 l+P+z3KabAeIrpTU098EshvXp/TfmAxar9OSymxgQqDlD2t3eW1NwcexNOfa/aW0
 yfQrMH/r37T6hDZoa7U27e7cA2Ma/y+A3c6J22siVEuiq/DQ3cUrRrHJvoYtz9uG
 m98z9ZjosxJy3O09bKjUWgzYwVDywwZtWOmXgNNRQpm8Qa4R6+2ZRitst+H9iJ6L
 m81Wx3iZK1I6Iak9a8palxOskHiD76IQ3V2xqbUP08O/DQSc+/460iuFqQ+yIybd
 PvNp0PpeZ7iCGC4gzTjGpL+SXxgrqGIO+MRMPUaSOMAZuaiIoAmBn8qwWdUUxiwD
 csmej1e64bDDctvHCg9PvZU2a8+TmQcdGxUTGB5pvf7mcJ5gSfN9AzKneqhWzw9J
 eBJwRMjTqGrRkSBGzj7PUrEjISlkzb0rt0sa7bYmd/7nIh1XX+MP5v7I35162jC6
 mPLTdFW+mxaNn6Nx2+E=
 =SM2+
 -----END PGP SIGNATURE-----

Merge tag 'mmc-v5.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:

 - tmio: Re-enable card irqs after a reset

 - mtk-sd: Fixup probing of cqhci for crypto

 - cqhci: Fix support for suspend/resume

 - vub300: Fix control-message timeouts

 - dw_mmc-exynos: Fix support for tuning

 - winbond: Silences build errors on M68K

 - sdhci-esdhc-imx: Fix support for tuning

 - sdhci-pci: Read card detect from ACPI for Intel Merrifield

 - sdhci: Fix eMMC support for Thundercomm TurboX CM2290

* tag 'mmc-v5.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: tmio: reenable card irqs after the reset callback
  mmc: mediatek: Move cqhci init behind ungate clock
  mmc: cqhci: clear HALT state after CQE enable
  mmc: vub300: fix control-message timeouts
  mmc: dw_mmc: exynos: fix the finding clock sample value
  mmc: winbond: don't build on M68K
  mmc: sdhci-esdhc-imx: clear the buffer_read_ready to reset standard tuning circuit
  mmc: sdhci-pci: Read card detect from ACPI for Intel Merrifield
  mmc: sdhci: Map more voltage level to SDHCI_POWER_330
2021-10-29 10:54:44 -07:00
Linus Torvalds fd919bbd33 for-5.15-rc7-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmF72q0ACgkQxWXV+ddt
 WDvFOxAAkcryx2FP5aqaoMzBKfoCtMFHO3uAvm+rsMcglWe5kaXhBnHa2HPzoyEh
 YqEx2TeXMTuA2I15bU8KV1RMhQzzRjC4NhdRqY6uaKAcKgON6sJlK5qsq2BnB+V3
 nrue1jppM2Vv8wNzjMNeVETQNC7pmg29yQP/fvWaB36Yar2tyfyWDF11e42HR7cU
 yLQUedg30WEayz3Mp6MTBF36h09WXQrZSs7Iwk1JMQbpxWcpn2CjXrO+vIZOMdvH
 XZZsxBTNB8GJIaJlXssgsq3OP2wspK1lrVHNfi5PYtcZEaFrhkPaVB6enDfd41YV
 zXwj1dnemCni9fh88gZprel9bLyB37dSVfIqq2Ly3hQbSAN4dmHIpxGwPSRIr+Hl
 Bn3UfClHpAftbpd/Y77U7GgcYnkuRo3Bd4mGTF3ZuPDLVrf/QX5BlfGa2dmJYoml
 NfBit7Ha4UrxLW6C8RC6fyEbLQxpNYFY55Ra0Tj0BBO/uhWiqtQGZwC/qbyPKfzN
 YZFcPR6iTILoCHXNan3iZIuLeASMT0djgAtunXXf/BuFnxGfnOuqL3bKt2vojh3+
 rsqpeIxSP/VklKv4JcP3axeLmUK6cA8/9dV2ES0M0Fc0o341jfh+AoVw0GleFeus
 gXlDFPRJeE8yyXmjKyW4shctOczqoeMIq3umebXPP9R4jd/LU/g=
 =YWGa
 -----END PGP SIGNATURE-----

Merge tag 'for-5.15-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
 "Last minute fixes for crash on 32bit architectures when compression is
  in use. It's a regression introduced in 5.15-rc and I'd really like
  not let this into the final release, fixes via stable trees would add
  unnecessary delay.

  The problem is on 32bit architectures with highmem enabled, the pages
  for compression may need to be kmapped, while the patches removed that
  as we don't use GFP_HIGHMEM allocations anymore. The pages that don't
  come from local allocation still may be from highmem. Despite being on
  32bit there's enough such ARM machines in use so it's not a marginal
  issue.

  I did full reverts of the patches one by one instead of a huge one.
  There's one exception for the "lzo" revert as there was an
  intermediate patch touching the same code to make it compatible with
  subpage. I can't revert that one too, so the revert in lzo.c is
  manual. Qu Wenruo has worked on that with me and verified the changes"

* tag 'for-5.15-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  Revert "btrfs: compression: drop kmap/kunmap from lzo"
  Revert "btrfs: compression: drop kmap/kunmap from zlib"
  Revert "btrfs: compression: drop kmap/kunmap from zstd"
  Revert "btrfs: compression: drop kmap/kunmap from generic helpers"
2021-10-29 10:46:59 -07:00
Linus Torvalds 6f11521267 Tracing comment fixes:
- Some bots have informed me that some of the ftrace functions kernel-doc
   has formatting issues.
 
 - Also, fix my snake instinct.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYXwAqxQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qpl+AQDn8QTG2RN9+jWVTNkCcWNNQDKPi5Ij
 QYquG/clrCVnwgEAtxUoC+ZAcnsadBm5gXwjzCi62aWfAb9apkTHb6fZAgc=
 =H3GU
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.15-rc6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing comment fixes from Steven Rostedt:

 - Some bots have informed me that some of the ftrace functions
   kernel-doc has formatting issues.

 - Also, fix my snake instinct.

* tag 'trace-v5.15-rc6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix misspelling of "missing"
  ftrace: Fix kernel-doc formatting issues
2021-10-29 10:41:07 -07:00
Linus Torvalds 75c7a6c1ca Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
 "Fix a build-time warning in x86/sm4"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: x86/sm4 - Fix invalid section entry size
2021-10-29 10:17:08 -07:00
Linus Torvalds 2c04d67ec1 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "11 patches.

  Subsystems affected by this patch series: mm (memcg, memory-failure,
  oom-kill, secretmem, vmalloc, hugetlb, damon, and tools), and ocfs2"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  tools/testing/selftests/vm/split_huge_page_test.c: fix application of sizeof to pointer
  mm/damon/core-test: fix wrong expectations for 'damon_split_regions_of()'
  mm: khugepaged: skip huge page collapse for special files
  mm, thp: bail out early in collapse_file for writeback page
  mm/vmalloc: fix numa spreading for large hash tables
  mm/secretmem: avoid letting secretmem_users drop to zero
  ocfs2: fix race between searching chunks and release journal_head from buffer_head
  mm/oom_kill.c: prevent a race between process_mrelease and exit_mmap
  mm: filemap: check if THP has hwpoisoned subpage for PMD page fault
  mm: hwpoison: remove the unnecessary THP check
  memcg: page_alloc: skip bulk allocator for __GFP_ACCOUNT
2021-10-29 10:03:07 -07:00
Alexandre Ghiti 54c5639d8f
riscv: Fix asan-stack clang build
Nathan reported that because KASAN_SHADOW_OFFSET was not defined in
Kconfig, it prevents asan-stack from getting disabled with clang even
when CONFIG_KASAN_STACK is disabled: fix this by defining the
corresponding config.

Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Fixes: 8ad8b72721 ("riscv: Add KASAN support")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-29 08:54:50 -07:00
Alexandre Ghiti cf11d01135
riscv: Do not re-populate shadow memory with kasan_populate_early_shadow
When calling this function, all the shadow memory is already populated
with kasan_early_shadow_pte which has PAGE_KERNEL protection.
kasan_populate_early_shadow write-protects the mapping of the range
of addresses passed in argument in zero_pte_populate, which actually
write-protects all the shadow memory mapping since kasan_early_shadow_pte
is used for all the shadow memory at this point. And then when using
memblock API to populate the shadow memory, the first write access to the
kernel stack triggers a trap. This becomes visible with the next commit
that contains a fix for asan-stack.

We already manually populate all the shadow memory in kasan_early_init
and we write-protect kasan_early_shadow_pte at the end of kasan_init
which makes the calls to kasan_populate_early_shadow superfluous so
we can remove them.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Fixes: e178d670f2 ("riscv/kasan: add KASAN_VMALLOC support")
Fixes: 8ad8b72721 ("riscv: Add KASAN support")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-29 08:53:42 -07:00
Pavel Begunkov 1d5f5ea7cb io-wq: remove worker to owner tw dependency
INFO: task iou-wrk-6609:6612 blocked for more than 143 seconds.
      Not tainted 5.15.0-rc5-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:iou-wrk-6609    state:D stack:27944 pid: 6612 ppid:  6526 flags:0x00004006
Call Trace:
 context_switch kernel/sched/core.c:4940 [inline]
 __schedule+0xb44/0x5960 kernel/sched/core.c:6287
 schedule+0xd3/0x270 kernel/sched/core.c:6366
 schedule_timeout+0x1db/0x2a0 kernel/time/timer.c:1857
 do_wait_for_common kernel/sched/completion.c:85 [inline]
 __wait_for_common kernel/sched/completion.c:106 [inline]
 wait_for_common kernel/sched/completion.c:117 [inline]
 wait_for_completion+0x176/0x280 kernel/sched/completion.c:138
 io_worker_exit fs/io-wq.c:183 [inline]
 io_wqe_worker+0x66d/0xc40 fs/io-wq.c:597
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295

io-wq worker may submit a task_work to the master task and upon
io_worker_exit() wait for the tw to get executed. The problem appears
when the master task is waiting in coredump.c:

468                     freezer_do_not_count();
469                     wait_for_completion(&core_state->startup);
470                     freezer_count();

Apparently having some dependency on children threads getting everything
stuck. Workaround it by cancelling the taks_work callback that causes it
before going into io_worker_exit() waiting.

p.s. probably a better option is to not submit tw elevating the refcount
in the first place, but let's leave this excercise for the future.

Cc: stable@vger.kernel.org
Reported-and-tested-by: syzbot+27d62ee6f256b186883e@syzkaller.appspotmail.com
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/142a716f4ed936feae868959059154362bfa8c19.1635509451.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-29 09:49:33 -06:00
Jens Axboe f75d118349 io_uring: harder fdinfo sq/cq ring iterating
The ring iteration is racy, which isn't necessarily a problem except it
can cause us to iterate the whole thing. That isn't desired or ideal,
and it can lead to excessive runtimes of reading fdinfo.

Cap the iteration at tail - head OR the ring size. While in there, clean
up the ring masking and just dump the raw values along with the masks.
That provides more useful debug info.

Fixes: 83f84356bc ("io_uring: add more uring info to fdinfo for debug")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-29 09:49:33 -06:00
Steven Rostedt (VMware) ddcf906fe5 tracing: Fix misspelling of "missing"
My snake instinct was on and I wrote "misssing" instead of "missing".

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-29 09:54:14 -04:00
Steven Rostedt (VMware) 6130722f11 ftrace: Fix kernel-doc formatting issues
Some functions had kernel-doc that used a comma instead of a hash to
separate the function name from the one line description.

Also, the "ftrace_is_dead()" had an incomplete description.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-29 09:52:23 -04:00
Shin'ichiro Kawasaki 15dfc662ef null_blk: Fix handling of submit_queues and poll_queues attributes
Commit 0a593fbbc2 ("null_blk: poll queue support") introduced the poll
queue feature to null_blk. After this change, null_blk device has both
submit queues and poll queues, and null_map_queues() callback maps the
both queues for corresponding hardware contexts. The commit also added
the device configuration attribute 'poll_queues' in same manner as the
existing attribute 'submit_queues'. These attributes allow to modify the
numbers of queues. However, when the new values are stored to these
attributes, the values are just handled only for the corresponding
queue. When number of submit_queue is updated, number of poll_queue is
not counted, or vice versa.  This caused inconsistent number of queues
and queue mapping and resulted in null-ptr-dereference. This failure was
observed in blktests block/029 and block/030.

To avoid the inconsistency, fix the attribute updates to care both
submit_queues and poll_queues. Introduce the helper function
nullb_update_nr_hw_queues() to handle stores to the both two attributes.
Add poll_queues field to the struct nullb_device to track the number in
same manner as submit_queues. Add two more fields prev_submit_queues and
prev_poll_queues to keep the previous values before change. In case the
block layer failed to update the nr_hw_queues, refer the previous values
in null_map_queues() to map queues in same manner as before change.

Also add poll_queues value checks in nullb_update_nr_hw_queues() and
null_validate_conf(). They ensure the poll_queues value of each device
is within the range from 1 to module parameter value of poll_queues.

Fixes: 0a593fbbc2 ("null_blk: poll queue support")
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20211029103926.845635-1-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-29 06:55:39 -06:00
Jiapeng Chong df75db1fc1 block: ataflop: Fix warning comparing pointer to 0
Fix the following coccicheck warning:

./drivers/block/ataflop.c:1464:20-21: WARNING comparing pointer to 0.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Link: https://lore.kernel.org/r/1635501029-81391-1-git-send-email-jiapeng.chong@linux.alibaba.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-29 06:54:39 -06:00
John Garry 9b84c629c9 blk-mq-debugfs: Show active requests per queue for shared tags
Currently we show the hctx.active value for the per-hctx "active" file.

However this is not maintained for shared tags, and we instead keep a
record of the number active requests per request queue - see commit
f1b49fdc1c ("blk-mq: Record active_queues_shared_sbitmap per tag_set for
when using shared sbitmap).

Change for the case of shared tags to show the active requests per request
queue by using __blk_mq_active_requests() helper.

Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1635496823-33515-1-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-29 06:53:34 -06:00
Qing Wang 1b86db5f4e bcache: replace snprintf in show functions with sysfs_emit
coccicheck complains about the use of snprintf() in sysfs show functions.

Fix the following coccicheck warning:
drivers/md/bcache/sysfs.h:54:12-20: WARNING: use scnprintf or sprintf.

Implement sysfs_print() by sysfs_emit() and remove snprint() since no one
uses it any more.

Suggested-by: Coly Li <colyli@suse.de>
Signed-off-by: Qing Wang <wangqing@vivo.com>
Signed-off-by: Coly Li <colyli@suse.de>
Link: https://lore.kernel.org/r/20211029060930.119923-3-colyli@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-29 06:43:21 -06:00
Coly Li cf2197ca4b bcache: move uapi header bcache.h to bcache code directory
The header file include/uapi/linux/bcache.h is not really a user space
API heaer. This file defines the ondisk format of bcache internal meta
data but no one includes it from user space, bcache-tools has its own
copy of this header with minor modification.

Therefore, this patch moves include/uapi/linux/bcache.h to bcache code
directory as drivers/md/bcache/bcache_ondisk.h.

Suggested-by: Arnd Bergmann <arnd@kernel.org>
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Coly Li <colyli@suse.de>
Link: https://lore.kernel.org/r/20211029060930.119923-2-colyli@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-29 06:43:21 -06:00
David Sterba ccaa66c8dd Revert "btrfs: compression: drop kmap/kunmap from lzo"
This reverts commit 8c945d32e6.

The kmaps in compression code are still needed and cause crashes on
32bit machines (ARM, x86). Reproducible eg. by running fstest btrfs/004
with enabled LZO or ZSTD compression.

The revert does not apply cleanly due to changes in a6e66e6f8c
("btrfs: rework lzo_decompress_bio() to make it subpage compatible")
that reworked the page iteration so the revert is done to be equivalent
to the original code.

Link: https://lore.kernel.org/all/CAJCQCtT+OuemovPO7GZk8Y8=qtOObr0XTDp8jh4OHD6y84AFxw@mail.gmail.com/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=214839
Tested-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-29 13:25:43 +02:00
David Sterba 55276e14df Revert "btrfs: compression: drop kmap/kunmap from zlib"
This reverts commit 696ab562e6.

The kmaps in compression code are still needed and cause crashes on
32bit machines (ARM, x86). Reproducible eg. by running fstest btrfs/004
with enabled LZO or ZSTD compression.

Link: https://lore.kernel.org/all/CAJCQCtT+OuemovPO7GZk8Y8=qtOObr0XTDp8jh4OHD6y84AFxw@mail.gmail.com/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=214839
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-29 13:03:05 +02:00
David Sterba 56ee254d23 Revert "btrfs: compression: drop kmap/kunmap from zstd"
This reverts commit bbaf9715f3.

The kmaps in compression code are still needed and cause crashes on
32bit machines (ARM, x86). Reproducible eg. by running fstest btrfs/004
with enabled LZO or ZSTD compression.

Example stacktrace with ZSTD on a 32bit ARM machine:

  Unable to handle kernel NULL pointer dereference at virtual address 00000000
  pgd = c4159ed3
  [00000000] *pgd=00000000
  Internal error: Oops: 5 [#1] PREEMPT SMP ARM
  Modules linked in:
  CPU: 0 PID: 210 Comm: kworker/u2:3 Not tainted 5.14.0-rc79+ #12
  Hardware name: Allwinner sun4i/sun5i Families
  Workqueue: btrfs-delalloc btrfs_work_helper
  PC is at mmiocpy+0x48/0x330
  LR is at ZSTD_compressStream_generic+0x15c/0x28c

  (mmiocpy) from [<c0629648>] (ZSTD_compressStream_generic+0x15c/0x28c)
  (ZSTD_compressStream_generic) from [<c06297dc>] (ZSTD_compressStream+0x64/0xa0)
  (ZSTD_compressStream) from [<c049444c>] (zstd_compress_pages+0x170/0x488)
  (zstd_compress_pages) from [<c0496798>] (btrfs_compress_pages+0x124/0x12c)
  (btrfs_compress_pages) from [<c043c068>] (compress_file_range+0x3c0/0x834)
  (compress_file_range) from [<c043c4ec>] (async_cow_start+0x10/0x28)
  (async_cow_start) from [<c0475c3c>] (btrfs_work_helper+0x100/0x230)
  (btrfs_work_helper) from [<c014ef68>] (process_one_work+0x1b4/0x418)
  (process_one_work) from [<c014f210>] (worker_thread+0x44/0x524)
  (worker_thread) from [<c0156aa4>] (kthread+0x180/0x1b0)
  (kthread) from [<c0100150>]

Link: https://lore.kernel.org/all/CAJCQCtT+OuemovPO7GZk8Y8=qtOObr0XTDp8jh4OHD6y84AFxw@mail.gmail.com/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=214839
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-29 13:02:50 +02:00
David Yang 9c7516d669 tools/testing/selftests/vm/split_huge_page_test.c: fix application of sizeof to pointer
The coccinelle check report:

  ./tools/testing/selftests/vm/split_huge_page_test.c:344:36-42:
  ERROR: application of sizeof to pointer

Use "strlen" to fix it.

Link: https://lkml.kernel.org/r/20211012030116.184027-1-davidcomponentone@gmail.com
Signed-off-by: David Yang <davidcomponentone@gmail.com>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-28 17:18:55 -07:00
SeongJae Park 2e014660b3 mm/damon/core-test: fix wrong expectations for 'damon_split_regions_of()'
Kunit test cases for 'damon_split_regions_of()' expects the number of
regions after calling the function will be same to their request
('nr_sub').  However, the requested number is just an upper-limit,
because the function randomly decides the size of each sub-region.

This fixes the wrong expectation.

Link: https://lkml.kernel.org/r/20211028090628.14948-1-sj@kernel.org
Fixes: 17ccae8bb5 ("mm/damon: add kunit tests")
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-28 17:18:55 -07:00
Yang Shi a4aeaa06d4 mm: khugepaged: skip huge page collapse for special files
The read-only THP for filesystems will collapse THP for files opened
readonly and mapped with VM_EXEC.  The intended usecase is to avoid TLB
misses for large text segments.  But it doesn't restrict the file types
so a THP could be collapsed for a non-regular file, for example, block
device, if it is opened readonly and mapped with EXEC permission.  This
may cause bugs, like [1] and [2].

This is definitely not the intended usecase, so just collapse THP for
regular files in order to close the attack surface.

[shy828301@gmail.com: fix vm_file check [3]]

Link: https://lore.kernel.org/lkml/CACkBjsYwLYLRmX8GpsDpMthagWOjWWrNxqY6ZLNQVr6yx+f5vA@mail.gmail.com/ [1]
Link: https://lore.kernel.org/linux-mm/000000000000c6a82505ce284e4c@google.com/ [2]
Link: https://lkml.kernel.org/r/CAHbLzkqTW9U3VvTu1Ki5v_cLRC9gHW+znBukg_ycergE0JWj-A@mail.gmail.com [3]
Link: https://lkml.kernel.org/r/20211027195221.3825-1-shy828301@gmail.com
Fixes: 99cb0dbd47 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Yang Shi <shy828301@gmail.com>
Reported-by: Hao Sun <sunhao.th@gmail.com>
Reported-by: syzbot+aae069be1de40fb11825@syzkaller.appspotmail.com
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Andrea Righi <andrea.righi@canonical.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-28 17:18:55 -07:00
Rongwei Wang 74c42e1baa mm, thp: bail out early in collapse_file for writeback page
Currently collapse_file does not explicitly check PG_writeback, instead,
page_has_private and try_to_release_page are used to filter writeback
pages.  This does not work for xfs with blocksize equal to or larger
than pagesize, because in such case xfs has no page->private.

This makes collapse_file bail out early for writeback page.  Otherwise,
xfs end_page_writeback will panic as follows.

  page:fffffe00201bcc80 refcount:0 mapcount:0 mapping:ffff0003f88c86a8 index:0x0 pfn:0x84ef32
  aops:xfs_address_space_operations [xfs] ino:30000b7 dentry name:"libtest.so"
  flags: 0x57fffe0000008027(locked|referenced|uptodate|active|writeback)
  raw: 57fffe0000008027 ffff80001b48bc28 ffff80001b48bc28 ffff0003f88c86a8
  raw: 0000000000000000 0000000000000000 00000000ffffffff ffff0000c3e9a000
  page dumped because: VM_BUG_ON_PAGE(((unsigned int) page_ref_count(page) + 127u <= 127u))
  page->mem_cgroup:ffff0000c3e9a000
  ------------[ cut here ]------------
  kernel BUG at include/linux/mm.h:1212!
  Internal error: Oops - BUG: 0 [#1] SMP
  Modules linked in:
  BUG: Bad page state in process khugepaged  pfn:84ef32
   xfs(E)
  page:fffffe00201bcc80 refcount:0 mapcount:0 mapping:0 index:0x0 pfn:0x84ef32
   libcrc32c(E) rfkill(E) aes_ce_blk(E) crypto_simd(E) ...
  CPU: 25 PID: 0 Comm: swapper/25 Kdump: loaded Tainted: ...
  pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--)
  Call trace:
    end_page_writeback+0x1c0/0x214
    iomap_finish_page_writeback+0x13c/0x204
    iomap_finish_ioend+0xe8/0x19c
    iomap_writepage_end_bio+0x38/0x50
    bio_endio+0x168/0x1ec
    blk_update_request+0x278/0x3f0
    blk_mq_end_request+0x34/0x15c
    virtblk_request_done+0x38/0x74 [virtio_blk]
    blk_done_softirq+0xc4/0x110
    __do_softirq+0x128/0x38c
    __irq_exit_rcu+0x118/0x150
    irq_exit+0x1c/0x30
    __handle_domain_irq+0x8c/0xf0
    gic_handle_irq+0x84/0x108
    el1_irq+0xcc/0x180
    arch_cpu_idle+0x18/0x40
    default_idle_call+0x4c/0x1a0
    cpuidle_idle_call+0x168/0x1e0
    do_idle+0xb4/0x104
    cpu_startup_entry+0x30/0x9c
    secondary_start_kernel+0x104/0x180
  Code: d4210000 b0006161 910c8021 94013f4d (d4210000)
  ---[ end trace 4a88c6a074082f8c ]---
  Kernel panic - not syncing: Oops - BUG: Fatal exception in interrupt

Link: https://lkml.kernel.org/r/20211022023052.33114-1-rongwei.wang@linux.alibaba.com
Fixes: 99cb0dbd47 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
Suggested-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Song Liu <song@kernel.org>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-28 17:18:55 -07:00
Chen Wandun ffb29b1c25 mm/vmalloc: fix numa spreading for large hash tables
Eric Dumazet reported a strange numa spreading info in [1], and found
commit 121e6f3258 ("mm/vmalloc: hugepage vmalloc mappings") introduced
this issue [2].

Dig into the difference before and after this patch, page allocation has
some difference:

before:
  alloc_large_system_hash
    __vmalloc
      __vmalloc_node(..., NUMA_NO_NODE, ...)
        __vmalloc_node_range
          __vmalloc_area_node
            alloc_page /* because NUMA_NO_NODE, so choose alloc_page branch */
              alloc_pages_current
                alloc_page_interleave /* can be proved by print policy mode */

after:
  alloc_large_system_hash
    __vmalloc
      __vmalloc_node(..., NUMA_NO_NODE, ...)
        __vmalloc_node_range
          __vmalloc_area_node
            alloc_pages_node /* choose nid by nuam_mem_id() */
              __alloc_pages_node(nid, ....)

So after commit 121e6f3258 ("mm/vmalloc: hugepage vmalloc mappings"),
it will allocate memory in current node instead of interleaving allocate
memory.

Link: https://lore.kernel.org/linux-mm/CANn89iL6AAyWhfxdHO+jaT075iOa3XcYn9k6JJc7JR2XYn6k_Q@mail.gmail.com/ [1]
Link: https://lore.kernel.org/linux-mm/CANn89iLofTR=AK-QOZY87RdUZENCZUT4O6a0hvhu3_EwRMerOg@mail.gmail.com/ [2]
Link: https://lkml.kernel.org/r/20211021080744.874701-2-chenwandun@huawei.com
Fixes: 121e6f3258 ("mm/vmalloc: hugepage vmalloc mappings")
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reported-by: Eric Dumazet <edumazet@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-28 17:18:55 -07:00
Kees Cook 855d44434f mm/secretmem: avoid letting secretmem_users drop to zero
Quoting Dmitry:
 "refcount_inc() needs to be done before fd_install(). After
  fd_install() finishes, the fd can be used by userspace and
  we can have secret data in memory before the refcount_inc().

  A straightforward misuse where a user will predict the returned
  fd in another thread before the syscall returns and will use it
  to store secret data is somewhat dubious because such a user just
  shoots themself in the foot.

  But a more interesting misuse would be to close the predicted fd
  and decrement the refcount before the corresponding refcount_inc,
  this way one can briefly drop the refcount to zero while there are
  other users of secretmem."

Move fd_install() after refcount_inc().

Link: https://lkml.kernel.org/r/20211021154046.880251-1-keescook@chromium.org
Link: https://lore.kernel.org/lkml/CACT4Y+b1sW6-Hkn8HQYw_SsT7X3tp-CJNh2ci0wG3ZnQz9jjig@mail.gmail.com
Fixes: 9a436f8ff6 ("PM: hibernate: disable when there are active secretmem users")
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Jordy Zomer <jordy@pwning.systems>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-28 17:18:55 -07:00
Gautham Ananthakrishna 6f1b228529 ocfs2: fix race between searching chunks and release journal_head from buffer_head
Encountered a race between ocfs2_test_bg_bit_allocatable() and
jbd2_journal_put_journal_head() resulting in the below vmcore.

  PID: 106879  TASK: ffff880244ba9c00  CPU: 2   COMMAND: "loop3"
  Call trace:
    panic
    oops_end
    no_context
    __bad_area_nosemaphore
    bad_area_nosemaphore
    __do_page_fault
    do_page_fault
    page_fault
      [exception RIP: ocfs2_block_group_find_clear_bits+316]
    ocfs2_block_group_find_clear_bits [ocfs2]
    ocfs2_cluster_group_search [ocfs2]
    ocfs2_search_chain [ocfs2]
    ocfs2_claim_suballoc_bits [ocfs2]
    __ocfs2_claim_clusters [ocfs2]
    ocfs2_claim_clusters [ocfs2]
    ocfs2_local_alloc_slide_window [ocfs2]
    ocfs2_reserve_local_alloc_bits [ocfs2]
    ocfs2_reserve_clusters_with_limit [ocfs2]
    ocfs2_reserve_clusters [ocfs2]
    ocfs2_lock_refcount_allocators [ocfs2]
    ocfs2_make_clusters_writable [ocfs2]
    ocfs2_replace_cow [ocfs2]
    ocfs2_refcount_cow [ocfs2]
    ocfs2_file_write_iter [ocfs2]
    lo_rw_aio
    loop_queue_work
    kthread_worker_fn
    kthread
    ret_from_fork

When ocfs2_test_bg_bit_allocatable() called bh2jh(bg_bh), the
bg_bh->b_private NULL as jbd2_journal_put_journal_head() raced and
released the jounal head from the buffer head.  Needed to take bit lock
for the bit 'BH_JournalHead' to fix this race.

Link: https://lkml.kernel.org/r/1634820718-6043-1-git-send-email-gautham.ananthakrishna@oracle.com
Signed-off-by: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: <rajesh.sivaramasubramaniom@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-28 17:18:55 -07:00
Suren Baghdasaryan 337546e83f mm/oom_kill.c: prevent a race between process_mrelease and exit_mmap
Race between process_mrelease and exit_mmap, where free_pgtables is
called while __oom_reap_task_mm is in progress, leads to kernel crash
during pte_offset_map_lock call.  oom-reaper avoids this race by setting
MMF_OOM_VICTIM flag and causing exit_mmap to take and release
mmap_write_lock, blocking it until oom-reaper releases mmap_read_lock.

Reusing MMF_OOM_VICTIM for process_mrelease would be the simplest way to
fix this race, however that would be considered a hack.  Fix this race
by elevating mm->mm_users and preventing exit_mmap from executing until
process_mrelease is finished.  Patch slightly refactors the code to
adapt for a possible mmget_not_zero failure.

This fix has considerable negative impact on process_mrelease
performance and will likely need later optimization.

Link: https://lkml.kernel.org/r/20211022014658.263508-1-surenb@google.com
Fixes: 884a7e5964 ("mm: introduce process_mrelease system call")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Christian Brauner <christian@brauner.io>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-28 17:18:55 -07:00
Yang Shi eac96c3efd mm: filemap: check if THP has hwpoisoned subpage for PMD page fault
When handling shmem page fault the THP with corrupted subpage could be
PMD mapped if certain conditions are satisfied.  But kernel is supposed
to send SIGBUS when trying to map hwpoisoned page.

There are two paths which may do PMD map: fault around and regular
fault.

Before commit f9ce0be71d ("mm: Cleanup faultaround and finish_fault()
codepaths") the thing was even worse in fault around path.  The THP
could be PMD mapped as long as the VMA fits regardless what subpage is
accessed and corrupted.  After this commit as long as head page is not
corrupted the THP could be PMD mapped.

In the regular fault path the THP could be PMD mapped as long as the
corrupted page is not accessed and the VMA fits.

This loophole could be fixed by iterating every subpage to check if any
of them is hwpoisoned or not, but it is somewhat costly in page fault
path.

So introduce a new page flag called HasHWPoisoned on the first tail
page.  It indicates the THP has hwpoisoned subpage(s).  It is set if any
subpage of THP is found hwpoisoned by memory failure and after the
refcount is bumped successfully, then cleared when the THP is freed or
split.

The soft offline path doesn't need this since soft offline handler just
marks a subpage hwpoisoned when the subpage is migrated successfully.
But shmem THP didn't get split then migrated at all.

Link: https://lkml.kernel.org/r/20211020210755.23964-3-shy828301@gmail.com
Fixes: 800d8c63b2 ("shmem: add huge pages support")
Signed-off-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-28 17:18:55 -07:00
Yang Shi c7cb42e944 mm: hwpoison: remove the unnecessary THP check
When handling THP hwpoison checked if the THP is in allocation or free
stage since hwpoison may mistreat it as hugetlb page.  After commit
415c64c145 ("mm/memory-failure: split thp earlier in memory error
handling") the problem has been fixed, so this check is no longer
needed.  Remove it.  The side effect of the removal is hwpoison may
report unsplit THP instead of unknown error for shmem THP.  It seems not
like a big deal.

The following patch "mm: filemap: check if THP has hwpoisoned subpage
for PMD page fault" depends on this, which fixes shmem THP with
hwpoisoned subpage(s) are mapped PMD wrongly.  So this patch needs to be
backported to -stable as well.

Link: https://lkml.kernel.org/r/20211020210755.23964-2-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-28 17:18:55 -07:00