-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmAUXQsQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgppO4EAClcqoneAuhT4UvRVNxblXPhPaoC69aNgXd
s+34uQSCqeWrWIAokfKp8bh3kyRqe00591auA7DwtwNqGpWuIECX8o9QvROEkuxv
0o4JFGMTHOJKP1W79Oy3RpF5oee6rMMOQN7EFL272p2xd8NRCP33c4fKvJRz+DDE
0kCcZhVjca0nZ+9OJC+WAlV+dit3azCAKSp7cItJsdOgZL74ZcGECm0pA8RpStyi
tQrUr2yiHLkm1lcOYfid0fG2/5a4vAGZQav+EshOWYw9UGeMquq/aqPuZZtEUjKe
oEECACfJ9cWErsi1CirIk5j5RKHOHmFSG3kRAmyvFB4f3YDGYxerI7eodWjNA0d5
38wW96sWuV4l0ShPmD3jGWIDTTcDZh4nEImCObf5YJFbr2fQXofWVWseIyo0zG8Y
zDa1N/M7XgkrScX8OF33NC1uv/oExhHA7jXuQN6mRBESYjcCrH2Lf6mXAA2C8u4T
z1RaG7ckRXGSbV3ol1ROrHj0RTXQ3zeIHj3yMRU8TKH0z6s+ob46D2PZCLi6cLvI
IuELhzKsS1EzMSVsYk9/AegynWFjVCRJoVUVxTsrxfGEF7attwmur3lOAjbZwSWb
jXlRbrkgBL1Pwbjg8AODEoq0jJgVM/S/3fG2rpcYLwwYC+FQ73/K+URmEuMsqkFC
GrYllTSMFg==
=hb7W
-----END PGP SIGNATURE-----
Merge tag 'block-5.11-2021-01-29' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"All over the place fixes for this release:
- blk-cgroup iteration teardown resched fix (Baolin)
- NVMe pull request from Christoph:
- add another Write Zeroes quirk (Chaitanya Kulkarni)
- handle a no path available corner case (Daniel Wagner)
- use the proper RCU aware list_add helper (Chao Leng)
- bcache regression fix (Coly)
- bdev->bd_size_lock IRQ fix. This will be fixed in drivers for 5.12,
but for now, we'll make it IRQ safe (Damien)
- null_blk zoned init fix (Damien)
- add_partition() error handling fix (Dinghao)
- s390 dasd kobject fix (Jan)
- nbd fix for freezing queue while adding connections (Josef)
- tag queueing regression fix (Ming)
- revert of a patch that inadvertently meant that we regressed write
performance on raid (Maxim)"
* tag 'block-5.11-2021-01-29' of git://git.kernel.dk/linux-block:
null_blk: cleanup zoned mode initialization
nvme-core: use list_add_tail_rcu instead of list_add_tail for nvme_init_ns_head
nvme-multipath: Early exit if no path is available
nvme-pci: add the DISABLE_WRITE_ZEROES quirk for a SPCC device
bcache: only check feature sets when sb->version >= BCACHE_SB_VERSION_CDEV_WITH_FEATURES
block: fix bd_size_lock use
blk-cgroup: Use cond_resched() when destroy blkgs
Revert "block: simplify set_init_blocksize" to regain lost performance
nbd: freeze the queue while we're adding connections
s390/dasd: Fix inconsistent kobject removal
block: Fix an error handling in add_partition
blk-mq: test QUEUE_FLAG_HCTX_ACTIVE for sbitmap_shared in hctx_may_queue
For super block version < BCACHE_SB_VERSION_CDEV_WITH_FEATURES, it
doesn't make sense to check the feature sets. This patch checks
super block version in bch_has_feature_* routines, if the version
doesn't have feature sets yet, returns 0 (false) to the caller.
Fixes: 5342fd4255 ("bcache: set bcache device into read-only mode for BCH_FEATURE_INCOMPAT_OBSO_LARGE_BUCKET")
Fixes: ffa4703275 ("bcache: add bucket_size_hi into struct cache_sb_disk for large bucket")
Cc: stable@vger.kernel.org # 5.9+
Reported-and-tested-by: Bockholdt Arne <a.bockholdt@precitec-optronik.de>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmANwMUQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpr1PEACPVkVvLlxH2YfnBNykDn+tV+goOs37oJkB
0XR+flcZAHHycLvWnix4AVca4ebTU4diwrFMM4gB01mzzoPYhicHefNuhPV4abYO
qq6SCcpZkBfbjYqfB+Fmo0823UXaGUqX/oaxbYwePjkMbFjW6kQEThEHGH07CmWA
s5VfOSn695hXLUBpKwsj5m88NohP4tSZMm+VE2RvycdVt2uzJuga1aDDAFPfZFRA
YHQyhIEUClYl3eC3Yo5E32nBezrRCJtumRZKmQHMCBXGQ2Z9OfgalD8wz4zBb/D0
ypzr68M27coQIQ9qNBruHuOfnjvwy4jwB0Eci7bGHfKUiVwUDiLD8TVqnQzcwxR6
VNm4RbCEazsfZ33ztk7iiKHijesJ5wHlaDNlL1xBxpNajqOVv3T65kmaETarhUAQ
h/EgHUFzYrzy9Y9ZpuClAE3LSk9gV3EzFuWmgSvwSY99TPNCpkG6raXTdmRzj6fV
ZFz8L7AlXnbBjREfK8x4lB3W1T16zAQpiPrGYIfQsSIAeBbbyxGIo/H5wdCZwQBj
0k5+UoxWT8oYK2C6xdK7elDN6jT4BaC4p5IZBhlVXCIh+VUD5+Ol6/2pFRr6qRWt
giXITDzuVMECTtj8gNhf/P209kG9eSglYfXFRr334GFirSMfN8a22BB3tabp+dDF
GzCHw+zZuA==
=Bblh
-----END PGP SIGNATURE-----
Merge tag 'block-5.11-2021-01-24' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- NVMe pull request from Christoph:
- fix a status code in nvmet (Chaitanya Kulkarni)
- avoid double completions in nvme-rdma/nvme-tcp (Chao Leng)
- fix the CMB support to cope with NVMe 1.4 controllers (Klaus Jensen)
- fix PRINFO handling in the passthrough ioctl (Revanth Rajashekar)
- fix a double DMA unmap in nvme-pci
- lightnvm error path leak fix (Pan)
- MD pull request from Song:
- Flush request fix (Xiao)
* tag 'block-5.11-2021-01-24' of git://git.kernel.dk/linux-block:
lightnvm: fix memory leak when submit fails
nvme-pci: fix error unwind in nvme_map_data
nvme-pci: refactor nvme_unmap_data
md: Set prev_flush_start and flush_bio in an atomic way
nvmet: set right status on error in id-ns handler
nvme-pci: allow use of cmb on v1.4 controllers
nvme-tcp: avoid request double completion for concurrent nvme_tcp_timeout
nvme-rdma: avoid request double completion for concurrent nvme_rdma_timeout
nvme: check the PRINFO bit before deciding the host buffer length
This reverts commit
644bda6f34 ("dm table: fall back to getting device using name_to_dev_t()")
dm_get_dev_t() is just used to convert an arbitrary 'path' string
into a dev_t. It doesn't presume that the device is present; that
check will be done later, as the only caller is dm_get_device(),
which does a dm_get_table_device() later on, which will properly
open the device.
So if the path string already _is_ in major:minor representation
we can convert it directly, avoiding a recursion into the filesystem
to lookup the block device.
This avoids a hang in multipath_message() when the filesystem is
inaccessible.
Fixes: 644bda6f34 ("dm table: fall back to getting device using name_to_dev_t()")
Cc: stable@vger.kernel.org
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
In commit d68b29584c ("dm crypt: use GFP_ATOMIC when allocating
crypto requests from softirq") code was incorrectly copy and pasted
from crypt_alloc_req_skcipher()'s crypto request allocation code to
crypt_alloc_req_aead(). It is OK from runtime perspective as both
simple encryption request pointer and AEAD request pointer are part of
a union, but may confuse code reviewers.
Fixes: d68b29584c ("dm crypt: use GFP_ATOMIC when allocating crypto requests from softirq")
Cc: stable@vger.kernel.org # v5.9+
Reported-by: Pavel Machek <pavel@denx.de>
Signed-off-by: Ignat Korchagin <ignat@cloudflare.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Otherwise a malicious user could (ab)use the "recalculate" feature
that makes dm-integrity calculate the checksums in the background
while the device is already usable. When the system restarts before all
checksums have been calculated, the calculation continues where it was
interrupted even if the recalculate feature is not requested the next
time the dm device is set up.
Disable recalculating if we use internal_hash or journal_hash with a
key (e.g. HMAC) and we don't have the "legacy_recalculate" flag.
This may break activation of a volume, created by an older kernel,
that is not yet fully recalculated -- if this happens, the user should
add the "legacy_recalculate" flag to constructor parameters.
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: Daniel Glockner <dg@emlix.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Recalculate can only be specified with internal_hash.
Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
One customer reports a crash problem which causes by flush request. It
triggers a warning before crash.
/* new request after previous flush is completed */
if (ktime_after(req_start, mddev->prev_flush_start)) {
WARN_ON(mddev->flush_bio);
mddev->flush_bio = bio;
bio = NULL;
}
The WARN_ON is triggered. We use spin lock to protect prev_flush_start and
flush_bio in md_flush_request. But there is no lock protection in
md_submit_flush_data. It can set flush_bio to NULL first because of
compiler reordering write instructions.
For example, flush bio1 sets flush bio to NULL first in
md_submit_flush_data. An interrupt or vmware causing an extended stall
happen between updating flush_bio and prev_flush_start. Because flush_bio
is NULL, flush bio2 can get the lock and submit to underlayer disks. Then
flush bio1 updates prev_flush_start after the interrupt or extended stall.
Then flush bio3 enters in md_flush_request. The start time req_start is
behind prev_flush_start. The flush_bio is not NULL(flush bio2 hasn't
finished). So it can trigger the WARN_ON now. Then it calls INIT_WORK
again. INIT_WORK() will re-initialize the list pointers in the
work_struct, which then can result in a corrupted work list and the
work_struct queued a second time. With the work list corrupted, it can
lead in invalid work items being used and cause a crash in
process_one_work.
We need to make sure only one flush bio can be handled at one same time.
So add spin lock in md_submit_flush_data to protect prev_flush_start and
flush_bio in an atomic way.
Reviewed-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
- Select missing Kconfig dependencies for DM integrity and zoned
targets.
- 4 fixes for DM crypt target's support to optionally bypass
kcryptd workqueues.
- Fix DM snapshot merge supports missing data flushes before
committing metadata.
- Fix DM integrity data device flushing when external metadata is
used.
- Fix DM integrity's maximum number of supported constructor arguments
that user can request when creating an integrity device.
- Eliminate DM core ioctl logging noise when an ioctl is issued
without required CAP_SYS_RAWIO permission.
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEJfWUX4UqZ4x1O2wixSPxCi2dA1oFAmACJwsTHHNuaXR6ZXJA
cmVkaGF0LmNvbQAKCRDFI/EKLZ0DWr93B/sFpKbCjRxZhGdOz8u2UIV9jGoSzAaZ
3qdSTruCp1HMTsz0J/+bFBTMQKKap+8u8kzhCOLAqtZgEz13P0D3NyDUGiCxj07O
vPZJQ3COrClknq5byKws/aOKcnbr25odk6Zq+gFmk47pdNFBMkzNMf0IbdVJsj19
d0tnTRkIiycC8hnXOr/tHufnmERrw9nZpq3cLEHT6vaXc4FxcYXbo4UEMIwKjgcB
AnsRrqpqKyayG+G/SEs01JM9eYznWRPqf3L03wTWfieAsFjDoIp/ek8iXWjEwLtZ
BnhxFwodWvycFb9ITaxlFgfyiBSgU1f3tsIYjU4MHCSFI6u5b83y57Dl
=FnV7
-----END PGP SIGNATURE-----
Merge tag 'for-5.11/dm-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- Fix DM-raid's raid1 discard limits so discards work.
- Select missing Kconfig dependencies for DM integrity and zoned
targets.
- Four fixes for DM crypt target's support to optionally bypass kcryptd
workqueues.
- Fix DM snapshot merge supports missing data flushes before committing
metadata.
- Fix DM integrity data device flushing when external metadata is used.
- Fix DM integrity's maximum number of supported constructor arguments
that user can request when creating an integrity device.
- Eliminate DM core ioctl logging noise when an ioctl is issued without
required CAP_SYS_RAWIO permission.
* tag 'for-5.11/dm-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm crypt: defer decryption to a tasklet if interrupts disabled
dm integrity: fix the maximum number of arguments
dm crypt: do not call bio_endio() from the dm-crypt tasklet
dm integrity: fix flush with external metadata device
dm: eliminate potential source of excessive kernel log noise
dm snapshot: flush merged data before committing metadata
dm crypt: use GFP_ATOMIC when allocating crypto requests from softirq
dm crypt: do not wait for backlogged crypto request completion in softirq
dm zoned: select CONFIG_CRC32
dm integrity: select CRYPTO_SKCIPHER
dm raid: fix discard limits for raid1
Advance the maximum number of arguments from 9 to 15 to account for
all potential feature flags that may be supplied.
Linux 4.19 added "meta_device"
(356d9d52e1) and "recalculate"
(a3fcf72531) flags.
Commit 468dfca38b added
"sectors_per_bit" and "bitmap_flush_interval".
Commit 84597a44a9 added
"allow_discards".
And the commit d537858ac8 added
"fix_padding".
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Sometimes, when dm-crypt executes decryption in a tasklet, we may get
"BUG: KASAN: use-after-free in tasklet_action_common.constprop..."
with a kasan-enabled kernel.
When the decryption fully completes in the tasklet, dm-crypt will call
bio_endio(), which in turn will call clone_endio() from dm.c core code. That
function frees the resources associated with the bio, including per bio private
structures. For dm-crypt it will free the current struct dm_crypt_io, which
contains our tasklet object, causing use-after-free, when the tasklet is being
dequeued by the kernel.
To avoid this, do not call bio_endio() from the current tasklet context, but
delay its execution to the dm-crypt IO workqueue.
Fixes: 39d42fa96b ("dm crypt: add flags to optionally bypass kcryptd workqueues")
Cc: <stable@vger.kernel.org> # v5.9+
Signed-off-by: Ignat Korchagin <ignat@cloudflare.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl/7KA0QHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpn6WEACeUa97qyzm7G8/E5ejBL6lXSTRXNc8qa+h
YCdrDltkqs6OHAuEyUCwGw3zPmb7fp4M5RLZ/Dp9EtMwld45HfoN6mpRe0+i4U96
iAkHMNUo6ytp3wXX1XKgZ0FhcSOSwkQK8CMzmLPn+pxkDYzQPFg38AUISPpoDA/L
YNh4tEiHHd5oprHIzludE00m2i1oYNrBcmUe27sKxR0mak0kEJtxr4cXLrqBtN3k
9C31A0gstCINSHmQPAcRvFerDxDM0WPYQ7K6UEXfkCfbyf6i+1eG/qLUwUCdm9MD
Rjot6dXzQ2LzqJbaAZndjJRDRZx2xpC2TNlNaBjYzSOC6AXSY0MKiZBCnH/i/OoZ
f0Bq/k7LVeMbyu02cgIis4DPLabfG+XQUOniu4HQTrzK8+neApAlCwINc73cvQOb
hBS+LfUVqP6K6g3oVGSvqG01wj2HK69SWMNKTr9GZ3GIqrcWYtA/JnqFfTE7/KwC
H7rkPL8i3+NBXmjjz6hm8hx3MrnekKJpsdCBicm9OOYqJRbkGVjoUYeDFz5MElfp
k71u2WDQ81aiqfWajsJkZaUFxZgUrRzuWeyBZiQQP9kJEMzUUiDSg4K+0WJhk5bO
Y0EX0sdCz8k9IBKfi2+FcF5dYj3RDolALmBDrrcfchTW0h7vxMpn4rr/ueN7gViz
rW/Gj9pRsA==
=CClj
-----END PGP SIGNATURE-----
Merge tag 'block-5.11-2021-01-10' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- Missing CRC32 selections (Arnd)
- Fix for a merge window regression with bdev inode init (Christoph)
- bcache fixes
- rnbd fixes
- NVMe pull request from Christoph:
- fix a race in the nvme-tcp send code (Sagi Grimberg)
- fix a list corruption in an nvme-rdma error path (Israel Rukshin)
- avoid a possible double fetch in nvme-pci (Lalithambika Krishnakumar)
- add the susystem NQN quirk for a Samsung driver (Gopal Tiwari)
- fix two compiler warnings in nvme-fcloop (James Smart)
- don't call sleeping functions from irq context in nvme-fc (James Smart)
- remove an unused argument (Max Gurtovoy)
- remove unused exports (Minwoo Im)
- Use-after-free fix for partition iteration (Ming)
- Missing blk-mq debugfs flag annotation (John)
- Bdev freeze regression fix (Satya)
- blk-iocost NULL pointer deref fix (Tejun)
* tag 'block-5.11-2021-01-10' of git://git.kernel.dk/linux-block: (26 commits)
bcache: set bcache device into read-only mode for BCH_FEATURE_INCOMPAT_OBSO_LARGE_BUCKET
bcache: introduce BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE for large bucket
bcache: check unsupported feature sets for bcache register
bcache: fix typo from SUUP to SUPP in features.h
bcache: set pdev_set_uuid before scond loop iteration
blk-mq-debugfs: Add decode for BLK_MQ_F_TAG_HCTX_SHARED
block/rnbd-clt: avoid module unload race with close confirmation
block/rnbd: Adding name to the Contributors List
block/rnbd-clt: Fix sg table use after free
block/rnbd-srv: Fix use after free in rnbd_srv_sess_dev_force_close
block/rnbd: Select SG_POOL for RNBD_CLIENT
block: pre-initialize struct block_device in bdev_alloc_inode
fs: Fix freeze_bdev()/thaw_bdev() accounting of bd_fsfreeze_sb
nvme: remove the unused status argument from nvme_trace_bio_complete
nvmet-rdma: Fix list_del corruption on queue establishment failure
nvme: unexport functions with no external caller
nvme: avoid possible double fetch in handling CQE
nvme-tcp: Fix possible race of io_work and direct send
nvme-pci: mark Samsung PM1725a as IGNORE_DEV_SUBNQN
nvme-fcloop: Fix sscanf type and list_first_entry_or_null warnings
...
If BCH_FEATURE_INCOMPAT_OBSO_LARGE_BUCKET is set in incompat feature
set, it means the cache device is created with obsoleted layout with
obso_bucket_site_hi. Now bcache does not support this feature bit, a new
BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE incompat feature bit is added
for a better layout to support large bucket size.
For the legacy compatibility purpose, if a cache device created with
obsoleted BCH_FEATURE_INCOMPAT_OBSO_LARGE_BUCKET feature bit, all bcache
devices attached to this cache set should be set to read-only. Then the
dirty data can be written back to backing device before re-create the
cache device with BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE feature bit
by the latest bcache-tools.
This patch checks BCH_FEATURE_INCOMPAT_OBSO_LARGE_BUCKET feature bit
when running a cache set and attach a bcache device to the cache set. If
this bit is set,
- When run a cache set, print an error kernel message to indicate all
following attached bcache device will be read-only.
- When attach a bcache device, print an error kernel message to indicate
the attached bcache device will be read-only, and ask users to update
to latest bcache-tools.
Such change is only for cache device whose bucket size >= 32MB, this is
for the zoned SSD and almost nobody uses such large bucket size at this
moment. If you don't explicit set a large bucket size for a zoned SSD,
such change is totally transparent to your bcache device.
Fixes: ffa4703275 ("bcache: add bucket_size_hi into struct cache_sb_disk for large bucket")
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When large bucket feature was added, BCH_FEATURE_INCOMPAT_LARGE_BUCKET
was introduced into the incompat feature set. It used bucket_size_hi
(which was added at the tail of struct cache_sb_disk) to extend current
16bit bucket size to 32bit with existing bucket_size in struct
cache_sb_disk.
This is not a good idea, there are two obvious problems,
- Bucket size is always value power of 2, if store log2(bucket size) in
existing bucket_size of struct cache_sb_disk, it is unnecessary to add
bucket_size_hi.
- Macro csum_set() assumes d[SB_JOURNAL_BUCKETS] is the last member in
struct cache_sb_disk, bucket_size_hi was added after d[] which makes
csum_set calculate an unexpected super block checksum.
To fix the above problems, this patch introduces a new incompat feature
bit BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE, when this bit is set, it
means bucket_size in struct cache_sb_disk stores the order of power-of-2
bucket size value. When user specifies a bucket size larger than 32768
sectors, BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE will be set to
incompat feature set, and bucket_size stores log2(bucket size) more
than store the real bucket size value.
The obsoleted BCH_FEATURE_INCOMPAT_LARGE_BUCKET won't be used anymore,
it is renamed to BCH_FEATURE_INCOMPAT_OBSO_LARGE_BUCKET and still only
recognized by kernel driver for legacy compatible purpose. The previous
bucket_size_hi is renmaed to obso_bucket_size_hi in struct cache_sb_disk
and not used in bcache-tools anymore.
For cache device created with BCH_FEATURE_INCOMPAT_LARGE_BUCKET feature,
bcache-tools and kernel driver still recognize the feature string and
display it as "obso_large_bucket".
With this change, the unnecessary extra space extend of bcache on-disk
super block can be avoided, and csum_set() may generate expected check
sum as well.
Fixes: ffa4703275 ("bcache: add bucket_size_hi into struct cache_sb_disk for large bucket")
Signed-off-by: Coly Li <colyli@suse.de>
Cc: stable@vger.kernel.org # 5.9+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This patch adds the check for features which is incompatible for
current supported feature sets.
Now if the bcache device created by bcache-tools has features that
current kernel doesn't support, read_super() will fail with error
messoage. E.g. if an unsupported incompatible feature detected,
bcache register will fail with dmesg "bcache: register_bcache() error :
Unsupported incompatible feature found".
Fixes: d721a43ff6 ("bcache: increase super block version for cache device and backing device")
Fixes: ffa4703275 ("bcache: add bucket_size_hi into struct cache_sb_disk for large bucket")
Signed-off-by: Coly Li <colyli@suse.de>
Cc: stable@vger.kernel.org # 5.9+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This patch fixes the following typos,
from BCH_FEATURE_COMPAT_SUUP to BCH_FEATURE_COMPAT_SUPP
from BCH_FEATURE_INCOMPAT_SUUP to BCH_FEATURE_INCOMPAT_SUPP
from BCH_FEATURE_INCOMPAT_SUUP to BCH_FEATURE_RO_COMPAT_SUPP
Fixes: d721a43ff6 ("bcache: increase super block version for cache device and backing device")
Fixes: ffa4703275 ("bcache: add bucket_size_hi into struct cache_sb_disk for large bucket")
Signed-off-by: Coly Li <colyli@suse.de>
Cc: stable@vger.kernel.org # 5.9+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There is no need to reassign pdev_set_uuid in the second loop iteration,
so move it to the place before second loop.
Signed-off-by: Yi Li <yili@winhong.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
With external metadata device, flush requests are not passed down to the
data device.
Fix this by submitting the flush request in dm_integrity_flush_buffers. In
order to not degrade performance, we overlap the data device flush with
the metadata device flush.
Reported-by: Lukas Straub <lukasstraub2@web.de>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
There wasn't ever a real need to log an error in the kernel log for
ioctls issued with insufficient permissions. Simply return an error
and if an admin/user is sufficiently motivated they can enable DM's
dynamic debugging to see an explanation for why the ioctls were
disallowed.
Reported-by: Nir Soffer <nsoffer@redhat.com>
Fixes: e980f62353 ("dm: don't allow ioctls to targets that don't map to whole devices")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
If the origin device has a volatile write-back cache and the following
events occur:
1: After finishing merge operation of one set of exceptions,
merge_callback() is invoked.
2: Update the metadata in COW device tracking the merge completion.
This update to COW device is flushed cleanly.
3: System crashes and the origin device's cache where the recent
merge was completed has not been flushed.
During the next cycle when we read the metadata from the COW device,
we will skip reading those metadata whose merge was completed in
step (1). This will lead to data loss/corruption.
To address this, flush the origin device post merge IO before
updating the metadata.
Cc: stable@vger.kernel.org
Signed-off-by: Akilesh Kailash <akailash@google.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Without crc32 support, this driver fails to link:
arm-linux-gnueabi-ld: drivers/md/dm-zoned-metadata.o: in function `dmz_write_sb':
dm-zoned-metadata.c:(.text+0xe98): undefined reference to `crc32_le'
arm-linux-gnueabi-ld: drivers/md/dm-zoned-metadata.o: in function `dmz_check_sb':
dm-zoned-metadata.c:(.text+0x7978): undefined reference to `crc32_le'
Fixes: 3b1a94c88b ("dm zoned: drive-managed zoned block device target")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
The integrity target relies on skcipher for encryption/decryption, but
certain kernel configurations may not enable CRYPTO_SKCIPHER, leading to
compilation errors due to unresolved symbols. Explicitly select
CRYPTO_SKCIPHER for DM_INTEGRITY, since it is unconditionally dependent
on it.
Signed-off-by: Anthony Iliopoulos <ailiop@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Block core warned that discard_granularity was 0 for dm-raid with
personality of raid1. Reason is that raid_io_hints() was incorrectly
special-casing raid1 rather than raid0.
Fix raid_io_hints() by removing discard limits settings for
raid1. Check for raid0 instead.
Fixes: 61697a6abd ("dm: eliminate 'split_discard_bios' flag from DM target interface")
Cc: stable@vger.kernel.org
Reported-by: Zdenek Kabelac <zkabelac@redhat.com>
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: Stephan Bärwolf <stephan@matrixstorm.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
functionality that requires reloading a dm-crypt DM table).
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEJfWUX4UqZ4x1O2wixSPxCi2dA1oFAl/qS/4THHNuaXR6ZXJA
cmVkaGF0LmNvbQAKCRDFI/EKLZ0DWiD0CACEywiVceLgV4dTnlH5h5YJjxybjo9g
LyVe+4X9bN2LauJNpNmlWgRQZTovC6wIk9kyY7p69ZZqXe9Y4sXxoynRoUu/2DiG
5+MnOleTIOafUHJTJOGhDs+vDPgNnYh3xmoVNZqAQpcexPJg/E0wuhcgmO9lXFew
ldcNOXV51AdANfjLKFlyQckqBG028ktdV2xt7u/B1FAKcmUbb0rfG7LDHlGf1ggj
KAVlZzMry7wMhEGPS3IQtmA0mO5DSMn1Kp4iM0Wd6cVrJ+jZafv0uFHFUHiH7DYy
yxj7AurM/0wxS06cHvCN82OoOl9AzDmNAKy94I/PiWAqyP3I6iiep5rE
=K4dv
-----END PGP SIGNATURE-----
Merge tag 'for-5.11/dm-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fix from Mike Snitzer:
"Revert WQ_SYSFS change that broke reencryption (and all other
functionality that requires reloading a dm-crypt DM table)"
* tag 'for-5.11/dm-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
Revert "dm crypt: export sysfs of kcryptd workqueue"
This reverts commit a2b8b2d975.
WQ_SYSFS breaks the ability to reload a DM table due to sysfs kobject
collision (due to active and inactive table). Given lack of
demonstrated need for exposing this workqueue via sysfs: revert
exposing it.
Reported-by: Ignat Korchagin <ignat@cloudflare.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl/kHXgQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpkhFEADYuiBZbYEonaV4/nqOhMZ6lXj99rEqVZui
AOMm7W8nopb97pWy0sJxZHPPMnjglubkTZbX/2TH08ndppGBQAa9HsgDIITO5Ap3
vjnew6oPsrMtxMUMqR8w8nMb4z5tfqpvUYRPd+qpvYSYLEeSh0UNAEdQ/MOGbA1t
nl8UPdNW/s1MbeyLJhU+NgGM0aZahED8KuJeVLOY2im6dySO+CoeoB0/mWrdc0PZ
SDcGBEjhmFspDVAkW4Wo8bMw7Cr72es0esvJSJyx0mlo0jSCR7ZYParDtPwuAl9H
thKo+brLibz9M+wRtW+7w37oUADTTRL+KV2xJ/J+DeAVjI7/hxgZKewh6hEORmrF
DwjbS8cnEVp70rfF7+Z9FV/+2GJXm1uI/swBi69Y42CQ33FNLZmikBW6Q3pimrDj
ZeQSCLRpGlo01xtJpsm8L71KGSfYd+njWaIFnESPhxJ+sTxd8kvFRsdjHlc4KNBG
UnMzvn6pE3WNhzJJoIqdM2uXa4XDyKkMfO7VbpUgbyij103jpbs7ruWXq3DDU5/t
Gdwa821zlq5azDYAw7PNb7FhKrsHhiKhOuxyKqJUbOUkOHBNwxJmxXcaMnR3fOSi
B+AlqYhC6A9DhkL0HG0QLcdFwYawpOsDxbFUbemt/UXn74lAjRnp8+rNMSn5tjbn
OCnk4Tpaww==
=Q42d
-----END PGP SIGNATURE-----
Merge tag 'block-5.11-2020-12-23' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"A few stragglers in here, but mostly just straight fixes. In
particular:
- Set of rnbd fixes for issues around changes for the merge window
(Gioh, Jack, Md Haris Iqbal)
- iocost tracepoint addition (Baolin)
- Copyright/maintainers update (Christoph)
- Remove old blk-mq fast path CPU warning (Daniel)
- loop max_part fix (Josh)
- Remote IPI threaded IRQ fix (Sebastian)
- dasd stable fixes (Stefan)
- bcache merge window fixup and style fixup (Yi, Zheng)"
* tag 'block-5.11-2020-12-23' of git://git.kernel.dk/linux-block:
md/bcache: convert comma to semicolon
bcache:remove a superfluous check in register_bcache
block: update some copyrights
block: remove a pointless self-reference in block_dev.c
MAINTAINERS: add fs/block_dev.c to the block section
blk-mq: Don't complete on a remote CPU in force threaded mode
s390/dasd: fix list corruption of lcu list
s390/dasd: fix list corruption of pavgroup group list
s390/dasd: prevent inconsistent LCU device data
s390/dasd: fix hanging device offline processing
blk-iocost: Add iocg idle state tracepoint
nbd: Respect max_part for all partition scans
block/rnbd-clt: Does not request pdu to rtrs-clt
block/rnbd-clt: Dynamically allocate sglist for rnbd_iu
block/rnbd: Set write-back cache and fua same to the target device
block/rnbd: Fix typos
block/rnbd-srv: Protect dev session sysfs removal
block/rnbd-clt: Fix possible memleak
block/rnbd-clt: Get rid of warning regarding size argument in strlcpy
blk-mq: Remove 'running from the wrong CPU' warning
Replace a comma between expression statements by a semicolon.
Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: Coly Li <colyli@sue.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There have no reassign the bdev after check It is IS_ERR.
the double check !IS_ERR(bdev) is superfluous.
After commit 4e7b5671c6 ("block: remove i_bdev"),
"Switch the block device lookup interfaces to directly work with a dev_t
so that struct block_device references are only acquired by the
blkdev_get variants (and the blk-cgroup special case). This means that
we now don't need an extra reference in the inode and can generally
simplify handling of struct block_device to keep the lookups contained
in the core block layer code."
so after lookup_bdev call, there no need to do bdput.
remove a superfluous check the bdev & don't call bdput after lookup_bdev.
Fixes: 4e7b5671c6a8("block: remove i_bdev")
Signed-off-by: Yi Li <yili@winhong.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
- Fix DM verity to skip verity work if IO completes with error while
system is shutting down.
- Add new DM multipath "IO affinity" path selector that maps IO
destined to a given path to a specific CPU based on user provided
mapping.
- Rename DM multipath path selector source files to have "dm-ps"
prefix.
- Add REQ_NOWAIT support to some other simple DM targets that don't
block in more elaborate ways waiting for IO.
- Export DM crypt's kcryptd workqueue via sysfs (WQ_SYSFS).
- Fix error return code in DM's target_message() if empty message is
received.
- A handful of other small cleanups.
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEJfWUX4UqZ4x1O2wixSPxCi2dA1oFAl/iDJYTHHNuaXR6ZXJA
cmVkaGF0LmNvbQAKCRDFI/EKLZ0DWsI2CACUk9PCCtOOHH1s//VeLjF86VUIsuxf
hNMfTgWT+arlHUIzl2Bp4c4Dq/T+hXTYlf+f5zGRAbBAd82eCqji5LTVvy/qaYt3
4gWSZnv/3VYHCx4MvV9UjtXQcSnS/ttDwyV0ZD6/NtYllQQobaCbyrE4p2tA1GIk
ql8ESRgXKK5Sio197Tm45UEkrhG0oUrEZ3riBaXeM9yOldLp1mLLVPGJeLcJDvpA
N7TDcM0owq/CMbmu5BkNv0xw7q/Vc9VQLGva8a15StxGjk1HY5/6KQWssCEkTkO7
HnIprATtWz2r0qgTcI+fOXndUmlqgQr1jhvYfJeuv45KbjoI5qubZvYr
=vHNj
-----END PGP SIGNATURE-----
Merge tag 'for-5.11/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper updates from Mike Snitzer:
- Add DM verity support for signature verification with 2nd keyring
- Fix DM verity to skip verity work if IO completes with error while
system is shutting down
- Add new DM multipath "IO affinity" path selector that maps IO
destined to a given path to a specific CPU based on user provided
mapping
- Rename DM multipath path selector source files to have "dm-ps" prefix
- Add REQ_NOWAIT support to some other simple DM targets that don't
block in more elaborate ways waiting for IO
- Export DM crypt's kcryptd workqueue via sysfs (WQ_SYSFS)
- Fix error return code in DM's target_message() if empty message is
received
- A handful of other small cleanups
* tag 'for-5.11/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm cache: simplify the return expression of load_mapping()
dm ebs: avoid double unlikely() notation when using IS_ERR()
dm verity: skip verity work if I/O error when system is shutting down
dm crypt: export sysfs of kcryptd workqueue
dm ioctl: fix error return code in target_message
dm crypt: Constify static crypt_iv_operations
dm: add support for REQ_NOWAIT to various targets
dm: rename multipath path selector source files to have "dm-ps" prefix
dm mpath: add IO affinity path selector
dm verity: Add support for signature verification with 2nd keyring
dm: remove unnecessary current->bio_list check when submitting split bio
The definition of IS_ERR() already applies the unlikely() notation
when checking the error status of the passed pointer. For this
reason there is no need to have the same notation outside of
IS_ERR() itself.
Clean up code by removing redundant notation.
Signed-off-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
If emergency system shutdown is called, like by thermal shutdown,
a dm device could be alive when the block device couldn't process
I/O requests anymore. In this state, the handling of I/O errors
by new dm I/O requests or by those already in-flight can lead to
a verity corruption state, which is a misjudgment.
So, skip verity work in response to I/O error when system is shutting
down.
Signed-off-by: Hyeongseok Kim <hyeongseok@gmail.com>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl/XgdYQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpjTBD/4me2TNvGOogbcL0b1leAotndJ7spI/IcFM
NUMNy3pOGuRBcRjwle85xq44puAjlNkZE2LLatem5sT7ZvS+8lPNnOIoTYgfaCjt
PhKx2sKlLumVm3BwymYAPcPtke4fikGG15Mwu5nX1oOehmyGrjObGAr3Lo6gexCT
tQoCOczVqaTsV+iTXrLlmgEgs07J9Tm93uh2cNR8Jgroxb8ivuWeUq4YgbV4kWk+
Y8XvOyVE/yba0vQf5/hHtWuVoC6RdELnqZ6NCkcP/EicdBecwk1GMJAej1S3zPS1
0BT7GSFTpm3YUHcygD6LRmRg4I/BmWDTDtMi84+jLat6VvSG1HwIm//qHiCJh3ku
SlvFZENIWAv5LP92x2vlR5Lt7uE3GK2V/5Pxt2fekyzCth6mzu+hLH4CBPQ3xgyd
E1JqIQ/ilbXstp+EYoivV5x8yltZQnKEZRopws0EOqj1LsmDPj9XT1wzE9RnB0o+
PWu/DNhQFhhcmP7Z8uLgPiKIVpyGs+vjxiJLlTtGDFTCy6M5JbcgzGkEkSmnybxH
7lSanjpLt1dWj85FBMc6fNtJkv2rBPfb4+j0d1kZ45Dzcr4umirGIh7wtCHcgc83
brmXSt29hlKHseSHMMuNWK8haXcgAE7gq9tD8GZ/kzM7+vkmLLxHJa22Qhq5rp4w
URPeaBaQJw==
=ayp2
-----END PGP SIGNATURE-----
Merge tag 'for-5.11/drivers-2020-12-14' of git://git.kernel.dk/linux-block
Pull block driver updates from Jens Axboe:
"Nothing major in here:
- NVMe pull request from Christoph:
- nvmet passthrough improvements (Chaitanya Kulkarni)
- fcloop error injection support (James Smart)
- read-only support for zoned namespaces without Zone Append
(Javier González)
- improve some error message (Minwoo Im)
- reject I/O to offline fabrics namespaces (Victor Gladkov)
- PCI queue allocation cleanups (Niklas Schnelle)
- remove an unused allocation in nvmet (Amit Engel)
- a Kconfig spelling fix (Colin Ian King)
- nvme_req_qid simplication (Baolin Wang)
- MD pull request from Song:
- Fix race condition in md_ioctl() (Dae R. Jeong)
- Initialize read_slot properly for raid10 (Kevin Vigor)
- Code cleanup (Pankaj Gupta)
- md-cluster resync/reshape fix (Zhao Heming)
- Move null_blk into its own directory (Damien Le Moal)
- null_blk zone and discard improvements (Damien Le Moal)
- bcache race fix (Dongsheng Yang)
- Set of rnbd fixes/improvements (Gioh Kim, Guoqing Jiang, Jack Wang,
Lutz Pogrell, Md Haris Iqbal)
- lightnvm NULL pointer deref fix (tangzhenhao)
- sr in_interrupt() removal (Sebastian Andrzej Siewior)
- FC endpoint security support for s390/dasd (Jan Höppner, Sebastian
Ott, Vineeth Vijayan). From the s390 arch guys, arch bits included
as it made it easier for them to funnel the feature through the
block driver tree.
- Follow up fixes (Colin Ian King)"
* tag 'for-5.11/drivers-2020-12-14' of git://git.kernel.dk/linux-block: (64 commits)
block: drop dead assignments in loop_init()
sr: Remove in_interrupt() usage in sr_init_command().
sr: Switch the sector size back to 2048 if sr_read_sector() changed it.
cdrom: Reset sector_size back it is not 2048.
drivers/lightnvm: fix a null-ptr-deref bug in pblk-core.c
null_blk: Move driver into its own directory
null_blk: Allow controlling max_hw_sectors limit
null_blk: discard zones on reset
null_blk: cleanup discard handling
null_blk: Improve implicit zone close
null_blk: improve zone locking
block: Align max_hw_sectors to logical blocksize
null_blk: Fail zone append to conventional zones
null_blk: Fix zone size initialization
bcache: fix race between setting bdev state to none and new write request direct to backing
block/rnbd: fix a null pointer dereference on dev->blk_symlink_name
block/rnbd-clt: Dynamically alloc buffer for pathname & blk_symlink_name
block/rnbd: call kobject_put in the failure path
Documentation/ABI/rnbd-srv: add document for force_close
block/rnbd-srv: close a mapped device from server side.
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl/Xec8QHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpoLbEACzXypgZWwMdfgRckA/Vt333rXHtbhUV+hK
2XP+P81iRvr9Esi31UPbRp82vrgcDO0cpI1QmQojS5U5TIQP88BfXptfRZZu48eb
wT5RDDNQ34HItqAh/yEuYsv9yUKcxeIrB99tBVvM+4UmQg9zTdIW3mg6PvCBdbhV
N38jI0tCF/PJatjfRuphT/nXonQLPWBlVDmZk06KZQFOwQe9ep1vUi1+nbiRPuo3
geFBpTh1Kp6Vl1B3n4RpECs6Y7I0RRuJdaH2sDizICla1/BW91F9fQwHimNnUxUq
e1Q1kMuh6ftcQGkYlHSYcPhuv6CvorldTZCO5arPxWpcwvxriTSMRPWAgUr5pEiF
fhiGhqeDu9e6vl9vS31wUD1B30hy+jFz9wyjRrDwJ3cPHH1JVBjTzvdX+cIh/1ku
IbIwUMteUtvUrzqAv/DzbGhedp7xWtOFaVo8j0QFYh9zkjd6b8yDOF/yztwX2gjY
Xt1cd+KpDSiN449ZRaoMI0sCJAxqzhMa6nsWlb0L7KuNyWKAbvKQBm9Rb47FLV9A
Vx70KC+zkFoyw23capvIahmQazerriUJ5PGe0lVm6ROgmIFdCpXTPDjnrvq/6RZ/
GEpD7gTW9atGJ7EuEE8686sAfKD5kneChWLX5EHXf0d0AG5Mr2lKsluiGp5LpPJg
Q1Xqs6xwww==
=zo4w
-----END PGP SIGNATURE-----
Merge tag 'for-5.11/block-2020-12-14' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:
"Another series of killing more code than what is being added, again
thanks to Christoph's relentless cleanups and tech debt tackling.
This contains:
- blk-iocost improvements (Baolin Wang)
- part0 iostat fix (Jeffle Xu)
- Disable iopoll for split bios (Jeffle Xu)
- block tracepoint cleanups (Christoph Hellwig)
- Merging of struct block_device and hd_struct (Christoph Hellwig)
- Rework/cleanup of how block device sizes are updated (Christoph
Hellwig)
- Simplification of gendisk lookup and removal of block device
aliasing (Christoph Hellwig)
- Block device ioctl cleanups (Christoph Hellwig)
- Removal of bdget()/blkdev_get() as exported API (Christoph Hellwig)
- Disk change rework, avoid ->revalidate_disk() (Christoph Hellwig)
- sbitmap improvements (Pavel Begunkov)
- Hybrid polling fix (Pavel Begunkov)
- bvec iteration improvements (Pavel Begunkov)
- Zone revalidation fixes (Damien Le Moal)
- blk-throttle limit fix (Yu Kuai)
- Various little fixes"
* tag 'for-5.11/block-2020-12-14' of git://git.kernel.dk/linux-block: (126 commits)
blk-mq: fix msec comment from micro to milli seconds
blk-mq: update arg in comment of blk_mq_map_queue
blk-mq: add helper allocating tagset->tags
Revert "block: Fix a lockdep complaint triggered by request queue flushing"
nvme-loop: use blk_mq_hctx_set_fq_lock_class to set loop's lock class
blk-mq: add new API of blk_mq_hctx_set_fq_lock_class
block: disable iopoll for split bio
block: Improve blk_revalidate_disk_zones() checks
sbitmap: simplify wrap check
sbitmap: replace CAS with atomic and
sbitmap: remove swap_lock
sbitmap: optimise sbitmap_deferred_clear()
blk-mq: skip hybrid polling if iopoll doesn't spin
blk-iocost: Factor out the base vrate change into a separate function
blk-iocost: Factor out the active iocgs' state check into a separate function
blk-iocost: Move the usage ratio calculation to the correct place
blk-iocost: Remove unnecessary advance declaration
blk-iocost: Fix some typos in comments
blktrace: fix up a kerneldoc comment
block: remove the request_queue to argument request based tracepoints
...
This reverts commit e0910c8e4f.
Reverting 6ffeb1c3f8 ("md: change mddev 'chunk_sectors' from int to
unsigned") exposes dm-raid.c compiler warnings detailed that commit's
header. Clearly this more conservative fix, of simply reverting
e0910c8e4f, would've been more prudent given how late we were in the
v5.10 release. Lessons have been learned.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
This reverts commit 6ffeb1c3f8.
This change caused unexpected v5.10 raid6 mount failures, see:
https://lkml.org/lkml/2020/12/14/7
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl/VMG8QHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpsBdEACjbsECHIyQZOtTHW+vY2wkOj2ph+OkXtUU
nmRlXn5itcrTEeHyenwDEc/W03WbU5m4C6BUz9ecRtXLda6K9W2d7YspSWND1JP/
Gz3YtingA4xwAl7wOtHT6gi6JHb/seDObiKWAik6f7A3xvBW3o9lV1QRy3kaNdG1
XgcYvBZBHDaP7Gdvvbg6FucSe3dNjSi+skVdvx8j+i27HF3LAEp3eoWG4vZBUePH
RIpdDyCXRao+Z22V+EeY0X5cH568fxS408Z6xXewV/g2TDd4XUaXz3WM0O9FN0jU
oTiQeXL+1y4EfU8QoboTWvHmpoKRqWMSZVD0GXbmx4ZwpiFHbyBub2SydU2Xm/Vl
oNwHc1wktVU2HFrWU9dwrlWz4Egt4KZO6D9SzJ3VDt9HLKIciyZD16rbsVwSr/Bc
BVQaFJjRYoR60xxNohIEC8C3rF3qRbkAy5HYSHFUyuOE6gm05RQIHLprY6gIjSei
7gJnP1FL3iu5bQywp7ptqaRIlkeau6kqvboc8oYFDQvVAp35JqnxjdOsAByWUZR9
lrN87bIh0n1Vqprs5arXWhw5Ixy7ZQsf0DsrFImZgU+VNTFmuB+1cKaa94EsQwo5
gQH2j39QQSlGmkXphQVWDd0eGCAt3OrmjYe721ZSSLvJjM64+7pZFZpYPc//VpR5
8Kcq/yDeOA==
=PmVb
-----END PGP SIGNATURE-----
Merge tag 'block-5.10-2020-12-12' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"This should be it for 5.10.
Mike and Song looked into the warning case, and thankfully it appears
the fix was pretty trivial - we can just change the md device chunk
type to unsigned int to get rid of it. They cannot currently be < 0,
and nobody is checking for that either.
We're reverting the discard changes as the corruption reports came in
very late, and there's just no time to attempt to deal with it at this
point. Reverting the changes in question is the right call for 5.10"
* tag 'block-5.10-2020-12-12' of git://git.kernel.dk/linux-block:
md: change mddev 'chunk_sectors' from int to unsigned
Revert "md: add md_submit_discard_bio() for submitting discard bio"
Revert "md/raid10: extend r10bio devs to raid disks"
Revert "md/raid10: pull codes that wait for blocked dev into one function"
Revert "md/raid10: improve raid10 discard request"
Revert "md/raid10: improve discard request for far layout"
Revert "dm raid: remove unnecessary discard limits for raid10"
Commit e2782f560c ("Revert "dm raid: remove unnecessary discard
limits for raid10"") exposed compiler warnings introduced by commit
e0910c8e4f ("dm raid: fix discard limits for raid1 and raid10"):
In file included from ./include/linux/kernel.h:14,
from ./include/asm-generic/bug.h:20,
from ./arch/x86/include/asm/bug.h:93,
from ./include/linux/bug.h:5,
from ./include/linux/mmdebug.h:5,
from ./include/linux/gfp.h:5,
from ./include/linux/slab.h:15,
from drivers/md/dm-raid.c:8:
drivers/md/dm-raid.c: In function ‘raid_io_hints’:
./include/linux/minmax.h:18:28: warning: comparison of distinct pointer types lacks a cast
(!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
^~
./include/linux/minmax.h:32:4: note: in expansion of macro ‘__typecheck’
(__typecheck(x, y) && __no_side_effects(x, y))
^~~~~~~~~~~
./include/linux/minmax.h:42:24: note: in expansion of macro ‘__safe_cmp’
__builtin_choose_expr(__safe_cmp(x, y), \
^~~~~~~~~~
./include/linux/minmax.h:51:19: note: in expansion of macro ‘__careful_cmp’
#define min(x, y) __careful_cmp(x, y, <)
^~~~~~~~~~~~~
./include/linux/minmax.h:84:39: note: in expansion of macro ‘min’
__x == 0 ? __y : ((__y == 0) ? __x : min(__x, __y)); })
^~~
drivers/md/dm-raid.c:3739:33: note: in expansion of macro ‘min_not_zero’
limits->max_discard_sectors = min_not_zero(rs->md.chunk_sectors,
^~~~~~~~~~~~
Fix this by changing the chunk_sectors member of 'struct mddev' from
int to 'unsigned int' to match the type used for the 'chunk_sectors'
member of 'struct queue_limits'. Various MD code still uses 'int' but
none of it appears to ever make use of signed int; and storing
positive signed int in unsigned is perfectly safe.
Reported-by: Song Liu <songliubraving@fb.com>
Fixes: e2782f560c ("Revert "dm raid: remove unnecessary discard limits for raid10"")
Fixes: e0910c8e4f ("dm raid: fix discard limits for raid1 and raid10")
Cc: stable@vger,kernel.org # e0910c8e4f was marked for stable@
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Song Liu <song@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This reverts commit 2628089b74.
Matthew Ruffell reported data corruption in raid10 due to the changes
in discard handling [1]. Revert these changes before we find a proper fix.
[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/
Cc: Matthew Ruffell <matthew.ruffell@canonical.com>
Cc: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
This reverts commit 8650a88901.
Matthew Ruffell reported data corruption in raid10 due to the changes
in discard handling [1]. Revert these changes before we find a proper fix.
[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/
Cc: Matthew Ruffell <matthew.ruffell@canonical.com>
Cc: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
This reverts commit f046f5d0d7.
Matthew Ruffell reported data corruption in raid10 due to the changes
in discard handling [1]. Revert these changes before we find a proper fix.
[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/
Cc: Matthew Ruffell <matthew.ruffell@canonical.com>
Cc: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
This reverts commit bcc90d2804.
Matthew Ruffell reported data corruption in raid10 due to the changes
in discard handling [1]. Revert these changes before we find a proper fix.
[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/
Cc: Matthew Ruffell <matthew.ruffell@canonical.com>
Cc: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
This reverts commit d3ee2d8415.
Matthew Ruffell reported data corruption in raid10 due to the changes
in discard handling [1]. Revert these changes before we find a proper fix.
[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/
Cc: Matthew Ruffell <matthew.ruffell@canonical.com>
Cc: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
This reverts commit f0e90b6c66.
Matthew Ruffell reported data corruption in raid10 due to the changes
in discard handling [1]. Revert these changes before we find a proper fix.
[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907262/
Cc: Matthew Ruffell <matthew.ruffell@canonical.com>
Cc: Xiao Ni <xni@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
There is a race condition in detaching as below:
A. detaching B. Write request
(1) writing back
(2) write back done, set bdev
state to clean.
(3) cached_dev_put() and
schedule_work(&dc->detach);
(4) write data [0 - 4K] directly
into backing and ack to user.
(5) power-failure...
When we restart this bcache device, this bdev is clean but not detached,
and read [0 - 4K], we will get unexpected old data from cache device.
To fix this problem, set the bdev state to none when we writeback done
in detaching, and then if power-failure happened as above, the data in
cache will not be used in next bcache device starting, it's detached, we
will read the correct data from backing derectly.
Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
It should be helpful to export sysfs of "kcryptd" workqueue in some
cases, such as setting specific CPU affinity of the workqueue.
Besides, also tweak the name format a little. The slash inside a
directory name will be translate into exclamation mark, such as
/sys/devices/virtual/workqueue/'kcryptd!253:0'.
Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.
Fixes: 2ca4c92f58 ("dm ioctl: prevent empty message")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>