OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Linus Torvalds	ac524ece21	f2fs-5.13-rc1-fix This series of patches fix some critical bugs such as memory leak in compression flows, kernel panic when handling errors, and swapon failure due to newly added condition check. -----BEGIN PGP SIGNATURE----- iQIyBAABCgAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAmCeTNcACgkQQBSofoJI UNKc0Q/3X9Tngns5DpnzQw9kbWochucG/7Vf1HjdvV2gE7IxruDPofGtBdhPdSYV uR+nP9dxhXLQQNfWS2KICzdj0yceKCKq8xpnnNdq9SGjVJmUjCD39ByGJV3GMOGM fY6dizcywltH7iBQboMZ0Eh3ivPh6ugl5klDo20WzYsH6F/UF7CPSuSJ3K5ezGuY T6R3NkqG8v1cS6+5u+teDpmdCCHOCBEeizBFQ6XskNDBavbw7KEA0liwKOv6eghB PdoeqYemg1VfOHAKqP6F3o+eSlsT5Ljs0Zmc5x8h8qRS76JK/hb9REFngrcERaIw GPDnMPCHniCHMd61z90oqGtMf4PFqLUVnBTdxhxLK5G4+u874dlZLciKWGDIGTLv eNU2W+8c9s+KdJAZFJbYN5zVoyJUR5SW7RcYTuvZWt8wX38Ch+FZEGqOC8FxWyWU i1WHXZiGeifIlIeqUOPJP5sbslL2hfK5OqMYJotAeIW/E2RyJWnc+Yo2UwvmvXVU xPOKFOn9nAAQz2GGgSpvGaWAVfMNqhoLw7/gzwdkabP0EASIzHuW2PCJhp59c1NO Fb9eUi7yhgd94vDbYftXRBgAhrUmCd0u+/gySyou4vujWtfWcO+AvOEreV7VRFfL Su0bGkBJ03ThFEFAZnY14RenydGSwpk5Fd9wkRy4Qk7mBv9sRg== =NeKH -----END PGP SIGNATURE----- Merge tag 'f2fs-5.13-rc1-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs fixes from Jaegeuk Kim: "This fixes some critical bugs such as memory leak in compression flows, kernel panic when handling errors, and swapon failure due to newly added condition check" * tag 'f2fs-5.13-rc1-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: f2fs: return EINVAL for hole cases in swap file f2fs: avoid swapon failure by giving a warning first f2fs: compress: fix to assign cc.cluster_idx correctly f2fs: compress: fix race condition of overwrite vs truncate f2fs: compress: fix to free compress page correctly f2fs: support iflag change given the mask f2fs: avoid null pointer access when handling IPU error	2021-05-14 10:49:20 -07:00
Linus Torvalds	b5304a4f9a	drm fixes for 5.13-rc1 two MAINTAINERS updates. amdgpu: - Fixes for flexible array conversions - Fix sysfs attribute init - Harvesting fixes - VCN CG/PG fixes for Picasso radeon: - Fixes for flexible array conversions - Fix for flickering on Oland with multiple 4K displays vc4: - drop an used function -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJgneCaAAoJEAx081l5xIa+F+wP+wSg9t3IKkev8dcyHkHXHHyO k7o4SCxXZ8KCNo9uOj2r6N8k45RQJYMYFZxxlMMex/cfGcHIBUkTVdL/6cz1Ks3i Ou/CLlij9k2irBi0LGV+BcDPs3ud4DBQpP0i9RWzecvJDt0tiXnmmzIdhyDaG4HV daYV4qVuF/GBmKpyAOtHT6OXgU4g8GjYjeTA+WrEJgMLJvY7qQiRgsoz4Fwog4hi e2ALZkEPqxrszBQ3Xsl/LPhlMZ/5oVs/+JSQEsL0UM88w7TTf4dCOefFlPw05fQ8 N4T0iOIBoMWQ0SnMOEiIZxJqtt4v/eQy7GgeI7DOCxx6Z4Bc8ALSDJAaqaK+xdNQ AXyJSMlg5K/r7sCkP1nQjWP6j9tWHf82kDNoHNceVspa3+eLJ3MDzWZ6PYXsJQ+q CKT+ov0AkZZfQgafHWA+vUEIcWGeVVI1yla0TJbqf0ZvmSYo5Qr+NsS5kYLHGleE 38KYuj09mcfg3zWY9tHcZXDNEeRwLqIAOX2hy37XtwAri8QOW5ZA19KoH9Y3ZRUE s7fvAI9raHmOo0O79wGKa89k6ggDbUkC7oT9lzqCQQ005VlmfKsGNtwwvGIMkabW 9NneHx0n6+uvb1NIkTU3qP9wsbIA2hzk2iDp1zeOY2U8cGhyjFgm7apc/JyCVzsG XKHR8awHEEvC81txicI6 =iwKz -----END PGP SIGNATURE----- Merge tag 'drm-fixes-2021-05-14' of git://anongit.freedesktop.org/drm/drm Pull drm fixes from Dave Airlie: "Not much here, mostly amdgpu fixes, with a couple of radeon, and a cosmetic vc4. Two MAINTAINERS file updates also. amdgpu: - Fixes for flexible array conversions - Fix sysfs attribute init - Harvesting fixes - VCN CG/PG fixes for Picasso radeon: - Fixes for flexible array conversions - Fix for flickering on Oland with multiple 4K displays vc4: - drop unused function" * tag 'drm-fixes-2021-05-14' of git://anongit.freedesktop.org/drm/drm: drm/amdgpu: update vcn1.0 Non-DPG suspend sequence drm/amdgpu: set vcn mgcg flag for picasso drm/radeon/dpm: Disable sclk switching on Oland when two 4K 60Hz monitors are connected drm/amdgpu: update the method for harvest IP for specific SKU drm/amdgpu: add judgement when add ip blocks (v2) drm/amd/display: Initialize attribute for hdcp_srm sysfs file drm/amd/pm: Fix out-of-bounds bug drm/radeon/si_dpm: Fix SMU power state load drm/radeon/ni_dpm: Fix booting bug MAINTAINERS: Update address for Emma Anholt MAINTAINERS: Update my e-mail drm/vc4: remove unused function drm/ttm: Do not add non-system domain BO into swap list	2021-05-14 10:38:16 -07:00
Catalin Marinas	588a513d34	arm64: Fix race condition on PG_dcache_clean in __sync_icache_dcache() To ensure that instructions are observable in a new mapping, the arm64 set_pte_at() implementation cleans the D-cache and invalidates the I-cache to the PoU. As an optimisation, this is only done on executable mappings and the PG_dcache_clean page flag is set to avoid future cache maintenance on the same page. When two different processes map the same page (e.g. private executable file or shared mapping) there's a potential race on checking and setting PG_dcache_clean via set_pte_at() -> __sync_icache_dcache(). While on the fault paths the page is locked (PG_locked), mprotect() does not take the page lock. The result is that one process may see the PG_dcache_clean flag set but the I/D cache maintenance not yet performed. Avoid test_and_set_bit(PG_dcache_clean) in favour of separate test_bit() and set_bit(). In the rare event of a race, the cache maintenance is done twice. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: <stable@vger.kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Steven Price <steven.price@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210514095001.13236-1-catalin.marinas@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-05-14 17:11:16 +01:00
Bart Van Assche	4bc2082311	block/partitions/efi.c: Fix the efi_partition() kernel-doc header Fix the following kernel-doc warning: block/partitions/efi.c:685: warning: wrong kernel-doc identifier on line: * efi_partition(struct parsed_partitions *state) Cc: Alexander Viro <viro@math.psu.edu> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20210513171708.8391-1-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-05-14 09:00:06 -06:00
Bart Van Assche	630ef623ed	blk-mq: Swap two calls in blk_mq_exit_queue() If a tag set is shared across request queues (e.g. SCSI LUNs) then the block layer core keeps track of the number of active request queues in tags->active_queues. blk_mq_tag_busy() and blk_mq_tag_idle() update that atomic counter if the hctx flag BLK_MQ_F_TAG_QUEUE_SHARED is set. Make sure that blk_mq_exit_queue() calls blk_mq_tag_idle() before that flag is cleared by blk_mq_del_queue_tag_set(). Cc: Christoph Hellwig <hch@infradead.org> Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Fixes: `0d2602ca30` ("blk-mq: improve support for shared tags maps") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20210513171529.7977-1-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-05-14 08:59:31 -06:00
Ming Lei	03f26d8f11	blk-mq: plug request for shared sbitmap In case of shared sbitmap, request won't be held in plug list any more sine commit `32bc15afed` ("blk-mq: Facilitate a shared sbitmap per tagset"), this way makes request merge from flush plug list & batching submission not possible, so cause performance regression. Yanhui reports performance regression when running sequential IO test(libaio, 16 jobs, 8 depth for each job) in VM, and the VM disk is emulated with image stored on xfs/megaraid_sas. Fix the issue by recovering original behavior to allow to hold request in plug list. Cc: Yanhui Ma <yama@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: kashyap.desai@broadcom.com Fixes: `32bc15afed` ("blk-mq: Facilitate a shared sbitmap per tagset") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20210514022052.1047665-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-05-14 08:59:08 -06:00
Stefano Stabellini	97729b653d	xen/swiotlb: check if the swiotlb has already been initialized xen_swiotlb_init calls swiotlb_late_init_with_tbl, which fails with -ENOMEM if the swiotlb has already been initialized. Add an explicit check io_tlb_default_mem != NULL at the beginning of xen_swiotlb_init. If the swiotlb is already initialized print a warning and return -EEXIST. On x86, the error propagates. On ARM, we don't actually need a special swiotlb buffer (yet), any buffer would do. So ignore the error and continue. CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> Reviewed-by: Boris Ostrovsky <boris.ostrvsky@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210512201823.1963-3-sstabellini@kernel.org Signed-off-by: Juergen Gross <jgross@suse.com>	2021-05-14 15:52:11 +02:00
Christoph Hellwig	687842ec50	arm64: do not set SWIOTLB_NO_FORCE when swiotlb is required Although SWIOTLB_NO_FORCE is meant to allow later calls to swiotlb_init, today dma_direct_map_page returns error if SWIOTLB_NO_FORCE. For now, without a larger overhaul of SWIOTLB_NO_FORCE, the best we can do is to avoid setting SWIOTLB_NO_FORCE in mem_init when we know that it is going to be required later (e.g. Xen requires it). CC: boris.ostrovsky@oracle.com CC: jgross@suse.com CC: catalin.marinas@arm.com CC: will@kernel.org CC: linux-arm-kernel@lists.infradead.org Fixes: `2726bf3ff2` ("swiotlb: Make SWIOTLB_NO_FORCE perform no allocation") Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> Reviewed-by: Juergen Gross <jgross@suse.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210512201823.1963-2-sstabellini@kernel.org Signed-off-by: Juergen Gross <jgross@suse.com>	2021-05-14 15:52:08 +02:00
Stefano Stabellini	cb6f6b3384	xen/arm: move xen_swiotlb_detect to arm/swiotlb-xen.h Move xen_swiotlb_detect to a static inline function to make it available to !CONFIG_XEN builds. CC: boris.ostrovsky@oracle.com CC: jgross@suse.com Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20210512201823.1963-1-sstabellini@kernel.org Signed-off-by: Juergen Gross <jgross@suse.com>	2021-05-14 15:52:05 +02:00
Vitaly Kuznetsov	3486d2c9be	clocksource/drivers/hyper-v: Re-enable VDSO_CLOCKMODE_HVCLOCK on X86 Mohammed reports (https://bugzilla.kernel.org/show_bug.cgi?id=213029) the commit `e4ab4658f1` ("clocksource/drivers/hyper-v: Handle vDSO differences inline") broke vDSO on x86. The problem appears to be that VDSO_CLOCKMODE_HVCLOCK is an enum value in 'enum vdso_clock_mode' and '#ifdef VDSO_CLOCKMODE_HVCLOCK' branch evaluates to false (it is not a define). Use a dedicated HAVE_VDSO_CLOCKMODE_HVCLOCK define instead. Fixes: `e4ab4658f1` ("clocksource/drivers/hyper-v: Handle vDSO differences inline") Reported-by: Mohammed Gamal <mgamal@redhat.com> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20210513073246.1715070-1-vkuznets@redhat.com	2021-05-14 14:55:13 +02:00
Pavel Begunkov	489809e2e2	io_uring: increase max number of reg buffers Since recent changes instead of storing a large array of struct io_mapped_ubuf, we store pointers to them, that is 4 times slimmer and we should not to so worry about restricting max number of registererd buffer slots, increase the limit 4 times. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d3dee1da37f46da416aa96a16bf9e5094e10584d.1620990371.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-05-14 06:06:34 -06:00
Pavel Begunkov	2d74d0421e	io_uring: further remove sqpoll limits on opcodes There are three types of requests that left disabled for sqpoll, namely epoll ctx, statx, and resources update. Since SQPOLL task is now closely mimics a userspace thread, remove the restrictions. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/909b52d70c45636d8d7897582474ea5aab5eed34.1620990306.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-05-14 06:06:23 -06:00
Pavel Begunkov	447c19f3b5	io_uring: fix ltout double free on completion race Always remove linked timeout on io_link_timeout_fn() from the master request link list, otherwise we may get use-after-free when first io_link_timeout_fn() puts linked timeout in the fail path, and then will be found and put on master's free. Cc: stable@vger.kernel.org # 5.10+ Fixes: `90cd7e4249` ("io_uring: track link timeout's master explicitly") Reported-and-tested-by: syzbot+5a864149dd970b546223@syzkaller.appspotmail.com Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/69c46bf6ce37fec4fdcd98f0882e18eb07ce693a.1620990121.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-05-14 06:06:15 -06:00
Hsin-Yi Wang	2962484dfe	misc: eeprom: at24: check suspend status before disable regulator `cd5676db05` ("misc: eeprom: at24: support pm_runtime control") disables regulator in runtime suspend. If runtime suspend is called before regulator disable, it will results in regulator unbalanced disabling. Fixes: `cd5676db05` ("misc: eeprom: at24: support pm_runtime control") Cc: stable <stable@vger.kernel.org> Acked-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org> Link: https://lore.kernel.org/r/20210420133050.377209-1-hsinyi@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-05-14 13:50:59 +02:00
Christophe JAILLET	0b0226be3a	uio_hv_generic: Fix another memory leak in error handling paths Memory allocated by 'vmbus_alloc_ring()' at the beginning of the probe function is never freed in the error handling path. Add the missing 'vmbus_free_ring()' call. Note that it is already freed in the .remove function. Fixes: `cdfa835c6e` ("uio_hv_generic: defer opening vmbus until first use") Cc: stable <stable@vger.kernel.org> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/0d86027b8eeed8e6360bc3d52bcdb328ff9bdca1.1620544055.git.christophe.jaillet@wanadoo.fr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-05-14 13:26:04 +02:00
Christophe JAILLET	3ee098f96b	uio_hv_generic: Fix a memory leak in error handling paths If 'vmbus_establish_gpadl()' fails, the (recv\|send)_gpadl will not be updated and 'hv_uio_cleanup()' in the error handling path will not be able to free the corresponding buffer. In such a case, we need to free the buffer explicitly. Fixes: `cdfa835c6e` ("uio_hv_generic: defer opening vmbus until first use") Cc: stable <stable@vger.kernel.org> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/4fdaff557deef6f0475d02ba7922ddbaa1ab08a6.1620544055.git.christophe.jaillet@wanadoo.fr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-05-14 13:26:01 +02:00
Martin Ågren	156ed0215e	uio/uio_pci_generic: fix return value changed in refactoring Commit `ef84928cff` ("uio/uio_pci_generic: use device-managed function equivalents") was able to simplify various error paths thanks to no longer having to clean up on the way out. Some error paths were dropped, others were simplified. In one of those simplifications, the return value was accidentally changed from -ENODEV to -ENOMEM. Restore the old return value. Fixes: `ef84928cff` ("uio/uio_pci_generic: use device-managed function equivalents") Cc: stable <stable@vger.kernel.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Martin Ågren <martin.agren@gmail.com> Link: https://lore.kernel.org/r/20210422192240.1136373-1-martin.agren@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-05-14 13:25:27 +02:00
PeiSen Hou	1d5cfca286	ALSA: hda/realtek: Add some CLOVE SSIDs of ALC293 Fix "use as headset mic, without its own jack detect" problen. Signed-off-by: PeiSen Hou <pshou@realtek.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/d0746eaf29f248a5acc30313e3ba4f99@realtek.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2021-05-14 12:50:51 +02:00
Takashi Sakamoto	814b43127f	ALSA: firewire-lib: fix amdtp_packet tracepoints event for packet_index field The snd_firewire_lib:amdtp_packet tracepoints event includes index of packet processed in a context handling. However in IR context, it is not calculated as expected. Cc: <stable@vger.kernel.org> Fixes: `753e717986` ("ALSA: firewire-lib: use packet descriptor for IR context") Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Link: https://lore.kernel.org/r/20210513125652.110249-6-o-takashi@sakamocchi.jp Signed-off-by: Takashi Iwai <tiwai@suse.de>	2021-05-14 09:41:26 +02:00
Takashi Sakamoto	1be4f21d99	ALSA: firewire-lib: fix calculation for size of IR context payload The quadlets for CIP header is handled as a part of IR context header, thus it doesn't join in IR context payload. However current calculation includes the quadlets in IR context payload. Cc: <stable@vger.kernel.org> Fixes: `f11453c7cc` ("ALSA: firewire-lib: use 16 bytes IR context header to separate CIP header") Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Link: https://lore.kernel.org/r/20210513125652.110249-5-o-takashi@sakamocchi.jp Signed-off-by: Takashi Iwai <tiwai@suse.de>	2021-05-14 09:41:16 +02:00
Takashi Sakamoto	395f41e2cd	ALSA: firewire-lib: fix check for the size of isochronous packet payload The check for size of isochronous packet payload just cares of the size of IR context payload without the size of CIP header. Cc: <stable@vger.kernel.org> Fixes: `f11453c7cc` ("ALSA: firewire-lib: use 16 bytes IR context header to separate CIP header") Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Link: https://lore.kernel.org/r/20210513125652.110249-4-o-takashi@sakamocchi.jp Signed-off-by: Takashi Iwai <tiwai@suse.de>	2021-05-14 09:40:46 +02:00
Takashi Sakamoto	0edabdfe89	ALSA: bebob/oxfw: fix Kconfig entry for Mackie d.2 Pro Mackie d.2 has an extension card for IEEE 1394 communication, which uses BridgeCo DM1000 ASIC. On the other hand, Mackie d.4 Pro has built-in function for IEEE 1394 communication by Oxford Semiconductor OXFW971, according to schematic diagram available in Mackie website. Although I misunderstood that Mackie d.2 Pro would be also a model with OXFW971, it's wrong. Mackie d.2 Pro is a model which includes the extension card as factory settings. This commit fixes entries in Kconfig and comment in ALSA OXFW driver. Cc: <stable@vger.kernel.org> Fixes: `fd6f4b0dc1` ("ALSA: bebob: Add skelton for BeBoB based devices") Fixes: `ec4dba5053` ("ALSA: oxfw: Add support for Behringer/Mackie devices") Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Link: https://lore.kernel.org/r/20210513125652.110249-3-o-takashi@sakamocchi.jp Signed-off-by: Takashi Iwai <tiwai@suse.de>	2021-05-14 09:40:32 +02:00
Takashi Sakamoto	1b6604896e	ALSA: dice: fix stream format at middle sampling rate for Alesis iO 26 Alesis iO 26 FireWire has two pairs of digital optical interface. It delivers PCM frames from the interfaces by second isochronous packet streaming. Although both of the interfaces are available at 44.1/48.0 kHz, first one of them is only available at 88.2/96.0 kHz. It reduces the number of PCM samples to 4 in Multi Bit Linear Audio data channel of data blocks on the second isochronous packet streaming. This commit fixes hardcoded stream formats. Cc: <stable@vger.kernel.org> Fixes: `28b208f600` ("ALSA: dice: add parameters of stream formats for models produced by Alesis") Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Link: https://lore.kernel.org/r/20210513125652.110249-2-o-takashi@sakamocchi.jp Signed-off-by: Takashi Iwai <tiwai@suse.de>	2021-05-14 09:40:15 +02:00
Nicholas Piggin	c6ac667b07	powerpc/64e/interrupt: Fix nvgprs being clobbered Some interrupt handlers have an "extra" that saves 1 or 2 registers (r14, r15) in the paca save area and makes them available to use by the handler. The change to always save nvgprs in exception handlers lead to some interrupt handlers saving those scratch r14 / r15 registers into the interrupt frame's GPR saves, which get restored on interrupt exit. Fix this by always reloading those scratch registers from paca before the EXCEPTION_COMMON that saves nvgprs. Fixes: `4228b2c3d2` ("powerpc/64e/interrupt: always save nvgprs on interrupt") Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Christian Zigotzky <chzigotzky@xenosoft.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210514044008.1955783-1-npiggin@gmail.com	2021-05-14 17:28:54 +10:00
Nicholas Piggin	4ec5feec1a	powerpc/64s: Make NMI record implicitly soft-masked code as irqs disabled scv support introduced the notion of code that implicitly soft-masks irqs due to the instruction addresses. This is required because scv enters the kernel with MSR[EE]=1. If a NMI (including soft-NMI) interrupt hits when we are implicitly soft-masked then its regs->softe does not reflect this because it is derived from the explicit soft mask state (paca->irq_soft_mask). This makes arch_irq_disabled_regs(regs) return false. This can trigger a warning in the soft-NMI watchdog code (shown below). Fix it by having NMI interrupts set regs->softe to disabled in case of interrupting an implicit soft-masked region. ------------[ cut here ]------------ WARNING: CPU: 41 PID: 1103 at arch/powerpc/kernel/watchdog.c:259 soft_nmi_interrupt+0x3e4/0x5f0 CPU: 41 PID: 1103 Comm: (spawn) Not tainted NIP: c000000000039534 LR: c000000000039234 CTR: c000000000009a00 REGS: c000007fffbcf940 TRAP: 0700 Not tainted MSR: 9000000000021033 <SF,HV,ME,IR,DR,RI,LE> CR: 22042482 XER: 200400ad CFAR: c000000000039260 IRQMASK: 3 GPR00: c000000000039204 c000007fffbcfbe0 c000000001d6c300 0000000000000003 GPR04: 00007ffffa45d078 0000000000000000 0000000000000008 0000000000000020 GPR08: 0000007ffd4e0000 0000000000000000 c000007ffffceb00 7265677368657265 GPR12: 9000000000009033 c000007ffffceb00 00000f7075bf4480 000000000000002a GPR16: 00000f705745a528 00007ffffa45ddd8 00000f70574d0008 0000000000000000 GPR20: 00000f7075c58d70 00000f7057459c38 0000000000000001 0000000000000040 GPR24: 0000000000000000 0000000000000029 c000000001dae058 0000000000000029 GPR28: 0000000000000000 0000000000000800 0000000000000009 c000007fffbcfd60 NIP [c000000000039534] soft_nmi_interrupt+0x3e4/0x5f0 LR [c000000000039234] soft_nmi_interrupt+0xe4/0x5f0 Call Trace: [c000007fffbcfbe0] [c000000000039204] soft_nmi_interrupt+0xb4/0x5f0 (unreliable) [c000007fffbcfcf0] [c00000000000c0e8] soft_nmi_common+0x138/0x1c4 --- interrupt: 900 at end_real_trampolines+0x0/0x1000 NIP: c000000000003000 LR: 00007ca426adb03c CTR: 900000000280f033 REGS: c000007fffbcfd60 TRAP: 0900 MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 44042482 XER: 200400ad CFAR: 00007ca426946020 IRQMASK: 0 GPR00: 00000000000000ad 00007ffffa45d050 00007ca426b07f00 0000000000000035 GPR04: 00007ffffa45d078 0000000000000000 0000000000000008 0000000000000020 GPR08: 0000000000000000 0000000000100000 0000000010000000 00007ffffa45d110 GPR12: 0000000000000001 00007ca426d4e680 00000f7075bf4480 000000000000002a GPR16: 00000f705745a528 00007ffffa45ddd8 00000f70574d0008 0000000000000000 GPR20: 00000f7075c58d70 00000f7057459c38 0000000000000001 0000000000000040 GPR24: 0000000000000000 00000f7057473f68 0000000000000003 000000000000041b GPR28: 00007ffffa45d4c4 0000000000000035 0000000000000000 00000f7057473f68 NIP [c000000000003000] end_real_trampolines+0x0/0x1000 LR [00007ca426adb03c] 0x7ca426adb03c --- interrupt: 900 Instruction dump: 60000000 60000000 60420000 38600001 482b3ae5 60000000 e93f0138 a36d0008 7daa6b78 71290001 7f7907b4 4082fd34 <0fe00000> 4bfffd2c 60420000 ea6100a8 ---[ end trace dc75f67d819779da ]--- Fixes: `118178e62e` ("powerpc: move NMI entry/exit code into wrapper") Reported-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210503111708.758261-1-npiggin@gmail.com	2021-05-14 17:28:52 +10:00
Michael Ellerman	5b48ba2fbd	powerpc/64s: Fix stf mitigation patching w/strict RWX & hash The stf entry barrier fallback is unsafe to execute in a semi-patched state, which can happen when enabling/disabling the mitigation with strict kernel RWX enabled and using the hash MMU. See the previous commit for more details. Fix it by changing the order in which we patch the instructions. Note the stf barrier fallback is only used on Power6 or earlier. Fixes: `bd573a8131` ("powerpc/mm/64s: Allow STRICT_KERNEL_RWX again") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210513140800.1391706-2-mpe@ellerman.id.au	2021-05-14 17:27:37 +10:00
Michael Ellerman	49b39ec248	powerpc/64s: Fix entry flush patching w/strict RWX & hash The entry flush mitigation can be enabled/disabled at runtime. When this happens it results in the kernel patching its own instructions to enable/disable the mitigation sequence. With strict kernel RWX enabled instruction patching happens via a secondary mapping of the kernel text, so that we don't have to make the primary mapping writable. With the hash MMU this leads to a hash fault, which causes us to execute the exception entry which contains the entry flush mitigation. This means we end up executing the entry flush in a semi-patched state, ie. after we have patched the first instruction but before we patch the second or third instruction of the sequence. On machines with updated firmware the entry flush is a series of special nops, and it's safe to to execute in a semi-patched state. However when using the fallback flush the sequence is mflr/branch/mtlr, and so it's not safe to execute if we have patched out the mflr but not the other two instructions. Doing so leads to us corrputing LR, leading to an oops, for example: # echo 0 > /sys/kernel/debug/powerpc/entry_flush kernel tried to execute exec-protected page (c000000002971000) - exploit attempt? (uid: 0) BUG: Unable to handle kernel instruction fetch Faulting instruction address: 0xc000000002971000 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries CPU: 0 PID: 2215 Comm: bash Not tainted 5.13.0-rc1-00010-gda3bb206c9ce #1 NIP: c000000002971000 LR: c000000002971000 CTR: c000000000120c40 REGS: c000000013243840 TRAP: 0400 Not tainted (5.13.0-rc1-00010-gda3bb206c9ce) MSR: 8000000010009033 <SF,EE,ME,IR,DR,RI,LE> CR: 48428482 XER: 00000000 ... NIP 0xc000000002971000 LR 0xc000000002971000 Call Trace: do_patch_instruction+0xc4/0x340 (unreliable) do_entry_flush_fixups+0x100/0x3b0 entry_flush_set+0x50/0xe0 simple_attr_write+0x160/0x1a0 full_proxy_write+0x8c/0x110 vfs_write+0xf0/0x340 ksys_write+0x84/0x140 system_call_exception+0x164/0x2d0 system_call_common+0xec/0x278 The simplest fix is to change the order in which we patch the instructions, so that the sequence is always safe to execute. For the non-fallback flushes it doesn't matter what order we patch in. Fixes: `bd573a8131` ("powerpc/mm/64s: Allow STRICT_KERNEL_RWX again") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210513140800.1391706-1-mpe@ellerman.id.au	2021-05-14 17:27:36 +10:00
Michael Ellerman	aec86b052d	powerpc/64s: Fix crashes when toggling entry flush barrier The entry flush mitigation can be enabled/disabled at runtime via a debugfs file (entry_flush), which causes the kernel to patch itself to enable/disable the relevant mitigations. However depending on which mitigation we're using, it may not be safe to do that patching while other CPUs are active. For example the following crash: sleeper[15639]: segfault (11) at c000000000004c20 nip c000000000004c20 lr c000000000004c20 Shows that we returned to userspace with a corrupted LR that points into the kernel, due to executing the partially patched call to the fallback entry flush (ie. we missed the LR restore). Fix it by doing the patching under stop machine. The CPUs that aren't doing the patching will be spinning in the core of the stop machine logic. That is currently sufficient for our purposes, because none of the patching we do is to that code or anywhere in the vicinity. Fixes: `f79643787e` ("powerpc/64s: flush L1D on kernel entry") Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210506044959.1298123-2-mpe@ellerman.id.au	2021-05-14 17:27:36 +10:00
Michael Ellerman	8ec7791bae	powerpc/64s: Fix crashes when toggling stf barrier The STF (store-to-load forwarding) barrier mitigation can be enabled/disabled at runtime via a debugfs file (stf_barrier), which causes the kernel to patch itself to enable/disable the relevant mitigations. However depending on which mitigation we're using, it may not be safe to do that patching while other CPUs are active. For example the following crash: User access of kernel address (c00000003fff5af0) - exploit attempt? (uid: 0) segfault (11) at c00000003fff5af0 nip 7fff8ad12198 lr 7fff8ad121f8 code 1 code: 40820128 e93c00d0 e9290058 7c292840 40810058 38600000 4bfd9a81 e8410018 code: 2c030006 41810154 3860ffb6 e9210098 <e94d8ff0> 7d295279 39400000 40820a3c Shows that we returned to userspace without restoring the user r13 value, due to executing the partially patched STF exit code. Fix it by doing the patching under stop machine. The CPUs that aren't doing the patching will be spinning in the core of the stop machine logic. That is currently sufficient for our purposes, because none of the patching we do is to that code or anywhere in the vicinity. Fixes: `a048a07d7f` ("powerpc/64s: Add support for a store forwarding barrier at kernel entry/exit") Cc: stable@vger.kernel.org # v4.17+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210506044959.1298123-1-mpe@ellerman.id.au	2021-05-14 17:27:36 +10:00
Filipe Manana	54a40fc3a1	btrfs: fix removed dentries still existing after log is synced When we move one inode from one directory to another and both the inode and its previous parent directory were logged before, we are not supposed to have the dentry for the old parent if we have a power failure after the log is synced. Only the new dentry is supposed to exist. Generally this works correctly, however there is a scenario where this is not currently working, because the old parent of the file/directory that was moved is not authoritative for a range that includes the dir index and dir item keys of the old dentry. This case is better explained with the following example and reproducer: # The test requires a very specific layout of keys and items in the # fs/subvolume btree to trigger the bug. So we want to make sure that # on whatever platform we are, we have the same leaf/node size. # # Currently in btrfs the node/leaf size can not be smaller than the page # size (but it can be greater than the page size). So use the largest # supported node/leaf size (64K). $ mkfs.btrfs -f -n 65536 /dev/sdc $ mount /dev/sdc /mnt # "testdir" is inode 257. $ mkdir /mnt/testdir $ chmod 755 /mnt/testdir # Create several empty files to have the directory "testdir" with its # items spread over several leaves (7 in this case). $ for ((i = 1; i <= 1200; i++)); do echo -n > /mnt/testdir/file$i done # Create our test directory "dira", inode number 1458, which gets all # its items in leaf 7. # # The BTRFS_DIR_ITEM_KEY item for inode 257 ("testdir") that points to # the entry named "dira" is in leaf 2, while the BTRFS_DIR_INDEX_KEY # item that points to that entry is in leaf 3. # # For this particular filesystem node size (64K), file count and file # names, we endup with the directory entry items from inode 257 in # leaves 2 and 3, as previously mentioned - what matters for triggering # the bug exercised by this test case is that those items are not placed # in leaf 1, they must be placed in a leaf different from the one # containing the inode item for inode 257. # # The corresponding BTRFS_DIR_ITEM_KEY and BTRFS_DIR_INDEX_KEY items for # the parent inode (257) are the following: # # item 460 key (257 DIR_ITEM 3724298081) itemoff 48344 itemsize 34 # location key (1458 INODE_ITEM 0) type DIR # transid 6 data_len 0 name_len 4 # name: dira # # and: # # item 771 key (257 DIR_INDEX 1202) itemoff 36673 itemsize 34 # location key (1458 INODE_ITEM 0) type DIR # transid 6 data_len 0 name_len 4 # name: dira $ mkdir /mnt/testdir/dira # Make sure everything done so far is durably persisted. $ sync # Now do a change to inode 257 ("testdir") that does not result in # COWing leaves 2 and 3 - the leaves that contain the directory items # pointing to inode 1458 (directory "dira"). # # Changing permissions, the owner/group, updating or adding a xattr, # etc, will not change (COW) leaves 2 and 3. So for the sake of # simplicity change the permissions of inode 257, which results in # updating its inode item and therefore change (COW) only leaf 1. $ chmod 700 /mnt/testdir # Now fsync directory inode 257. # # Since only the first leaf was changed/COWed, we log the inode item of # inode 257 and only the dentries found in the first leaf, all have a # key type of BTRFS_DIR_ITEM_KEY, and no keys of type # BTRFS_DIR_INDEX_KEY, because they sort after the former type and none # exist in the first leaf. # # We also log 3 items that represent ranges for dir items and dir # indexes for which the log is authoritative: # # 1) a key of type BTRFS_DIR_LOG_ITEM_KEY, which indicates the log is # authoritative for all BTRFS_DIR_ITEM_KEY keys that have an offset # in the range [0, 2285968570] (the offset here is the crc32c of the # dentry's name). The value 2285968570 corresponds to the offset of # the first key of leaf 2 (which is of type BTRFS_DIR_ITEM_KEY); # # 2) a key of type BTRFS_DIR_LOG_ITEM_KEY, which indicates the log is # authoritative for all BTRFS_DIR_ITEM_KEY keys that have an offset # in the range [4293818216, (u64)-1] (the offset here is the crc32c # of the dentry's name). The value 4293818216 corresponds to the # offset of the highest key of type BTRFS_DIR_ITEM_KEY plus 1 # (4293818215 + 1), which is located in leaf 2; # # 3) a key of type BTRFS_DIR_LOG_INDEX_KEY, with an offset of 1203, # which indicates the log is authoritative for all keys of type # BTRFS_DIR_INDEX_KEY that have an offset in the range # [1203, (u64)-1]. The value 1203 corresponds to the offset of the # last key of type BTRFS_DIR_INDEX_KEY plus 1 (1202 + 1), which is # located in leaf 3; # # Also, because "testdir" is a directory and inode 1458 ("dira") is a # child directory, we log inode 1458 too. $ xfs_io -c "fsync" /mnt/testdir # Now move "dira", inode 1458, to be a child of the root directory # (inode 256). # # Because this inode was previously logged, when "testdir" was fsynced, # the log is updated so that the old inode reference, referring to inode # 257 as the parent, is deleted and the new inode reference, referring # to inode 256 as the parent, is added to the log. $ mv /mnt/testdir/dira /mnt # Now change some file and fsync it. This guarantees the log changes # made by the previous move/rename operation are persisted. We do not # need to do any special modification to the file, just any change to # any file and sync the log. $ xfs_io -c "pwrite -S 0xab 0 64K" -c "fsync" /mnt/testdir/file1 # Simulate a power failure and then mount again the filesystem to # replay the log tree. We want to verify that we are able to mount the # filesystem, meaning log replay was successful, and that directory # inode 1458 ("dira") only has inode 256 (the filesystem's root) as # its parent (and no longer a child of inode 257). # # It used to happen that during log replay we would end up having # inode 1458 (directory "dira") with 2 hard links, being a child of # inode 257 ("testdir") and inode 256 (the filesystem's root). This # resulted in the tree checker detecting the issue and causing the # mount operation to fail (with -EIO). # # This happened because in the log we have the new name/parent for # inode 1458, which results in adding the new dentry with inode 256 # as the parent, but the previous dentry, under inode 257 was never # removed - this is because the ranges for dir items and dir indexes # of inode 257 for which the log is authoritative do not include the # old dir item and dir index for the dentry of inode 257 referring to # inode 1458: # # - for dir items, the log is authoritative for the ranges # [0, 2285968570] and [4293818216, (u64)-1]. The dir item at inode 257 # pointing to inode 1458 has a key of (257 DIR_ITEM 3724298081), as # previously mentioned, so the dir item is not deleted when the log # replay procedure processes the authoritative ranges, as 3724298081 # is outside both ranges; # # - for dir indexes, the log is authoritative for the range # [1203, (u64)-1], and the dir index item of inode 257 pointing to # inode 1458 has a key of (257 DIR_INDEX 1202), as previously # mentioned, so the dir index item is not deleted when the log # replay procedure processes the authoritative range. <power failure> $ mount /dev/sdc /mnt mount: /mnt: can't read superblock on /dev/sdc. $ dmesg (...) [87849.840509] BTRFS info (device sdc): start tree-log replay [87849.875719] BTRFS critical (device sdc): corrupt leaf: root=5 block=30539776 slot=554 ino=1458, invalid nlink: has 2 expect no more than 1 for dir [87849.878084] BTRFS info (device sdc): leaf 30539776 gen 7 total ptrs 557 free space 2092 owner 5 [87849.879516] BTRFS info (device sdc): refs 1 lock_owner 0 current 2099108 [87849.880613] item 0 key (1181 1 0) itemoff 65275 itemsize 160 [87849.881544] inode generation 6 size 0 mode 100644 [87849.882692] item 1 key (1181 12 257) itemoff 65258 itemsize 17 (...) [87850.562549] item 556 key (1458 12 257) itemoff 16017 itemsize 14 [87850.563349] BTRFS error (device dm-0): block=30539776 write time tree block corruption detected [87850.564386] ------------[ cut here ]------------ [87850.564920] WARNING: CPU: 3 PID: 2099108 at fs/btrfs/disk-io.c:465 csum_one_extent_buffer+0xed/0x100 [btrfs] [87850.566129] Modules linked in: btrfs dm_zero dm_snapshot (...) [87850.573789] CPU: 3 PID: 2099108 Comm: mount Not tainted 5.12.0-rc8-btrfs-next-86 #1 (...) [87850.587481] Call Trace: [87850.587768] btree_csum_one_bio+0x244/0x2b0 [btrfs] [87850.588354] ? btrfs_bio_fits_in_stripe+0xd8/0x110 [btrfs] [87850.589003] btrfs_submit_metadata_bio+0xb7/0x100 [btrfs] [87850.589654] submit_one_bio+0x61/0x70 [btrfs] [87850.590248] submit_extent_page+0x91/0x2f0 [btrfs] [87850.590842] write_one_eb+0x175/0x440 [btrfs] [87850.591370] ? find_extent_buffer_nolock+0x1c0/0x1c0 [btrfs] [87850.592036] btree_write_cache_pages+0x1e6/0x610 [btrfs] [87850.592665] ? free_debug_processing+0x1d5/0x240 [87850.593209] do_writepages+0x43/0xf0 [87850.593798] ? __filemap_fdatawrite_range+0xa4/0x100 [87850.594391] __filemap_fdatawrite_range+0xc5/0x100 [87850.595196] btrfs_write_marked_extents+0x68/0x160 [btrfs] [87850.596202] btrfs_write_and_wait_transaction.isra.0+0x4d/0xd0 [btrfs] [87850.597377] btrfs_commit_transaction+0x794/0xca0 [btrfs] [87850.598455] ? _raw_spin_unlock_irqrestore+0x32/0x60 [87850.599305] ? kmem_cache_free+0x15a/0x3d0 [87850.600029] btrfs_recover_log_trees+0x346/0x380 [btrfs] [87850.601021] ? replay_one_extent+0x7d0/0x7d0 [btrfs] [87850.601988] open_ctree+0x13c9/0x1698 [btrfs] [87850.602846] btrfs_mount_root.cold+0x13/0xed [btrfs] [87850.603771] ? kmem_cache_alloc_trace+0x7c9/0x930 [87850.604576] ? vfs_parse_fs_string+0x5d/0xb0 [87850.605293] ? kfree+0x276/0x3f0 [87850.605857] legacy_get_tree+0x30/0x50 [87850.606540] vfs_get_tree+0x28/0xc0 [87850.607163] fc_mount+0xe/0x40 [87850.607695] vfs_kern_mount.part.0+0x71/0x90 [87850.608440] btrfs_mount+0x13b/0x3e0 [btrfs] (...) [87850.629477] ---[ end trace 68802022b99a1ea0 ]--- [87850.630849] BTRFS: error (device sdc) in btrfs_commit_transaction:2381: errno=-5 IO failure (Error while writing out transaction) [87850.632422] BTRFS warning (device sdc): Skipping commit of aborted transaction. [87850.633416] BTRFS: error (device sdc) in cleanup_transaction:1978: errno=-5 IO failure [87850.634553] BTRFS: error (device sdc) in btrfs_replay_log:2431: errno=-5 IO failure (Failed to recover log tree) [87850.637529] BTRFS error (device sdc): open_ctree failed In this example the inode we moved was a directory, so it was easy to detect the problem because directories can only have one hard link and the tree checker immediately detects that. If the moved inode was a file, then the log replay would succeed and we would end up having both the new hard link (/mnt/foo) and the old hard link (/mnt/testdir/foo) present, but only the new one should be present. Fix this by forcing re-logging of the old parent directory when logging the new name during a rename operation. This ensures we end up with a log that is authoritative for a range covering the keys for the old dentry, therefore causing the old dentry do be deleted when replaying the log. A test case for fstests will follow up soon. Fixes: `64d6b281ba` ("btrfs: remove unnecessary check_parent_dirs_for_sync()") CC: stable@vger.kernel.org # 5.12+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-14 01:23:04 +02:00
Boris Burkov	15c7745c9a	btrfs: return whole extents in fiemap `xfs_io -c 'fiemap <off> <len>' <file>` can give surprising results on btrfs that differ from xfs. btrfs prints out extents trimmed to fit the user input. If the user's fiemap request has an offset, then rather than returning each whole extent which intersects that range, we also trim the start extent to not have start < off. Documentation in filesystems/fiemap.txt and the xfs_io man page suggests that returning the whole extent is expected. Some cases which all yield the same fiemap in xfs, but not btrfs: dd if=/dev/zero of=$f bs=4k count=1 sudo xfs_io -c 'fiemap 0 1024' $f 0: [0..7]: 26624..26631 sudo xfs_io -c 'fiemap 2048 1024' $f 0: [4..7]: 26628..26631 sudo xfs_io -c 'fiemap 2048 4096' $f 0: [4..7]: 26628..26631 sudo xfs_io -c 'fiemap 3584 512' $f 0: [7..7]: 26631..26631 sudo xfs_io -c 'fiemap 4091 5' $f 0: [7..6]: 26631..26630 I believe this is a consequence of the logic for merging contiguous extents represented by separate extent items. That logic needs to track the last offset as it loops through the extent items, which happens to pick up the start offset on the first iteration, and trim off the beginning of the full extent. To fix it, start `off` at 0 rather than `start` so that we keep the iteration/merging intact without cutting off the start of the extent. after the fix, all the above commands give: 0: [0..7]: 26624..26631 The merging logic is exercised by fstest generic/483, and I have written a new fstest for checking we don't have backwards or zero-length fiemaps for cases like those above. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-14 01:23:00 +02:00
Josef Bacik	71795ee590	btrfs: avoid RCU stalls while running delayed iputs Generally a delayed iput is added when we might do the final iput, so usually we'll end up sleeping while processing the delayed iputs naturally. However there's no guarantee of this, especially for small files. In production we noticed 5 instances of RCU stalls while testing a kernel release overnight across 1000 machines, so this is relatively common: host count: 5 rcu: INFO: rcu_sched self-detected stall on CPU rcu: ....: (20998 ticks this GP) idle=59e/1/0x4000000000000002 softirq=12333372/12333372 fqs=3208 (t=21031 jiffies g=27810193 q=41075) NMI backtrace for cpu 1 CPU: 1 PID: 1713 Comm: btrfs-cleaner Kdump: loaded Not tainted 5.6.13-0_fbk12_rc1_5520_gec92bffc1ec9 #1 Call Trace: <IRQ> dump_stack+0x50/0x70 nmi_cpu_backtrace.cold.6+0x30/0x65 ? lapic_can_unplug_cpu.cold.30+0x40/0x40 nmi_trigger_cpumask_backtrace+0xba/0xca rcu_dump_cpu_stacks+0x99/0xc7 rcu_sched_clock_irq.cold.90+0x1b2/0x3a3 ? trigger_load_balance+0x5c/0x200 ? tick_sched_do_timer+0x60/0x60 ? tick_sched_do_timer+0x60/0x60 update_process_times+0x24/0x50 tick_sched_timer+0x37/0x70 __hrtimer_run_queues+0xfe/0x270 hrtimer_interrupt+0xf4/0x210 smp_apic_timer_interrupt+0x5e/0x120 apic_timer_interrupt+0xf/0x20 </IRQ> RIP: 0010:queued_spin_lock_slowpath+0x17d/0x1b0 RSP: 0018:ffffc9000da5fe48 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 RAX: 0000000000000000 RBX: ffff889fa81d0cd8 RCX: 0000000000000029 RDX: ffff889fff86c0c0 RSI: 0000000000080000 RDI: ffff88bfc2da7200 RBP: ffff888f2dcdd768 R08: 0000000001040000 R09: 0000000000000000 R10: 0000000000000001 R11: ffffffff82a55560 R12: ffff88bfc2da7200 R13: 0000000000000000 R14: ffff88bff6c2a360 R15: ffffffff814bd870 ? kzalloc.constprop.57+0x30/0x30 list_lru_add+0x5a/0x100 inode_lru_list_add+0x20/0x40 iput+0x1c1/0x1f0 run_delayed_iput_locked+0x46/0x90 btrfs_run_delayed_iputs+0x3f/0x60 cleaner_kthread+0xf2/0x120 kthread+0x10b/0x130 Fix this by adding a cond_resched_lock() to the loop processing delayed iputs so we can avoid these sort of stalls. CC: stable@vger.kernel.org # 4.9+ Reviewed-by: Rik van Riel <riel@surriel.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-14 01:22:53 +02:00
Johannes Thumshirn	d6f67afbdf	btrfs: return 0 for dev_extent_hole_check_zoned hole_start in case of error Commit `7000babdda` ("btrfs: assign proper values to a bool variable in dev_extent_hole_check_zoned") assigned false to the hole_start parameter of dev_extent_hole_check_zoned(). The hole_start parameter is not boolean and returns the start location of the found hole. Fixes: `7000babdda` ("btrfs: assign proper values to a bool variable in dev_extent_hole_check_zoned") Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2021-05-14 01:22:48 +02:00
Dave Airlie	08f0cfbf73	Merge tag 'amd-drm-fixes-5.13-2021-05-13' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-5.13-2021-05-13: amdgpu: - Fixes for flexible array conversions - Fix sysfs attribute init - Harvesting fixes - VCN CG/PG fixes for Picasso radeon: - Fixes for flexible array conversions - Fix for flickering on Oland with multiple 4K displays Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210513163228.3963-1-alexander.deucher@amd.com	2021-05-14 09:20:04 +10:00
Dave Airlie	1db7aa269a	A BO list maintainance fix for TTM, removing an unused function and a MAINTAINERS update. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCYJ0rMAAKCRDj7w1vZxhR xXy7AQCoRykDurhX833kRT3RuMZpxNdxVybC/P8buYVGtHwC3QD8D7yXL+DY8/mt iJGaVnb+ZBp8XsO034ROlVux/pB1DA8= =3uhR -----END PGP SIGNATURE----- Merge tag 'drm-misc-fixes-2021-05-13' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Remove an unused function and a MAINTAINERS update. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20210513133617.xq77wwrehpuh7yn2@hendrix	2021-05-14 09:19:38 +10:00
Greg Kroah-Hartman	27b57bb76a	Revert "Revert "ALSA: usx2y: Fix potential NULL pointer dereference"" This reverts commit `4667a6fc17`. Takashi writes: I have already started working on the bigger cleanup of this driver code based on 5.13-rc1, so could you drop this revert? I missed our previous discussion about this, my fault for applying it. Reported-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-05-13 21:40:38 +02:00
Linus Torvalds	315d993181	Power management fixes for 5.13-rc2 - Make intel_pstate work as expected on systems where the platform firmware enables HWP even though the HWP EPP support is not advertised (Rafael Wysocki). - Fix possible runtime PM child count imbalance that may occur if other runtime PM functions are called after invoking pm_runtime_force_suspend() and before pm_runtime_force_resume() is called (Tony Lindgren). -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmCddyMSHHJqd0Byand5 c29ja2kubmV0AAoJEILEb/54YlRxMRkP/jcq9A6eWOQoGBtT4vrIANt1/S+YcjNi PFNa8BIUN80lyA+em3v0t8KUcfFMQhvbXK+KDOMqE3PpzgxbVZvY5t7Lxr/dhtOI ZQRAuhcspt133hSlBzafonm1k/y2BVI4mEVivB+ElDHyY14Ll9vEmbvBcsFNJha7 E6DvSOBRSqDsFSUKIFnhz9FuFa/HaUcrcILTu4wRGcCGEt4kF+XrPaVvAwWveNfq 8TJJj3MK6QdpoVG+hXebzj7/A2iWlVymFoT/2+B71tQgdYabVbeFbkyV467h5IKz rLaiUtDnSPTletnbJ2WNphgY902LFHBPJ99fpnfk61BcHiDXmf7PP2z8uxIwU4S4 1+ggNbGwxOBjr3FRLmoZdg9lZG0zEf9GjcO78Ivq2u1pqz0UYWrSyGzl0Ye/DOmc 9bGZsZrf6vvL64Ye60VpW68bevH0oDhZ+84ihTIN+Clz6+8bJVLr3m4qGFdXd0gm 8kBb4n3UAT9VR4Wj/Joj/Ocv+lO82CeEi/Ti/xjA+27p7SFRm3manFLP9w4SllUQ ky6sywUATVjNi76WkIHk4XnCjAQjR83N41xwPX8WEXhHxwXGbeCG9LIjTZ5r2pVD Ps4FUeMB2q9E5NZmv1L2j+r79YdWWYtYY9D5kdoto6WXuJRR5OB1oKScBFNc9MS8 pFVLW8kPBNAj =0GZZ -----END PGP SIGNATURE----- Merge tag 'pm-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These close a coverage gap in the intel_pstate driver and fix runtime PM child count imbalance related to interactions with system-wide suspend. Specifics: - Make intel_pstate work as expected on systems where the platform firmware enables HWP even though the HWP EPP support is not advertised (Rafael Wysocki). - Fix possible runtime PM child count imbalance that may occur if other runtime PM functions are called after invoking pm_runtime_force_suspend() and before pm_runtime_force_resume() is called (Tony Lindgren)" * tag 'pm-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: runtime: Fix unpaired parent child_count for force_resume cpufreq: intel_pstate: Use HWP if enabled by platform firmware	2021-05-13 12:28:10 -07:00
Linus Torvalds	2df38a8e9b	ACPI fixes for 5.13-rc2 - Revert a revert of a recent ACPI power management change that does not need to be reverted after all (Rafael Wysocki). - Add missing fan device ID to the list of device IDs for which the devices should not be put into the ACPI PM domain (Sumeet Pawnikar). - Fix possible memory leak in an error path in the ACPI device enumeration code (Christophe JAILLET). -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmCddrASHHJqd0Byand5 c29ja2kubmV0AAoJEILEb/54YlRxaz8P/iYD+HR24PG8tTDrPlHYJEAsR7ZPLTDM XzxI5yt6rnN+lfhS/y5HZnQZJ89GonIm+MqAMiaL9TludHi5uzE0a66LkpWfrRGH z7C507sXTF+jbId5xPhwx0+++EDrYBTevdPLvRHwWVh2MOvxpEw7ja5qyTIpwhth z+q+F2KYFU/jpOSXkdTtaKxi6ncHhSpT3NDZ0UCpIPd7kTQ6KBjj1CfQoTMj2y3/ lsctuAd4gUCwiPWhSQYwRtABkdxhqRd/zRs8Bo61M6FH1aTo5NcjGR6lHZjeaC1Z ObMkEEz3/Dq+zFB4oCY6ehaOOBnO6DsNmtoNxzENLL1adDO5qfDGNrAN8qXS+uhk pMhwzw0yPR8YVFmxZ0wP1cHxKr2lnqm3qomyc+MMyifN+xXjo3xFqREoonkp92AD QQw1apfd/HOlsSLVWrKViJzOCgTRtJjAnHr4DVZmjTP5I46rg206pthi8PSlXm2j AyRASrrkQbjDM7M/5Enc+ztZ6GHQc0DHu9awQ9ezt0ONMRI2UZ/yY/4Tra+4mWOO 1/tCjxjup0Z+tz0/ZuTbXgcFYn6Gejjcq4sJzor236mWXUwAi+byni9j0W5GuVTw fFs3WtSehVBiukweLI4IZJXpMxoGZ9p+weKeURJ5brfbL/oT+cfbnPW3Kqdv5wD3 qPdzAz7Bem1s =N/vn -----END PGP SIGNATURE----- Merge tag 'acpi-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These revert an unnecessary revert of an ACPI power management commit, add a missing device ID to one of the lists and fix a possible memory leak in an error path. Specifics: - Revert a revert of a recent ACPI power management change that does not need to be reverted after all (Rafael Wysocki). - Add missing fan device ID to the list of device IDs for which the devices should not be put into the ACPI PM domain (Sumeet Pawnikar). - Fix possible memory leak in an error path in the ACPI device enumeration code (Christophe JAILLET)" * tag 'acpi-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: PM: Add ACPI ID of Alder Lake Fan ACPI: scan: Fix a memory leak in an error handling path Revert "Revert "ACPI: scan: Turn off unused power resources during initialization""	2021-05-13 12:22:01 -07:00
Mikulas Patocka	bc8f3d4647	dm integrity: fix sparse warnings Use the types __le* instead of __u* to fix sparse warnings. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2021-05-13 14:53:49 -04:00
Mikulas Patocka	dbae70d452	dm integrity: revert to not using discard filler when recalulating Revert the commit `7a5b96b478` ("dm integrity: use discard support when recalculating"). There's a bug that when we write some data beyond the current recalculate boundary, the checksum will be rewritten with the discard filler later. And the data will no longer have integrity protection. There's no easy fix for this case. Also, another problematic case is if dm-integrity is used to detect bitrot (random device errors, bit flips, etc); dm-integrity should detect that even for unused sectors. With commit `7a5b96b478` it can happen that such change is undetected (because discard filler is not a valid checksum). Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2021-05-13 14:53:48 -04:00
Jim Cromie	a3626bcf5f	dyndbg: drop uninformative vpr_info Remove a vpr_info which I added in 2012, when I knew even less than now. In 2020, a simpler pr_fmt stripped it of context, and any remaining value. no functional change. Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20210504222235.1033685-3-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-05-13 20:50:23 +02:00
Jim Cromie	640d1eaff2	dyndbg: avoid calling dyndbg_emit_prefix when it has no work Wrap function in a static-inline one, which checks flags to avoid calling the function unnecessarily. And hoist its output-buffer initialization to the grand-caller, which is already allocating the buffer on the stack, and can trivially initialize it too. Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20210504222235.1033685-2-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-05-13 20:50:23 +02:00
Mikulas Patocka	c699a0db2d	dm snapshot: fix crash with transient storage and zero chunk size The following commands will crash the kernel: modprobe brd rd_size=1048576 dmsetup create o --table "0 `blockdev --getsize /dev/ram0` snapshot-origin /dev/ram0" dmsetup create s --table "0 `blockdev --getsize /dev/ram0` snapshot /dev/ram0 /dev/ram1 N 0" The reason is that when we test for zero chunk size, we jump to the label bad_read_metadata without setting the "r" variable. The function snapshot_ctr destroys all the structures and then exits with "r == 0". The kernel then crashes because it falsely believes that snapshot_ctr succeeded. In order to fix the bug, we set the variable "r" to -EINVAL. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2021-05-13 14:42:52 -04:00
Rafael J. Wysocki	fd38651716	Merge branch 'acpi-pm' * acpi-pm: ACPI: PM: Add ACPI ID of Alder Lake Fan Revert "Revert "ACPI: scan: Turn off unused power resources during initialization""	2021-05-13 20:39:58 +02:00
Rafael J. Wysocki	78a6948bba	Merge branch 'pm-core' * pm-core: PM: runtime: Fix unpaired parent child_count for force_resume	2021-05-13 20:39:07 +02:00
Luca Stefani	ced081a436	binder: Return EFAULT if we fail BINDER_ENABLE_ONEWAY_SPAM_DETECTION All the other ioctl paths return EFAULT in case the copy_from_user/copy_to_user call fails, make oneway spam detection follow the same paradigm. Fixes: `a7dc1e6f99` ("binder: tell userspace to dump current backtrace when detected oneway spamming") Acked-by: Todd Kjos <tkjos@google.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com> Link: https://lore.kernel.org/r/20210506193726.45118-1-luca.stefani.ge1@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-05-13 20:35:26 +02:00
Steven Rostedt (VMware)	eb01f5353b	tracing: Handle %.s in trace_check_vprintf() If a trace event uses the %.s notation, the trace_check_vprintf() will fail and will warn about a bad processing of strings, because it does not take into account the length field when processing the star (*) part. Have it handle this case as well. Link: https://lore.kernel.org/linux-nfs/238C0E2D-C2A4-4578-ADD2-C565B3B99842@oracle.com/ Reported-by: Chuck Lever III <chuck.lever@oracle.com> Fixes: `9a6944fee6` ("tracing: Add a verifier to check string pointers for trace events") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2021-05-13 14:20:33 -04:00
Linus Torvalds	adc12a7407	Merge branch 'resizex' (patches from Maciej) Merge VT_RESIZEX fixes from Maciej Rozycki: "I got to the bottom of the issue with VT_RESIZEX recently discussed and came up with this small patch series, fixing an additional issue that I originally thought might be broken VGA hardware emulation with my laptop, which however turned out to be intertwined with the original problem and also a regression introduced somewhat later. The fix for that because the first patch, and then to make backporting feasible I had to put a revert of the offending change from last September next, followed by a proper fix for the framebuffer issue that change had tried to address. See individual change descriptions for details. These have been verified with true VGA hardware (a Trident TVGA8900 ISA video adapter) using various combinations of `svgatextmode' and `setfont' command invocations to change both the VT size and the font size, and also switching between the text console and X11, both by starting/stopping the X server and by switching between VTs. All this to ensure bringing the behaviour of VGA text console back to correct operation as it used to be with Linux 2.6.18" * emailed patches from Maciej W. Rozycki <macro@orcam.me.uk>: vt: Fix character height handling with VT_RESIZEX vt_ioctl: Revert VT_RESIZEX parameter handling removal vgacon: Record video mode changes with VT_RESIZEX	2021-05-13 11:12:51 -07:00
Maciej W. Rozycki	860dafa902	vt: Fix character height handling with VT_RESIZEX Restore the original intent of the VT_RESIZEX ioctl's `v_clin' parameter which is the number of pixel rows per character (cell) rather than the height of the font used. For framebuffer devices the two values are always the same, because the former is inferred from the latter one. For VGA used as a true text mode device these two parameters are independent from each other: the number of pixel rows per character is set in the CRT controller, while font height is in fact hardwired to 32 pixel rows and fonts of heights below that value are handled by padding their data with blanks when loaded to hardware for use by the character generator. One can change the setting in the CRT controller and it will update the screen contents accordingly regardless of the font loaded. The `v_clin' parameter is used by the `vgacon' driver to set the height of the character cell and then the cursor position within. Make the parameter explicit then, by defining a new `vc_cell_height' struct member of `vc_data', set it instead of `vc_font.height' from `v_clin' in the VT_RESIZEX ioctl, and then use it throughout the `vgacon' driver except where actual font data is accessed which as noted above is independent from the CRTC setting. This way the framebuffer console driver is free to ignore the `v_clin' parameter as irrelevant, as it always should have, avoiding any issues attempts to give the parameter a meaning there could have caused, such as one that has led to commit `988d076336` ("vt_ioctl: make VT_RESIZEX behave like VT_RESIZE"): "syzbot is reporting UAF/OOB read at bit_putcs()/soft_cursor() [1][2], for vt_resizex() from ioctl(VT_RESIZEX) allows setting font height larger than actual font height calculated by con_font_set() from ioctl(PIO_FONT). Since fbcon_set_font() from con_font_set() allocates minimal amount of memory based on actual font height calculated by con_font_set(), use of vt_resizex() can cause UAF/OOB read for font data." The problem first appeared around Linux 2.5.66 which predates our repo history, but the origin could be identified with the old MIPS/Linux repo also at: <git://git.kernel.org/pub/scm/linux/kernel/git/ralf/linux.git> as commit 9736a3546de7 ("Merge with Linux 2.5.66."), where VT_RESIZEX code in `vt_ioctl' was updated as follows: if (clin) - video_font_height = clin; + vc->vc_font.height = clin; making the parameter apply to framebuffer devices as well, perhaps due to the use of "font" in the name of the original `video_font_height' variable. Use "cell" in the new struct member then to avoid ambiguity. References: [1] https://syzkaller.appspot.com/bug?id=32577e96d88447ded2d3b76d71254fb855245837 [2] https://syzkaller.appspot.com/bug?id=6b8355d27b2b94fb5cedf4655e3a59162d9e48e3 Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org # v2.6.12+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-05-13 11:00:18 -07:00
Maciej W. Rozycki	a90c275eb1	vt_ioctl: Revert VT_RESIZEX parameter handling removal Revert the removal of code handling extra VT_RESIZEX ioctl's parameters beyond those that VT_RESIZE supports, fixing a functional regression causing `svgatextmode' not to resize the VT anymore. As a consequence of the reverted change when the video adapter is reprogrammed from the original say 80x25 text mode using a 9x16 character cell (720x400 pixel resolution) to say 80x37 text mode and the same character cell (720x592 pixel resolution), the VT geometry does not get updated and only upper two thirds of the screen are used for the VT, and the lower part remains blank. The proportions change according to text mode geometries chosen. Revert the change verbatim then, bringing back previous VT resizing. Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Fixes: `988d076336` ("vt_ioctl: make VT_RESIZEX behave like VT_RESIZE") Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-05-13 11:00:18 -07:00

... 3 4 5 6 7 ...

1014051 Commits All Branches Search

1014051 Commits

All Branches