OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Laurent Pinchart	1967c51fa4	drm/panel: ilitek-ili9881c: Fix warning with GPIO controllers that sleep [ Upstream commit ee7860cd8b5763017f8dc785c2851fecb7a0c565 ] The ilitek-ili9881c controls the reset GPIO using the non-sleeping gpiod_set_value() function. This complains loudly when the GPIO controller needs to sleep. As the caller can sleep, use gpiod_set_value_cansleep() to fix the issue. This fixes CVE-2024-42087 Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20240317154839.21260-1-laurent.pinchart@ideasonboard.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240317154839.21260-1-laurent.pinchart@ideasonboard.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Huang Cun <cunhuang@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:09:19 +08:00
Hagar Hemdan	a2e47aed04	pinctrl: fix deadlock in create_pinctrl() when handling -EPROBE_DEFER [ Upstream commit adec57ff8e66aee632f3dd1f93787c13d112b7a1 ] In create_pinctrl(), pinctrl_maps_mutex is acquired before calling add_setting(). If add_setting() returns -EPROBE_DEFER, create_pinctrl() calls pinctrl_free(). However, pinctrl_free() attempts to acquire pinctrl_maps_mutex, which is already held by create_pinctrl(), leading to a potential deadlock. This patch resolves the issue by releasing pinctrl_maps_mutex before calling pinctrl_free(), preventing the deadlock. This bug was discovered and resolved using Coverity Static Analysis Security Testing (SAST) by Synopsys, Inc. This fixes CVE-2024-42090 Fixes: `42fed7ba44` ("pinctrl: move subsystem mutex to pinctrl_dev struct") Suggested-by: Maximilian Heyne <mheyne@amazon.de> Signed-off-by: Hagar Hemdan <hagarhem@amazon.com> Link: https://lore.kernel.org/r/20240604085838.3344-1-hagarhem@amazon.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Huang Cun <cunhuang@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:09:19 +08:00
Vasileios Amoiridis	3b7d09e9a6	iio: chemical: bme680: Fix overflows in compensate() functions commit fdd478c3ae98c3f13628e110dce9b6cfb0d9b3c8 upstream. There are cases in the compensate functions of the driver that there could be overflows of variables due to bit shifting ops. These implications were initially discussed here [1] and they were mentioned in log message of Commit `1b3bd85927` ("iio: chemical: Add support for Bosch BME680 sensor"). [1]: https://lore.kernel.org/linux-iio/20180728114028.3c1bbe81@archlinux/ This fixes CVE-2024-42086 Fixes: `1b3bd85927` ("iio: chemical: Add support for Bosch BME680 sensor") Signed-off-by: Vasileios Amoiridis <vassilisamir@gmail.com> Link: https://lore.kernel.org/r/20240606212313.207550-4-vassilisamir@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Huang Cun <cunhuang@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:09:19 +08:00
Ma Ke	789a7a4e5b	drm/nouveau/dispnv04: fix null pointer dereference in nv17_tv_get_ld_modes commit 66edf3fb331b6c55439b10f9862987b0916b3726 upstream. In nv17_tv_get_ld_modes(), the return value of drm_mode_duplicate() is assigned to mode, which will lead to a possible NULL pointer dereference on failure of drm_mode_duplicate(). Add a check to avoid npd. This fixes CVE-2024-41095 Cc: stable@vger.kernel.org Signed-off-by: Ma Ke <make24@iscas.ac.cn> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240625081828.2620794-1-make24@iscas.ac.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Huang Cun <cunhuang@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:09:19 +08:00
Niklas Cassel	29aa9f89a4	ata: libata-core: Fix null pointer dereference on error [ Upstream commit 5d92c7c566dc76d96e0e19e481d926bbe6631c1e ] If the ata_port_alloc() call in ata_host_alloc() fails, ata_host_release() will get called. However, the code in ata_host_release() tries to free ata_port struct members unconditionally, which can lead to the following: BUG: unable to handle page fault for address: 0000000000003990 PGD 0 P4D 0 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 10 PID: 594 Comm: (udev-worker) Not tainted 6.10.0-rc5 #44 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014 RIP: 0010:ata_host_release.cold+0x2f/0x6e [libata] Code: e4 4d 63 f4 44 89 e2 48 c7 c6 90 ad 32 c0 48 c7 c7 d0 70 33 c0 49 83 c6 0e 41 RSP: 0018:ffffc90000ebb968 EFLAGS: 00010246 RAX: 0000000000000041 RBX: ffff88810fb52e78 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff88813b3218c0 RDI: ffff88813b3218c0 RBP: ffff88810fb52e40 R08: 0000000000000000 R09: 6c65725f74736f68 R10: ffffc90000ebb738 R11: 73692033203a746e R12: 0000000000000004 R13: 0000000000000000 R14: 0000000000000011 R15: 0000000000000006 FS: 00007f6cc55b9980(0000) GS:ffff88813b300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000003990 CR3: 00000001122a2000 CR4: 0000000000750ef0 PKRU: 55555554 Call Trace: <TASK> ? __die_body.cold+0x19/0x27 ? page_fault_oops+0x15a/0x2f0 ? exc_page_fault+0x7e/0x180 ? asm_exc_page_fault+0x26/0x30 ? ata_host_release.cold+0x2f/0x6e [libata] ? ata_host_release.cold+0x2f/0x6e [libata] release_nodes+0x35/0xb0 devres_release_group+0x113/0x140 ata_host_alloc+0xed/0x120 [libata] ata_host_alloc_pinfo+0x14/0xa0 [libata] ahci_init_one+0x6c9/0xd20 [ahci] Do not access ata_port struct members unconditionally. This fixes CVE-2024-41098 Fixes: `633273a3ed` ("libata-pmp: hook PMP support and enable it") Cc: stable@vger.kernel.org Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20240629124210.181537-7-cassel@kernel.org Signed-off-by: Niklas Cassel <cassel@kernel.org> Stable-dep-of: f6549f538fe0 ("ata,scsi: libata-core: Do not leak memory for ata_port struct members") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Huang Cun <cunhuang@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:09:19 +08:00
Nikita Zhandarovich	e9289a8a33	usb: atm: cxacru: fix endpoint checking in cxacru_bind() commit 2eabb655a968b862bc0c31629a09f0fbf3c80d51 upstream. Syzbot is still reporting quite an old issue [1] that occurs due to incomplete checking of present usb endpoints. As such, wrong endpoints types may be used at urb sumbitting stage which in turn triggers a warning in usb_submit_urb(). Fix the issue by verifying that required endpoint types are present for both in and out endpoints, taking into account cmd endpoint type. Unfortunately, this patch has not been tested on real hardware. [1] Syzbot report: usb 1-1: BOGUS urb xfer, pipe 1 != type 3 WARNING: CPU: 0 PID: 8667 at drivers/usb/core/urb.c:502 usb_submit_urb+0xed2/0x18a0 drivers/usb/core/urb.c:502 Modules linked in: CPU: 0 PID: 8667 Comm: kworker/0:4 Not tainted 5.14.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: usb_hub_wq hub_event RIP: 0010:usb_submit_urb+0xed2/0x18a0 drivers/usb/core/urb.c:502 ... Call Trace: cxacru_cm+0x3c0/0x8e0 drivers/usb/atm/cxacru.c:649 cxacru_card_status+0x22/0xd0 drivers/usb/atm/cxacru.c:760 cxacru_bind+0x7ac/0x11a0 drivers/usb/atm/cxacru.c:1209 usbatm_usb_probe+0x321/0x1ae0 drivers/usb/atm/usbatm.c:1055 cxacru_usb_probe+0xdf/0x1e0 drivers/usb/atm/cxacru.c:1363 usb_probe_interface+0x315/0x7f0 drivers/usb/core/driver.c:396 call_driver_probe drivers/base/dd.c:517 [inline] really_probe+0x23c/0xcd0 drivers/base/dd.c:595 __driver_probe_device+0x338/0x4d0 drivers/base/dd.c:747 driver_probe_device+0x4c/0x1a0 drivers/base/dd.c:777 __device_attach_driver+0x20b/0x2f0 drivers/base/dd.c:894 bus_for_each_drv+0x15f/0x1e0 drivers/base/bus.c:427 __device_attach+0x228/0x4a0 drivers/base/dd.c:965 bus_probe_device+0x1e4/0x290 drivers/base/bus.c:487 device_add+0xc2f/0x2180 drivers/base/core.c:3354 usb_set_configuration+0x113a/0x1910 drivers/usb/core/message.c:2170 usb_generic_driver_probe+0xba/0x100 drivers/usb/core/generic.c:238 usb_probe_device+0xd9/0x2c0 drivers/usb/core/driver.c:293 This fixes CVE-2024-41097 Reported-and-tested-by: syzbot+00c18ee8497dd3be6ade@syzkaller.appspotmail.com Fixes: `902ffc3c70` ("USB: cxacru: Use a bulk/int URB to access the command endpoint") Cc: stable <stable@kernel.org> Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru> Link: https://lore.kernel.org/r/20240609131546.3932-1-n.zhandarovich@fintech.ru Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Huang Cun <cunhuang@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:09:19 +08:00
Ma Ke	46d31bc9c5	drm/nouveau/dispnv04: fix null pointer dereference in nv17_tv_get_hd_modes commit 6d411c8ccc0137a612e0044489030a194ff5c843 upstream. In nv17_tv_get_hd_modes(), the return value of drm_mode_duplicate() is assigned to mode, which will lead to a possible NULL pointer dereference on failure of drm_mode_duplicate(). The same applies to drm_cvt_mode(). Add a check to avoid null pointer dereference. This fixes CVE-2024-41089 Cc: stable@vger.kernel.org Signed-off-by: Ma Ke <make24@iscas.ac.cn> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240625081029.2619437-1-make24@iscas.ac.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Huang Cun <cunhuang@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:09:18 +08:00
Dirk Behme	40e7251e07	drivers: core: synchronize really_probe() and dev_uevent() commit c0a40097f0bc81deafc15f9195d1fb54595cd6d0 upstream. Synchronize the dev->driver usage in really_probe() and dev_uevent(). These can run in different threads, what can result in the following race condition for dev->driver uninitialization: Thread #1: ========== really_probe() { ... probe_failed: ... device_unbind_cleanup(dev) { ... dev->driver = NULL; // <= Failed probe sets dev->driver to NULL ... } ... } Thread #2: ========== dev_uevent() { ... if (dev->driver) // If dev->driver is NULLed from really_probe() from here on, // after above check, the system crashes add_uevent_var(env, "DRIVER=%s", dev->driver->name); ... } really_probe() holds the lock, already. So nothing needs to be done there. dev_uevent() is called with lock held, often, too. But not always. What implies that we can't add any locking in dev_uevent() itself. So fix this race by adding the lock to the non-protected path. This is the path where above race is observed: dev_uevent+0x235/0x380 uevent_show+0x10c/0x1f0 <= Add lock here dev_attr_show+0x3a/0xa0 sysfs_kf_seq_show+0x17c/0x250 kernfs_seq_show+0x7c/0x90 seq_read_iter+0x2d7/0x940 kernfs_fop_read_iter+0xc6/0x310 vfs_read+0x5bc/0x6b0 ksys_read+0xeb/0x1b0 __x64_sys_read+0x42/0x50 x64_sys_call+0x27ad/0x2d30 do_syscall_64+0xcd/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f Similar cases are reported by syzkaller in https://syzkaller.appspot.com/bug?extid=ffa8143439596313a85a But these are regarding the initialization of dev->driver dev->driver = drv; As this switches dev->driver to non-NULL these reports can be considered to be false-positives (which should be "fixed" by this commit, as well, though). The same issue was reported and tried to be fixed back in 2015 in https://lore.kernel.org/lkml/1421259054-2574-1-git-send-email-a.sangwan@samsung.com/ already. This fixes CVE-2024-39501 Fixes: `239378f16a` ("Driver core: add uevent vars for devices of a class") Cc: stable <stable@kernel.org> Cc: syzbot+ffa8143439596313a85a@syzkaller.appspotmail.com Cc: Ashish Sangwan <a.sangwan@samsung.com> Cc: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com> Link: https://lore.kernel.org/r/20240513050634.3964461-1-dirk.behme@de.bosch.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Huang Cun <cunhuang@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:09:18 +08:00
Sicong Huang	aa39a529d1	greybus: Fix use-after-free bug in gb_interface_release due to race condition. commit 5c9c5d7f26acc2c669c1dcf57d1bb43ee99220ce upstream. In gb_interface_create, &intf->mode_switch_completion is bound with gb_interface_mode_switch_work. Then it will be started by gb_interface_request_mode_switch. Here is the relevant code. if (!queue_work(system_long_wq, &intf->mode_switch_work)) { ... } If we call gb_interface_release to make cleanup, there may be an unfinished work. This function will call kfree to free the object "intf". However, if gb_interface_mode_switch_work is scheduled to run after kfree, it may cause use-after-free error as gb_interface_mode_switch_work will use the object "intf". The possible execution flow that may lead to the issue is as follows: CPU0 CPU1 \| gb_interface_create \| gb_interface_request_mode_switch gb_interface_release \| kfree(intf) (free) \| \| gb_interface_mode_switch_work \| mutex_lock(&intf->mutex) (use) Fix it by canceling the work before kfree. This fixes CVE-2024-39495 Signed-off-by: Sicong Huang <congei42@163.com> Link: https://lore.kernel.org/r/20240416080313.92306-1-congei42@163.com Cc: Ronnie Sahlberg <rsahlberg@ciq.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Huang Cun <cunhuang@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:09:18 +08:00
Ding Hui	43ab28ee6f	SUNRPC: Fix UAF in svc_tcp_listen_data_ready() commit `fc80fc2d4e` upstream. After the listener svc_sock is freed, and before invoking svc_tcp_accept() for the established child sock, there is a window that the newsock retaining a freed listener svc_sock in sk_user_data which cloning from parent. In the race window, if data is received on the newsock, we will observe use-after-free report in svc_tcp_listen_data_ready(). Reproduce by two tasks: 1. while :; do rpc.nfsd 0 ; rpc.nfsd; done 2. while :; do echo "" \| ncat -4 127.0.0.1 2049 ; done KASAN report: ================================================================== BUG: KASAN: slab-use-after-free in svc_tcp_listen_data_ready+0x1cf/0x1f0 [sunrpc] Read of size 8 at addr ffff888139d96228 by task nc/102553 CPU: 7 PID: 102553 Comm: nc Not tainted 6.3.0+ #18 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 Call Trace: <IRQ> dump_stack_lvl+0x33/0x50 print_address_description.constprop.0+0x27/0x310 print_report+0x3e/0x70 kasan_report+0xae/0xe0 svc_tcp_listen_data_ready+0x1cf/0x1f0 [sunrpc] tcp_data_queue+0x9f4/0x20e0 tcp_rcv_established+0x666/0x1f60 tcp_v4_do_rcv+0x51c/0x850 tcp_v4_rcv+0x23fc/0x2e80 ip_protocol_deliver_rcu+0x62/0x300 ip_local_deliver_finish+0x267/0x350 ip_local_deliver+0x18b/0x2d0 ip_rcv+0x2fb/0x370 __netif_receive_skb_one_core+0x166/0x1b0 process_backlog+0x24c/0x5e0 __napi_poll+0xa2/0x500 net_rx_action+0x854/0xc90 __do_softirq+0x1bb/0x5de do_softirq+0xcb/0x100 </IRQ> <TASK> ... </TASK> Allocated by task 102371: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 __kasan_kmalloc+0x7b/0x90 svc_setup_socket+0x52/0x4f0 [sunrpc] svc_addsock+0x20d/0x400 [sunrpc] __write_ports_addfd+0x209/0x390 [nfsd] write_ports+0x239/0x2c0 [nfsd] nfsctl_transaction_write+0xac/0x110 [nfsd] vfs_write+0x1c3/0xae0 ksys_write+0xed/0x1c0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc Freed by task 102551: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 kasan_save_free_info+0x2a/0x50 __kasan_slab_free+0x106/0x190 __kmem_cache_free+0x133/0x270 svc_xprt_free+0x1e2/0x350 [sunrpc] svc_xprt_destroy_all+0x25a/0x440 [sunrpc] nfsd_put+0x125/0x240 [nfsd] nfsd_svc+0x2cb/0x3c0 [nfsd] write_threads+0x1ac/0x2a0 [nfsd] nfsctl_transaction_write+0xac/0x110 [nfsd] vfs_write+0x1c3/0xae0 ksys_write+0xed/0x1c0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc Fix the UAF by simply doing nothing in svc_tcp_listen_data_ready() if state != TCP_LISTEN, that will avoid dereferencing svsk for all child socket. Link: https://lore.kernel.org/lkml/20230507091131.23540-1-dinghui@sangfor.com.cn/ Fixes: `fa9251afc3` ("SUNRPC: Call the default socket callbacks instead of open coding") Signed-off-by: Ding Hui <dinghui@sangfor.com.cn> Cc: <stable@vger.kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:08:55 +08:00
Zhihao Cheng	aecbc001c6	Revert "ubifs: ubifs_symlink: Fix memleak of inode->i_link in error path" [ Upstream commit 7bed61a1cf166b5c113047fc8f60ff22dcb04893 ] Fix CVE: CVE-2024-26972 This reverts commit 6379b44cdcd67f5f5d986b73953e99700591edfa. Commit 1e022216dcd2 ("ubifs: ubifs_symlink: Fix memleak of inode->i_link in error path") is applied again in commit 6379b44cdcd6 ("ubifs: ubifs_symlink: Fix memleak of inode->i_link in error path"), which changed ubifs_mknod (It won't become a real problem). Just revert it. Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Haisu Wang <haisuwang@tencent.com>	2024-11-28 15:08:38 +08:00
Zhihao Cheng	8f6794c1ee	ubifs: ubifs_symlink: Fix memleak of inode->i_link in error path [ Upstream commit 1e022216dcd248326a5bb95609d12a6815bca4e2 ] Fix CVE: CVE-2024-26972 For error handling path in ubifs_symlink(), inode will be marked as bad first, then iput() is invoked. If inode->i_link is initialized by fscrypt_encrypt_symlink() in encryption scenario, inode->i_link won't be freed by callchain ubifs_free_inode -> fscrypt_free_inode in error handling path, because make_bad_inode() has changed 'inode->i_mode' as 'S_IFREG'. Following kmemleak is easy to be reproduced by injecting error in ubifs_jnl_update() when doing symlink in encryption scenario: unreferenced object 0xffff888103da3d98 (size 8): comm "ln", pid 1692, jiffies 4294914701 (age 12.045s) backtrace: kmemdup+0x32/0x70 __fscrypt_encrypt_symlink+0xed/0x1c0 ubifs_symlink+0x210/0x300 [ubifs] vfs_symlink+0x216/0x360 do_symlinkat+0x11a/0x190 do_syscall_64+0x3b/0xe0 There are two ways fixing it: 1. Remove make_bad_inode() in error handling path. We can do that because ubifs_evict_inode() will do same processes for good symlink inode and bad symlink inode, for inode->i_nlink checking is before is_bad_inode(). 2. Free inode->i_link before marking inode bad. Method 2 is picked, it has less influence, personally, I think. Cc: stable@vger.kernel.org Fixes: `2c58d548f5` ("fscrypt: cache decrypted symlink target in ->i_link") Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Suggested-by: Eric Biggers <ebiggers@kernel.org> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Haisu Wang <haisuwang@tencent.com>	2024-11-28 15:08:38 +08:00
Ryusuke Konishi	df2791979b	nilfs2: fix failure to detect DAT corruption in btree and direct mappings [ Upstream commit f2f26b4a84a0ef41791bd2d70861c8eac748f4ba ] Fix CVE: CVE-2024-26956 Patch series "nilfs2: fix kernel bug at submit_bh_wbc()". This resolves a kernel BUG reported by syzbot. Since there are two flaws involved, I've made each one a separate patch. The first patch alone resolves the syzbot-reported bug, but I think both fixes should be sent to stable, so I've tagged them as such. This patch (of 2): Syzbot has reported a kernel bug in submit_bh_wbc() when writing file data to a nilfs2 file system whose metadata is corrupted. There are two flaws involved in this issue. The first flaw is that when nilfs_get_block() locates a data block using btree or direct mapping, if the disk address translation routine nilfs_dat_translate() fails with internal code -ENOENT due to DAT metadata corruption, it can be passed back to nilfs_get_block(). This causes nilfs_get_block() to misidentify an existing block as non-existent, causing both data block lookup and insertion to fail inconsistently. The second flaw is that nilfs_get_block() returns a successful status in this inconsistent state. This causes the caller __block_write_begin_int() or others to request a read even though the buffer is not mapped, resulting in a BUG_ON check for the BH_Mapped flag in submit_bh_wbc() failing. This fixes the first issue by changing the return value to code -EINVAL when a conversion using DAT fails with code -ENOENT, avoiding the conflicting condition that leads to the kernel bug described above. Here, code -EINVAL indicates that metadata corruption was detected during the block lookup, which will be properly handled as a file system error and converted to -EIO when passing through the nilfs2 bmap layer. Link: https://lkml.kernel.org/r/20240313105827.5296-1-konishi.ryusuke@gmail.com Link: https://lkml.kernel.org/r/20240313105827.5296-2-konishi.ryusuke@gmail.com Fixes: `c3a7abf06c` ("nilfs2: support contiguous lookup of blocks") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+cfed5b56649bddf80d6e@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=cfed5b56649bddf80d6e Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Haisu Wang <haisuwang@tencent.com>	2024-11-28 15:08:37 +08:00
Ryusuke Konishi	88f0af427d	nilfs2: prevent kernel bug at submit_bh_wbc() [ Upstream commit 269cdf353b5bdd15f1a079671b0f889113865f20 ] Fix CVE: CVE-2024-26955 Fix a bug where nilfs_get_block() returns a successful status when searching and inserting the specified block both fail inconsistently. If this inconsistent behavior is not due to a previously fixed bug, then an unexpected race is occurring, so return a temporary error -EAGAIN instead. This prevents callers such as __block_write_begin_int() from requesting a read into a buffer that is not mapped, which would cause the BUG_ON check for the BH_Mapped flag in submit_bh_wbc() to fail. Link: https://lkml.kernel.org/r/20240313105827.5296-3-konishi.ryusuke@gmail.com Fixes: `1f5abe7e7d` ("nilfs2: replace BUG_ON and BUG calls triggerable from ioctl") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Haisu Wang <haisuwang@tencent.com>	2024-11-28 15:08:37 +08:00
Zhihao Cheng	56c6bc73e2	ubifs: ubifs_symlink: Fix memleak of inode->i_link in error path [ Upstream commit 6379b44cdcd67f5f5d986b73953e99700591edfa ] Fix CVE: CVE-2024-26972 For error handling path in ubifs_symlink(), inode will be marked as bad first, then iput() is invoked. If inode->i_link is initialized by fscrypt_encrypt_symlink() in encryption scenario, inode->i_link won't be freed by callchain ubifs_free_inode -> fscrypt_free_inode in error handling path, because make_bad_inode() has changed 'inode->i_mode' as 'S_IFREG'. Following kmemleak is easy to be reproduced by injecting error in ubifs_jnl_update() when doing symlink in encryption scenario: unreferenced object 0xffff888103da3d98 (size 8): comm "ln", pid 1692, jiffies 4294914701 (age 12.045s) backtrace: kmemdup+0x32/0x70 __fscrypt_encrypt_symlink+0xed/0x1c0 ubifs_symlink+0x210/0x300 [ubifs] vfs_symlink+0x216/0x360 do_symlinkat+0x11a/0x190 do_syscall_64+0x3b/0xe0 There are two ways fixing it: 1. Remove make_bad_inode() in error handling path. We can do that because ubifs_evict_inode() will do same processes for good symlink inode and bad symlink inode, for inode->i_nlink checking is before is_bad_inode(). 2. Free inode->i_link before marking inode bad. Method 2 is picked, it has less influence, personally, I think. Cc: stable@vger.kernel.org Fixes: `2c58d548f5` ("fscrypt: cache decrypted symlink target in ->i_link") Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Suggested-by: Eric Biggers <ebiggers@kernel.org> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Haisu Wang <haisuwang@tencent.com>	2024-11-28 15:08:37 +08:00
Jan Kara	10d52afa49	fat: fix uninitialized field in nostale filehandles [ Upstream commit fde2497d2bc3a063d8af88b258dbadc86bd7b57c ] Fix CVE: CVE-2024-26973 When fat_encode_fh_nostale() encodes file handle without a parent it stores only first 10 bytes of the file handle. However the length of the file handle must be a multiple of 4 so the file handle is actually 12 bytes long and the last two bytes remain uninitialized. This is not great at we potentially leak uninitialized information with the handle to userspace. Properly initialize the full handle length. Link: https://lkml.kernel.org/r/20240205122626.13701-1-jack@suse.cz Reported-by: syzbot+3ce5dea5b1539ff36769@syzkaller.appspotmail.com Fixes: `ea3983ace6` ("fat: restructure export_operations") Signed-off-by: Jan Kara <jack@suse.cz> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Amir Goldstein <amir73il@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Haisu Wang <haisuwang@tencent.com>	2024-11-28 15:08:37 +08:00
Navid Emamdoost	b8bd6eb711	nbd: null check for nla_nest_start [ Upstream commit 31edf4bbe0ba27fd03ac7d87eb2ee3d2a231af6d ] Fix CVE: CVE-2024-27025 nla_nest_start() may fail and return NULL. Insert a check and set errno based on other call sites within the same source code. Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Reviewed-by: Michal Kubecek <mkubecek@suse.cz> Fixes: `47d902b90a` ("nbd: add a status netlink command") Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20240218042534.it.206-kees@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Haisu Wang <haisuwang@tencent.com>	2024-11-28 15:08:37 +08:00
Gabor Juhos	81bc79fff4	clk: qcom: mmcc-msm8974: fix terminating of frequency table arrays commit e2c02a85bf53ae86d79b5fccf0a75ac0b78e0c96 upstream Fix: CVE-2024-26965 The frequency table arrays are supposed to be terminated with an empty element. Add such entry to the end of the arrays where it is missing in order to avoid possible out-of-bound access when the table is traversed by functions like qcom_find_freq() or qcom_find_freq_floor(). Only compile tested. Fixes: `d8b212014e` ("clk: qcom: Add support for MSM8974's multimedia clock controller (MMCC)") Signed-off-by: Gabor Juhos <j4g8y7@gmail.com> Reviewed-by: Stephen Boyd <sboyd@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240229-freq-table-terminator-v1-7-074334f0905c@gmail.com Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Rongwei Wang <zigiwang@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:07:13 +08:00
Dmitry Bogdanov	f32abf2f2a	scsi: target: core: Add TMF to tmr_list handling commit 83ab68168a3d990d5ff39ab030ad5754cbbccb25 upstream An abort that is responded to by iSCSI itself is added to tmr_list but does not go to target core. A LUN_RESET that goes through tmr_list takes a refcounter on the abort and waits for completion. However, the abort will be never complete because it was not started in target core. Unable to locate ITT: 0x05000000 on CID: 0 Unable to locate RefTaskTag: 0x05000000 on CID: 0. wait_for_tasks: Stopping tmf LUN_RESET with tag 0x0 ref_task_tag 0x0 i_state 34 t_state ISTATE_PROCESSING refcnt 2 transport_state active,stop,fabric_stop wait for tasks: tmf LUN_RESET with tag 0x0 ref_task_tag 0x0 i_state 34 t_state ISTATE_PROCESSING refcnt 2 transport_state active,stop,fabric_stop ... INFO: task kworker/0:2:49 blocked for more than 491 seconds. task:kworker/0:2 state:D stack: 0 pid: 49 ppid: 2 flags:0x00000800 Workqueue: events target_tmr_work [target_core_mod] Call Trace: __switch_to+0x2c4/0x470 _schedule+0x314/0x1730 schedule+0x64/0x130 schedule_timeout+0x168/0x430 wait_for_completion+0x140/0x270 target_put_cmd_and_wait+0x64/0xb0 [target_core_mod] core_tmr_lun_reset+0x30/0xa0 [target_core_mod] target_tmr_work+0xc8/0x1b0 [target_core_mod] process_one_work+0x2d4/0x5d0 worker_thread+0x78/0x6c0 To fix this, only add abort to tmr_list if it will be handled by target core. This fixes CVE-2024-26845 Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com> Link: https://lore.kernel.org/r/20240111125941.8688-1-d.bogdanov@yadro.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Rongwei Wang <zigiwang@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:07:13 +08:00
Linus Torvalds	012ef63727	Revert "memcg: enable accounting for file lock caches" commit `3754707bcc` upstream. This reverts commit `0f12156dff`. The kernel test robot reports a sizeable performance regression for this commit, and while it clearly does the rigth thing in theory, we'll need to look at just how to avoid or minimize the performance overhead of the memcg accounting. People already have suggestions on how to do that, but it's "future work". So revert it for now. This is the revert of CVE-2022-0480. Link: https://lore.kernel.org/lkml/20210907150757.GE17617@xsang-OptiPlex-9020/ Acked-by: Jens Axboe <axboe@kernel.dk> Acked-by: Shakeel Butt <shakeelb@google.com> Acked-by: Roman Gushchin <guro@fb.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: katrinzhou <katrinzhou@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:05:50 +08:00
Chun-Yi Lee	ae9754712d	aoe: fix the potential use-after-free problem in more places [ Upstream commit 6d6e54fc71ad1ab0a87047fd9c211e75d86084a3 ] Fix CVE: CVE-2023-6270 and Fix CVE: CVE-2024-26898 For fixing CVE-2023-6270, f98364e92662 ("aoe: fix the potential use-after-free problem in aoecmd_cfg_pkts") makes tx() calling dev_put() instead of doing in aoecmd_cfg_pkts(). It avoids that the tx() runs into use-after-free. Then Nicolai Stange found more places in aoe have potential use-after-free problem with tx(). e.g. revalidate(), aoecmd_ata_rw(), resend(), probe() and aoecmd_cfg_rsp(). Those functions also use aoenet_xmit() to push packet to tx queue. So they should also use dev_hold() to increase the refcnt of skb->dev. On the other hand, moving dev_put() to tx() causes that the refcnt of skb->dev be reduced to a negative value, because corresponding dev_hold() are not called in revalidate(), aoecmd_ata_rw(), resend(), probe(), and aoecmd_cfg_rsp(). This patch fixed this issue. Cc: stable@vger.kernel.org Link: https://nvd.nist.gov/vuln/detail/CVE-2023-6270 Fixes: f98364e92662 ("aoe: fix the potential use-after-free problem in aoecmd_cfg_pkts") Reported-by: Nicolai Stange <nstange@suse.com> Signed-off-by: Chun-Yi Lee <jlee@suse.com> Link: https://lore.kernel.org/stable/20240624064418.27043-1-jlee%40suse.com Link: https://lore.kernel.org/r/20241002035458.24401-1-jlee@suse.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Haisu Wang <haisuwang@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:04:08 +08:00
Alexey Dobriyan	74f3b63c7b	block: fix integer overflow in BLKSECDISCARD [ Upstream commit 697ba0b6ec4ae04afb67d3911799b5e2043b4455 ] Fix CVE: CVE-2024-36917 Partly ported to tk4, since blk_ioctl_secure_erase() introduced in v5.19 I independently rediscovered commit 22d24a544b0d49bbcbd61c8c0eaf77d3c9297155 block: fix overflow in blk_ioctl_discard() but for secure erase. Same problem: uint64_t r[2] = {512, 18446744073709551104ULL}; ioctl(fd, BLKSECDISCARD, r); will enter near infinite loop inside blkdev_issue_secure_erase(): a.out: attempt to access beyond end of device loop0: rw=5, sector=3399043073, nr_sectors = 1024 limit=2048 bio_check_eod: 3286214 callbacks suppressed Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Link: https://lore.kernel.org/r/9e64057f-650a-46d1-b9f7-34af391536ef@p183 Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Haisu Wang <haisuwang@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:04:08 +08:00
Jann Horn	101164c2be	filelock: Fix fcntl/close race recovery compat path [ Upstream commit f8138f2ad2f745b9a1c696a05b749eabe44337ea ] Fix CVE: CVE-2024-41012 When I wrote commit 3cad1bc01041 ("filelock: Remove locks reliably when fcntl/close race is detected"), I missed that there are two copies of the code I was patching: The normal version, and the version for 64-bit offsets on 32-bit kernels. Thanks to Greg KH for stumbling over this while doing the stable backport... Apply exactly the same fix to the compat path for 32-bit kernels. Fixes: `c293621bbf` ("[PATCH] stale POSIX lock handling") Cc: stable@kernel.org Link: https://bugs.chromium.org/p/project-zero/issues/detail?id=2563 Signed-off-by: Jann Horn <jannh@google.com> Link: https://lore.kernel.org/r/20240723-fs-lock-recover-compatfix-v1-1-148096719529@google.com Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Haisu Wang <haisuwang@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:04:07 +08:00
Jann Horn	2133be65b1	filelock: Remove locks reliably when fcntl/close race is detected [ Upstream commit 3cad1bc010416c6dd780643476bc59ed742436b9 ] Fix CVE: CVE-2024-41012 When fcntl_setlk() races with close(), it removes the created lock with do_lock_file_wait(). However, LSMs can allow the first do_lock_file_wait() that created the lock while denying the second do_lock_file_wait() that tries to remove the lock. In theory (but AFAIK not in practice), posix_lock_file() could also fail to remove a lock due to GFP_KERNEL allocation failure (when splitting a range in the middle). After the bug has been triggered, use-after-free reads will occur in lock_get_status() when userspace reads /proc/locks. This can likely be used to read arbitrary kernel memory, but can't corrupt kernel memory. This only affects systems with SELinux / Smack / AppArmor / BPF-LSM in enforcing mode and only works from some security contexts. Fix it by calling locks_remove_posix() instead, which is designed to reliably get rid of POSIX locks associated with the given file and files_struct and is also used by filp_flush(). Fixes: `c293621bbf` ("[PATCH] stale POSIX lock handling") Cc: stable@kernel.org Link: https://bugs.chromium.org/p/project-zero/issues/detail?id=2563 Signed-off-by: Jann Horn <jannh@google.com> Link: https://lore.kernel.org/r/20240702-fs-lock-recover-2-v1-1-edd456f63789@google.com Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org> [stable fixup: ->c.flc_type was ->fl_type in older kernels] Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Haisu Wang <haisuwang@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:04:07 +08:00
Su Yue	a79dc0008e	ocfs2: fix races between hole punching and AIO+DIO [ Upstream commit 952b023f06a24b2ad6ba67304c4c84d45bea2f18 ] Fix CVE: CVE-2024-40943 After commit "ocfs2: return real error code in ocfs2_dio_wr_get_block", fstests/generic/300 become from always failed to sometimes failed: ======================================================================== [ 473.293420 ] run fstests generic/300 [ 475.296983 ] JBD2: Ignoring recovery information on journal [ 475.302473 ] ocfs2: Mounting device (253,1) on (node local, slot 0) with ordered data mode. [ 494.290998 ] OCFS2: ERROR (device dm-1): ocfs2_change_extent_flag: Owner 5668 has an extent at cpos 78723 which can no longer be found [ 494.291609 ] On-disk corruption discovered. Please run fsck.ocfs2 once the filesystem is unmounted. [ 494.292018 ] OCFS2: File system is now read-only. [ 494.292224 ] (kworker/19:11,2628,19):ocfs2_mark_extent_written:5272 ERROR: status = -30 [ 494.292602 ] (kworker/19:11,2628,19):ocfs2_dio_end_io_write:2374 ERROR: status = -3 fio: io_u error on file /mnt/scratch/racer: Read-only file system: write offset=460849152, buflen=131072 ========================================================================= In __blockdev_direct_IO, ocfs2_dio_wr_get_block is called to add unwritten extents to a list. extents are also inserted into extent tree in ocfs2_write_begin_nolock. Then another thread call fallocate to puch a hole at one of the unwritten extent. The extent at cpos was removed by ocfs2_remove_extent(). At end io worker thread, ocfs2_search_extent_list found there is no such extent at the cpos. T1 T2 T3 inode lock ... insert extents ... inode unlock ocfs2_fallocate __ocfs2_change_file_space inode lock lock ip_alloc_sem ocfs2_remove_inode_range inode ocfs2_remove_btree_range ocfs2_remove_extent ^---remove the extent at cpos 78723 ... unlock ip_alloc_sem inode unlock ocfs2_dio_end_io ocfs2_dio_end_io_write lock ip_alloc_sem ocfs2_mark_extent_written ocfs2_change_extent_flag ocfs2_search_extent_list ^---failed to find extent ... unlock ip_alloc_sem In most filesystems, fallocate is not compatible with racing with AIO+DIO, so fix it by adding to wait for all dio before fallocate/punch_hole like ext4. Link: https://lkml.kernel.org/r/20240408082041.20925-3-glass.su@suse.com Fixes: `b25801038d` ("ocfs2: Support xfs style space reservation ioctls") Signed-off-by: Su Yue <glass.su@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Haisu Wang <haisuwang@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:04:07 +08:00
Alex Deucher	9549182266	drm/radeon: fix UBSAN warning in kv_dpm.c [ Upstream commit a498df5421fd737d11bfd152428ba6b1c8538321 ] Fix CVE: CVE-2024-40988 Adds bounds check for sumo_vid_mapping_entry. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Haisu Wang <haisuwang@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:04:07 +08:00
Phillip Lougher	ccffe51361	Squashfs: check the inode number is not the invalid value of zero [ Upstream commit 9253c54e01b6505d348afbc02abaa4d9f8a01395 ] Fix CVE: CVE-2024-26982 Syskiller has produced an out of bounds access in fill_meta_index(). That out of bounds access is ultimately caused because the inode has an inode number with the invalid value of zero, which was not checked. The reason this causes the out of bounds access is due to following sequence of events: 1. Fill_meta_index() is called to allocate (via empty_meta_index()) and fill a metadata index. It however suffers a data read error and aborts, invalidating the newly returned empty metadata index. It does this by setting the inode number of the index to zero, which means unused (zero is not a valid inode number). 2. When fill_meta_index() is subsequently called again on another read operation, locate_meta_index() returns the previous index because it matches the inode number of 0. Because this index has been returned it is expected to have been filled, and because it hasn't been, an out of bounds access is performed. This patch adds a sanity check which checks that the inode number is not zero when the inode is created and returns -EINVAL if it is. [phillip@squashfs.org.uk: whitespace fix] Link: https://lkml.kernel.org/r/20240409204723.446925-1-phillip@squashfs.org.uk Link: https://lkml.kernel.org/r/20240408220206.435788-1-phillip@squashfs.org.uk Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk> Reported-by: "Ubisectech Sirius" <bugreport@ubisectech.com> Closes: https://lore.kernel.org/lkml/87f5c007-b8a5-41ae-8b57-431e924c5915.bugreport@ubisectech.com/ Cc: Christian Brauner <brauner@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Haisu Wang <haisuwang@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:04:07 +08:00
Daniel Borkmann	effb291aab	bpf: Use correct permission flag for mixed signed bounds arithmetic [ Upstream commit `9601148392` ] Fix CVE: CVE-2021-46908 We forbid adding unknown scalars with mixed signed bounds due to the spectre v1 masking mitigation. Hence this also needs bypass_spec_v1 flag instead of allow_ptr_leaks. Fixes: `2c78ee898d` ("bpf: Implement CAP_BPF") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Haisu Wang <haisuwang@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:04:07 +08:00
Hao Sun	88c84e098a	bpf: Reject variable offset alu on PTR_TO_FLOW_KEYS [ Upstream commit 22c7fa171a02d310e3a3f6ed46a698ca8a0060ed ] Fix CVE: CVE-2024-26589 For PTR_TO_FLOW_KEYS, check_flow_keys_access() only uses fixed off for validation. However, variable offset ptr alu is not prohibited for this ptr kind. So the variable offset is not checked. The following prog is accepted: func#0 @0 0: R1=ctx() R10=fp0 0: (bf) r6 = r1 ; R1=ctx() R6_w=ctx() 1: (79) r7 = (u64 )(r6 +144) ; R6_w=ctx() R7_w=flow_keys() 2: (b7) r8 = 1024 ; R8_w=1024 3: (37) r8 /= 1 ; R8_w=scalar() 4: (57) r8 &= 1024 ; R8_w=scalar(smin=smin32=0, smax=umax=smax32=umax32=1024,var_off=(0x0; 0x400)) 5: (0f) r7 += r8 mark_precise: frame0: last_idx 5 first_idx 0 subseq_idx -1 mark_precise: frame0: regs=r8 stack= before 4: (57) r8 &= 1024 mark_precise: frame0: regs=r8 stack= before 3: (37) r8 /= 1 mark_precise: frame0: regs=r8 stack= before 2: (b7) r8 = 1024 6: R7_w=flow_keys(smin=smin32=0,smax=umax=smax32=umax32=1024,var_off =(0x0; 0x400)) R8_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=1024, var_off=(0x0; 0x400)) 6: (79) r0 = (u64 )(r7 +0) ; R0_w=scalar() 7: (95) exit This prog loads flow_keys to r7, and adds the variable offset r8 to r7, and finally causes out-of-bounds access: BUG: unable to handle page fault for address: ffffc90014c80038 [...] Call Trace: <TASK> bpf_dispatcher_nop_func include/linux/bpf.h:1231 [inline] __bpf_prog_run include/linux/filter.h:651 [inline] bpf_prog_run include/linux/filter.h:658 [inline] bpf_prog_run_pin_on_cpu include/linux/filter.h:675 [inline] bpf_flow_dissect+0x15f/0x350 net/core/flow_dissector.c:991 bpf_prog_test_run_flow_dissector+0x39d/0x620 net/bpf/test_run.c:1359 bpf_prog_test_run kernel/bpf/syscall.c:4107 [inline] __sys_bpf+0xf8f/0x4560 kernel/bpf/syscall.c:5475 __do_sys_bpf kernel/bpf/syscall.c:5561 [inline] __se_sys_bpf kernel/bpf/syscall.c:5559 [inline] __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:5559 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x63/0x6b Fix this by rejecting ptr alu with variable offset on flow_keys. Applying the patch rejects the program with "R7 pointer arithmetic on flow_keys prohibited". Fixes: `d58e468b11` ("flow_dissector: implements flow dissector BPF hook") Signed-off-by: Hao Sun <sunhao.th@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20240115082028.9992-1-sunhao.th@gmail.com Signed-off-by: Haisu Wang <haisuwang@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:04:07 +08:00
Li Nan	1140966e24	block: fix overflow in blk_ioctl_discard() [ Upstream commit 22d24a544b0d49bbcbd61c8c0eaf77d3c9297155 ] Fix CVE: CVE-2024-36917 There is no check for overflow of 'start + len' in blk_ioctl_discard(). Hung task occurs if submit an discard ioctl with the following param: start = 0x80000000000ff000, len = 0x8000000000fff000; Add the overflow validation now. Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20240329012319.2034550-1-linan666@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Haisu Wang <haisuwang@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:04:07 +08:00
Lu Baolu	26f2f1d33c	iommu: Return right value in iommu_sva_bind_device() [ Upstream commit 89e8a2366e3bce584b6c01549d5019c5cda1205e ] iommu_sva_bind_device() should return either a sva bond handle or an ERR_PTR value in error cases. Existing drivers (idxd and uacce) only check the return value with IS_ERR(). This could potentially lead to a kernel NULL pointer dereference issue if the function returns NULL instead of an error pointer. In reality, this doesn't cause any problems because iommu_sva_bind_device() only returns NULL when the kernel is not configured with CONFIG_IOMMU_SVA. In this case, iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA) will return an error, and the device drivers won't call iommu_sva_bind_device() at all. This fixes CVE-2024-40945. Fixes: `26b25a2b98` ("iommu: Bind process address spaces to devices") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20240528042528.71396-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Sinong Chen <costinchen@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:03:50 +08:00
Aleksandr Mishin	912b1ff6f3	crypto: bcm - Fix pointer arithmetic [ Upstream commit 2b3460cbf454c6b03d7429e9ffc4fe09322eb1a9 ] In spu2_dump_omd() value of ptr is increased by ciph_key_len instead of hash_iv_len which could lead to going beyond the buffer boundaries. Fix this bug by changing ciph_key_len to hash_iv_len. Found by Linux Verification Center (linuxtesting.org) with SVACE. This fixes CVE-2024-38579. Fixes: `9d12ba86f8` ("crypto: brcm - Add Broadcom SPU driver") Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Sinong Chen <costinchen@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:03:50 +08:00
Nikita Kiryushin	9c6d6533d1	ACPI: video: check for error while searching for backlight device parent [ Upstream commit ccd45faf4973746c4f30ea41eec864e5cf191099 ] [tapd] https://tapd.woa.com/TS4Q/prong/stories/view/1020422414118043047 This fixed CVE-2023-52693 If acpi_get_parent() called in acpi_video_dev_register_backlight() fails, for example, because acpi_ut_acquire_mutex() fails inside acpi_get_parent), this can lead to incorrect (uninitialized) acpi_parent handle being passed to acpi_get_pci_dev() for detecting the parent pci device. Check acpi_get_parent() result and set parent device only in case of success. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `9661e92c10` ("acpi: tie ACPI backlight devices to PCI devices if possible") Signed-off-by: Nikita Kiryushin <kiryushin@ancud.ru> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: yilingjin <yilingjin@tencent.com> Reviewed-by: Hui Li <caelli@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:03:20 +08:00
Nikita Kiryushin	e2bdd28047	ACPI: LPIT: Avoid u32 multiplication overflow [ Upstream commit 56d2eeda87995245300836ee4dbd13b002311782 ] [tapd] https://tapd.woa.com/TS4Q/prong/stories/view/1020422414118043047 This fixes CVE-2023-52683 In lpit_update_residency() there is a possibility of overflow in multiplication, if tsc_khz is large enough (> UINT_MAX/1000). Change multiplication to mul_u32_u32(). Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `eeb2d80d50` ("ACPI / LPIT: Add Low Power Idle Table (LPIT) support") Signed-off-by: Nikita Kiryushin <kiryushin@ancud.ru> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: yilingjin <yilingjin@tencent.com> Reviewed-by: Hui Li <caelli@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:03:06 +08:00
felix	5b4ca848c4	SUNRPC: Fix RPC client cleaned up the freed pipefs dentries [ Upstream commit bfca5fb4e97c46503ddfc582335917b0cc228264 ] Fix CVE: CVE-2023-52803 RPC client pipefs dentries cleanup is in separated rpc_remove_pipedir() workqueue,which takes care about pipefs superblock locking. In some special scenarios, when kernel frees the pipefs sb of the current client and immediately alloctes a new pipefs sb, rpc_remove_pipedir function would misjudge the existence of pipefs sb which is not the one it used to hold. As a result, the rpc_remove_pipedir would clean the released freed pipefs dentries. To fix this issue, rpc_remove_pipedir should check whether the current pipefs sb is consistent with the original pipefs sb. This error can be catched by KASAN: ========================================================= [ 250.497700] BUG: KASAN: slab-use-after-free in dget_parent+0x195/0x200 [ 250.498315] Read of size 4 at addr ffff88800a2ab804 by task kworker/0:18/106503 [ 250.500549] Workqueue: events rpc_free_client_work [ 250.501001] Call Trace: [ 250.502880] kasan_report+0xb6/0xf0 [ 250.503209] ? dget_parent+0x195/0x200 [ 250.503561] dget_parent+0x195/0x200 [ 250.503897] ? __pfx_rpc_clntdir_depopulate+0x10/0x10 [ 250.504384] rpc_rmdir_depopulate+0x1b/0x90 [ 250.504781] rpc_remove_client_dir+0xf5/0x150 [ 250.505195] rpc_free_client_work+0xe4/0x230 [ 250.505598] process_one_work+0x8ee/0x13b0 ... [ 22.039056] Allocated by task 244: [ 22.039390] kasan_save_stack+0x22/0x50 [ 22.039758] kasan_set_track+0x25/0x30 [ 22.040109] __kasan_slab_alloc+0x59/0x70 [ 22.040487] kmem_cache_alloc_lru+0xf0/0x240 [ 22.040889] __d_alloc+0x31/0x8e0 [ 22.041207] d_alloc+0x44/0x1f0 [ 22.041514] __rpc_lookup_create_exclusive+0x11c/0x140 [ 22.041987] rpc_mkdir_populate.constprop.0+0x5f/0x110 [ 22.042459] rpc_create_client_dir+0x34/0x150 [ 22.042874] rpc_setup_pipedir_sb+0x102/0x1c0 [ 22.043284] rpc_client_register+0x136/0x4e0 [ 22.043689] rpc_new_client+0x911/0x1020 [ 22.044057] rpc_create_xprt+0xcb/0x370 [ 22.044417] rpc_create+0x36b/0x6c0 ... [ 22.049524] Freed by task 0: [ 22.049803] kasan_save_stack+0x22/0x50 [ 22.050165] kasan_set_track+0x25/0x30 [ 22.050520] kasan_save_free_info+0x2b/0x50 [ 22.050921] __kasan_slab_free+0x10e/0x1a0 [ 22.051306] kmem_cache_free+0xa5/0x390 [ 22.051667] rcu_core+0x62c/0x1930 [ 22.051995] __do_softirq+0x165/0x52a [ 22.052347] [ 22.052503] Last potentially related work creation: [ 22.052952] kasan_save_stack+0x22/0x50 [ 22.053313] __kasan_record_aux_stack+0x8e/0xa0 [ 22.053739] __call_rcu_common.constprop.0+0x6b/0x8b0 [ 22.054209] dentry_free+0xb2/0x140 [ 22.054540] __dentry_kill+0x3be/0x540 [ 22.054900] shrink_dentry_list+0x199/0x510 [ 22.055293] shrink_dcache_parent+0x190/0x240 [ 22.055703] do_one_tree+0x11/0x40 [ 22.056028] shrink_dcache_for_umount+0x61/0x140 [ 22.056461] generic_shutdown_super+0x70/0x590 [ 22.056879] kill_anon_super+0x3a/0x60 [ 22.057234] rpc_kill_sb+0x121/0x200 Fixes: `0157d021d2` ("SUNRPC: handle RPC client pipefs dentries by network namespace aware routines") Signed-off-by: felix <fuzhen5@huawei.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Honglin Li <honglinli@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:02:46 +08:00
Gavrilov Ilia	dd4e8efc2d	calipso: fix memory leak in netlbl_calipso_add_pass() [ Upstream commit ec4e9d630a64df500641892f4e259e8149594a99 ] Fix CVE: CVE-2023-52698 If IPv6 support is disabled at boot (ipv6.disable=1), the calipso_init() -> netlbl_calipso_ops_register() function isn't called, and the netlbl_calipso_ops_get() function always returns NULL. In this case, the netlbl_calipso_add_pass() function allocates memory for the doi_def variable but doesn't free it with the calipso_doi_free(). BUG: memory leak unreferenced object 0xffff888011d68180 (size 64): comm "syz-executor.1", pid 10746, jiffies 4295410986 (age 17.928s) hex dump (first 32 bytes): 00 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<...>] kmalloc include/linux/slab.h:552 [inline] [<...>] netlbl_calipso_add_pass net/netlabel/netlabel_calipso.c:76 [inline] [<...>] netlbl_calipso_add+0x22e/0x4f0 net/netlabel/netlabel_calipso.c:111 [<...>] genl_family_rcv_msg_doit+0x22f/0x330 net/netlink/genetlink.c:739 [<...>] genl_family_rcv_msg net/netlink/genetlink.c:783 [inline] [<...>] genl_rcv_msg+0x341/0x5a0 net/netlink/genetlink.c:800 [<...>] netlink_rcv_skb+0x14d/0x440 net/netlink/af_netlink.c:2515 [<...>] genl_rcv+0x29/0x40 net/netlink/genetlink.c:811 [<...>] netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline] [<...>] netlink_unicast+0x54b/0x800 net/netlink/af_netlink.c:1339 [<...>] netlink_sendmsg+0x90a/0xdf0 net/netlink/af_netlink.c:1934 [<...>] sock_sendmsg_nosec net/socket.c:651 [inline] [<...>] sock_sendmsg+0x157/0x190 net/socket.c:671 [<...>] ____sys_sendmsg+0x712/0x870 net/socket.c:2342 [<...>] ___sys_sendmsg+0xf8/0x170 net/socket.c:2396 [<...>] __sys_sendmsg+0xea/0x1b0 net/socket.c:2429 [<...>] do_syscall_64+0x30/0x40 arch/x86/entry/common.c:46 [<...>] entry_SYSCALL_64_after_hwframe+0x61/0xc6 Found by InfoTeCS on behalf of Linux Verification Center (linuxtesting.org) with Syzkaller Fixes: `cb72d38211` ("netlabel: Initial support for the CALIPSO netlink protocol.") Signed-off-by: Gavrilov Ilia <Ilia.Gavrilov@infotecs.ru> [PM: merged via the LSM tree at Jakub Kicinski request] Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Honglin Li <honglinli@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:02:46 +08:00
Zheng Yejian	b56c896428	netlabel: remove unused parameter in netlbl_netlink_auditinfo() [ Upstream commit `f7e0318a31` ] loginuid/sessionid/secid have been read from 'current' instead of struct netlink_skb_parms, the parameter 'skb' seems no longer needed. Fixes: `c53fa1ed92` ("netlink: kill loginuid/sessionid/sid members from struct netlink_skb_parms") Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Stable-dep-of: ec4e9d630a64 ("calipso: fix memory leak in netlbl_calipso_add_pass()") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Honglin Li <honglinli@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:02:46 +08:00
Andrew Lunn	01568bba78	net: netlabel: Fix kerneldoc warnings [ Upstream commit `294ea29113` ] net/netlabel/netlabel_calipso.c:376: warning: Function parameter or member 'ops' not described in 'netlbl_calipso_ops_register' Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Paul Moore <paul@paul-moore.com> Link: https://lore.kernel.org/r/20201028005350.930299-1-andrew@lunn.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: ec4e9d630a64 ("calipso: fix memory leak in netlbl_calipso_add_pass()") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Honglin Li <honglinli@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:02:46 +08:00
Ping-Ke Shih	5bd9faa710	wifi: mac80211: don't return unset power in ieee80211_get_tx_power() [ Upstream commit e160ab85166e77347d0cbe5149045cb25e83937f ] Fix CVE: CVE-2023-52832 We can get a UBSAN warning if ieee80211_get_tx_power() returns the INT_MIN value mac80211 internally uses for "unset power level". UBSAN: signed-integer-overflow in net/wireless/nl80211.c:3816:5 -2147483648 * 100 cannot be represented in type 'int' CPU: 0 PID: 20433 Comm: insmod Tainted: G WC OE Call Trace: dump_stack+0x74/0x92 ubsan_epilogue+0x9/0x50 handle_overflow+0x8d/0xd0 __ubsan_handle_mul_overflow+0xe/0x10 nl80211_send_iface+0x688/0x6b0 [cfg80211] [...] cfg80211_register_wdev+0x78/0xb0 [cfg80211] cfg80211_netdev_notifier_call+0x200/0x620 [cfg80211] [...] ieee80211_if_add+0x60e/0x8f0 [mac80211] ieee80211_register_hw+0xda5/0x1170 [mac80211] In this case, simply return an error instead, to indicate that no data is available. Cc: Zong-Zhe Yang <kevin_yang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://lore.kernel.org/r/20230203023636.4418-1-pkshih@realtek.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Honglin Li <honglinli@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:02:45 +08:00
Zhipeng Lu	adc0a5378d	drm/amd/pm: fix a double-free in si_dpm_init [ Upstream commit ac16667237a82e2597e329eb9bc520d1cf9dff30 ] [tapd] https://tapd.woa.com/TS4Q/prong/stories/view/1020422414118043047 This fixes CVE-2023-52691 When the allocation of adev->pm.dpm.dyn_state.vddc_dependency_on_dispclk.entries fails, amdgpu_free_extended_power_table is called to free some fields of adev. However, when the control flow returns to si_dpm_sw_init, it goes to label dpm_failed and calls si_dpm_fini, which calls amdgpu_free_extended_power_table again and free those fields again. Thus a double-free is triggered. Fixes: `841686df9f` ("drm/amdgpu: add SI DPM support (v4)") Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: yilingjin <yilingjin@tencent.com> Reviewed-by: Hui Li <caelli@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 15:02:09 +08:00
Li Nan	b12db7754c	md/raid10: use dereference_rdev_and_rrdev() to get devices commit `673643490b` upstream. Commit `2ae6aaf769` ("md/raid10: fix io loss while replacement replace rdev") reads replacement first to prevent io loss. However, there are same issue in wait_blocked_dev() and raid10_handle_discard(), too. Fix it by using dereference_rdev_and_rrdev() to get devices. Fixes: `d30588b273` ("md/raid10: improve raid10 discard request") Fixes: `f2e7e269a7` ("md/raid10: pull the code that wait for blocked dev into one function") Signed-off-by: Li Nan <linan122@huawei.com> Link: https://lore.kernel.org/r/20230701080529.2684932-4-linan666@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: caelli <caelli@tencent.com> Reviewed-by: Jinliang Zheng <alexjlzheng@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 14:58:45 +08:00
Li Nan	b740cbc51e	md/raid10: check replacement and rdev to prevent submit the same io twice commit `7e85c41b9e` upstream. After commit `4ca40c2ce0` ("md/raid10: Allow replacement device to be replace old drive."), 'rdev' and 'replacement' could appear to be identical. There are already checks for that in wait_blocked_dev() and raid10_write_request(). Add check for raid10_handle_discard() now. Signed-off-by: Li Nan <linan122@huawei.com> Link: https://lore.kernel.org/r/20230701080529.2684932-2-linan666@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: caelli <caelli@tencent.com> Reviewed-by: Jinliang Zheng <alexjlzheng@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 14:58:45 +08:00
Li Nan	7fd98791df	md/raid10: factor out dereference_rdev_and_rrdev() commit `b99f8fd2d9` upstream. Factor out a helper to get 'rdev' and 'replacement' from config->mirrors. Just to make code cleaner and prepare to fix the bug of io loss while 'replacement' replace 'rdev'. There is no functional change. Signed-off-by: Li Nan <linan122@huawei.com> Link: https://lore.kernel.org/r/20230701080529.2684932-3-linan666@huaweicloud.com Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: caelli <caelli@tencent.com> Reviewed-by: Jinliang Zheng <alexjlzheng@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 14:58:44 +08:00
Li Nan	5b33212acc	md/raid10: fix io loss while replacement replace rdev commit `2ae6aaf769` upstream. When removing a disk with replacement, the replacement will be used to replace rdev. During this process, there is a brief window in which both rdev and replacement are read as NULL in raid10_write_request(). This will result in io not being submitted but it should be. //remove //write raid10_remove_disk raid10_write_request mirror->rdev = NULL read rdev -> NULL mirror->rdev = mirror->replacement mirror->replacement = NULL read replacement -> NULL Fix it by reading replacement first and rdev later, meanwhile, use smp_mb() to prevent memory reordering. Fixes: `475b0321a4` ("md/raid10: writes should get directed to replacement as well as original.") Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230602091839.743798-3-linan666@huaweicloud.com Signed-off-by: caelli <caelli@tencent.com> Reviewed-by: Jinliang Zheng <alexjlzheng@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 14:58:44 +08:00
Xiao Ni	27463f7cf5	md/raid10: Remove unnecessary rcu_dereference in raid10_handle_discard commit `46d4703b1d` upstream. We are seeing the following warning in raid10_handle_discard. [ 695.110751] ============================= [ 695.131439] WARNING: suspicious RCU usage [ 695.151389] 4.18.0-319.el8.x86_64+debug #1 Not tainted [ 695.174413] ----------------------------- [ 695.192603] drivers/md/raid10.c:1776 suspicious rcu_dereference_check() usage! [ 695.225107] other info that might help us debug this: [ 695.260940] rcu_scheduler_active = 2, debug_locks = 1 [ 695.290157] no locks held by mkfs.xfs/10186. In the first loop of function raid10_handle_discard. It already determines which disk need to handle discard request and add the rdev reference count rdev->nr_pending. So the conf->mirrors will not change until all bios come back from underlayer disks. It doesn't need to use rcu_dereference to get rdev. Cc: stable@vger.kernel.org Fixes: `d30588b273` ('md/raid10: improve raid10 discard request') Signed-off-by: Xiao Ni <xni@redhat.com> Acked-by: Guoqing Jiang <guoqing.jiang@linux.dev> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: caelli <caelli@tencent.com> Reviewed-by: Jinliang Zheng <alexjlzheng@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 14:58:44 +08:00
Xiao Ni	9b5e6e7f4c	md/raid10: improve discard request for far layout commit `254c271da0` upstream. For far layout, the discard region is not continuous on disks. So it needs far copies r10bio to cover all regions. It needs a way to know all r10bios have finish or not. Similar with raid10_sync_request, only the first r10bio master_bio records the discard bio. Other r10bios master_bio record the first r10bio. The first r10bio can finish after other r10bios finish and then return the discard bio. Tested-by: Adrian Huang <ahuang12@lenovo.com> Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: caelli <caelli@tencent.com> Reviewed-by: Jinliang Zheng <alexjlzheng@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 14:58:44 +08:00
Mike Snitzer	7c40eabb60	dm raid: remove unnecessary discard limits for raid0 and raid10 commit `ca4a4e9a55` upstream. Commit `29efc390b9` ("md/md0: optimize raid0 discard handling") and commit `d30588b273` ("md/raid10: improve raid10 discard request") remove MD raid0's and raid10's inability to properly handle large discards. So eliminate associated constraints from dm-raid's support. Depends-on: `29efc390b9` ("md/md0: optimize raid0 discard handling") Depends-on: `d30588b273` ("md/raid10: improve raid10 discard request") Reported-by: Matthew Ruffell <matthew.ruffell@canonical.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: caelli <caelli@tencent.com> Reviewed-by: Jinliang Zheng <alexjlzheng@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 14:58:44 +08:00
caelli	b6b80d52ca	replace submit_bio_noacct with generic_make_request When backport 'd30588b2731f("md/raid10: improve raid10 discard reques")', submit_bio_noacct is not implemented in current kernel, replace this with generic_make_request. Signed-off-by: caelli <caelli@tencent.com> Reviewed-by: Jinliang Zheng <alexjlzheng@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 14:58:44 +08:00
Xiao Ni	8203078cad	md/raid10: improve raid10 discard request commit `d30588b273` upstream. Now the discard request is split by chunk size. So it takes a long time to finish mkfs on disks which support discard function. This patch improve handling raid10 discard request. It uses the similar way with patch `29efc390b` (md/md0: optimize raid0 discard handling). But it's a little complex than raid0. Because raid10 has different layout. If raid10 is offset layout and the discard request is smaller than stripe size. There are some holes when we submit discard bio to underlayer disks. For example: five disks (disk1 - disk5) D01 D02 D03 D04 D05 D05 D01 D02 D03 D04 D06 D07 D08 D09 D10 D10 D06 D07 D08 D09 The discard bio just wants to discard from D03 to D10. For disk3, there is a hole between D03 and D08. For disk4, there is a hole between D04 and D09. D03 is a chunk, raid10_write_request can handle one chunk perfectly. So the part that is not aligned with stripe size is still handled by raid10_write_request. If reshape is running when discard bio comes and the discard bio spans the reshape position, raid10_write_request is responsible to handle this discard bio. I did a test with this patch set. Without patch: time mkfs.xfs /dev/md0 real4m39.775s user0m0.000s sys0m0.298s With patch: time mkfs.xfs /dev/md0 real0m0.105s user0m0.000s sys0m0.007s nvme3n1 259:1 0 477G 0 disk └─nvme3n1p1 259:10 0 50G 0 part nvme4n1 259:2 0 477G 0 disk └─nvme4n1p1 259:11 0 50G 0 part nvme5n1 259:6 0 477G 0 disk └─nvme5n1p1 259:12 0 50G 0 part nvme2n1 259:9 0 477G 0 disk └─nvme2n1p1 259:15 0 50G 0 part nvme0n1 259:13 0 477G 0 disk └─nvme0n1p1 259:14 0 50G 0 part Reviewed-by: Coly Li <colyli@suse.de> Reviewed-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Tested-by: Adrian Huang <ahuang12@lenovo.com> Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: caelli <caelli@tencent.com> Reviewed-by: Jinliang Zheng <alexjlzheng@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 14:58:44 +08:00
Xiao Ni	faf0e8965f	md/raid10: pull the code that wait for blocked dev into one function commit `f2e7e269a7` upstream. The following patch will reuse these logics, so pull the same codes into one function. Tested-by: Adrian Huang <ahuang12@lenovo.com> Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: caelli <caelli@tencent.com> Reviewed-by: Jinliang Zheng <alexjlzheng@tencent.com> Signed-off-by: Jianping Liu <frankjpliu@tencent.com>	2024-11-28 14:58:44 +08:00

1 2 3 4 5 ...

873984 Commits All Branches Search

873984 Commits

All Branches