commit 30a996fbb3 upstream.
Move the definitions of intel_idle() and intel_idle_s2idle() before
the definitions of cpuidle_state structures referring to them to
avoid having to use additional declarations of them (and drop those
declarations).
No functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit bc721c1e45 upstream.
Add proper kerneldoc descriptions to intel_idle() and
intel_idle_s2idle(), annotate the latter with __cpuidle and
reorder the declarations of local variables in both of them to
reflect the mwait_idle_with_hints() arguments order.
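For illustration, the added kerneldoc for intel_idle_s2idle() looks
roughly like this (a paraphrase of the upstream comment, not a
verbatim copy):

/**
 * intel_idle_s2idle - Simplified "enter" callback for suspend-to-idle.
 * @dev: cpuidle device of the target CPU.
 * @drv: cpuidle driver (assumed to point to intel_idle's driver object).
 * @index: Target idle state index.
 *
 * Use the MWAIT instruction to notify the processor that the CPU
 * represented by @dev is idle and it can try to enter the idle state
 * corresponding to @index.
 */
static __cpuidle void intel_idle_s2idle(struct cpuidle_device *dev,
					struct cpuidle_driver *drv, int index)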
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 40ab82e08d upstream.
The lapic_timer_reliable_states variable really takes only two values
and some arithmetic in intel_idle() related to comparing it with the
target C-state's MWAIT hint value is unnecessary.
Simplify the code by replacing lapic_timer_reliable_states with a
bool variable, lapic_timer_always_reliable, and dropping the
LAPIC_TIMER_ALWAYS_RELIABLE symbol along with the excess
computations in intel_idle().
While at it, add a comment explaining the branch taken in intel_idle()
if the LAPIC timer is only reliable in C1 and modify the related debug
message in intel_idle_init() accordingly (the modification of this
message is the only expected functional impact of the change made
here).
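A condensed sketch of the simplified check in intel_idle(), assuming
the 5.4 code layout (local-variable and tick handling elided):

static bool lapic_timer_always_reliable;

	/* In intel_idle(): */
	if (!static_cpu_has(X86_FEATURE_ARAT) && !lapic_timer_always_reliable) {
		/*
		 * Switch over to one-shot tick broadcast if the target
		 * C-state is deeper than C1, where the LAPIC timer is
		 * not reliable.
		 */
		if ((eax >> MWAIT_SUBSTATE_SIZE) & MWAIT_CSTATE_MASK) {
			tick = true;
			tick_broadcast_enter();
		} else {
			tick = false;
		}
	}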
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 4dcb78ee57 upstream.
In certain system configurations it may not be desirable to use some
C-states assumed to be available by intel_idle and the driver needs
to be prevented from using them even before the cpuidle sysfs
interface becomes accessible to user space. Currently, the only way
to achieve that is by setting the 'max_cstate' module parameter to a
value lower than the index of the shallowest of the C-states in
question, but that may be overly intrusive, because it effectively
makes all of the idle states deeper than the 'max_cstate' one go
away (and the C-state to avoid may be in the middle of the range
normally regarded as available).
To allow that limitation to be overcome, introduce a new module
parameter called 'states_off' to represent a list of idle states to
be disabled by default in the form of a bitmask and update the
documentation to cover it.
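A minimal sketch of the parameter and its effect (the internal mask
name and the use of CPUIDLE_FLAG_OFF are assumptions about how this
lands in the 5.4 backport):

static unsigned int disabled_states_mask;
module_param_named(states_off, disabled_states_mask, uint, 0444);
MODULE_PARM_DESC(states_off, "Mask of disabled idle states");

	/* In the state-table setup loop: */
	if (disabled_states_mask & BIT(cstate))
		drv->states[drv->state_count].flags |= CPUIDLE_FLAG_OFF;

For example, booting with intel_idle.states_off=4 would leave idle
state 2 disabled by default while keeping the other states available.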
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 3a5be9b8f4 upstream.
For diagnostics, it is generally useful to be able to make intel_idle
take the system's ACPI tables into consideration even if that is not
required for the processor model in there, so introduce a new module
parameter, 'use_acpi', to make that happen and update the documentation
to cover it.
While at it, fix the 'no_acpi' module parameter name in the
documentation.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 86e9466ae6 upstream.
Move the irtl_ns_units[] definition into irtl_2_usec() which is the
only user of it, use div_u64() for the division in there (as the
divisor is small enough) and use the NSEC_PER_USEC symbol for the
divisor. Also convert the irtl_2_usec() comment to a proper
kerneldoc one.
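A sketch of the resulting helper, assuming the usual IRTL register
layout (10-bit time value, 3-bit unit select):

static unsigned long long irtl_2_usec(unsigned long long irtl)
{
	static const unsigned int irtl_ns_units[] = {
		1, 32, 1024, 32768, 1048576, 33554432, 0, 0
	};
	unsigned long long ns;

	if (!irtl)
		return 0;

	/* Select the unit and scale the 10-bit time value to ns. */
	ns = irtl_ns_units[(irtl >> 10) & 0x7];

	return div_u64((irtl & 0x3FF) * ns, NSEC_PER_USEC);
}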
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 1aefbd7aeb upstream.
Move intel_idle_verify_cstate(), auto_demotion_disable() and
c1e_promotion_disable() closer to their callers.
While at it, annotate intel_idle_verify_cstate() with __init,
as it is only used during the initialization of the driver.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 095928ae48 upstream.
Annotate the functions that are only used at the initialization time
with __init and the data structures used by them with __initdata or
__initconst.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit 3d3a1ae9b4 upstream.
Notice that intel_idle_state_table_update() only needs to be called
if icpu is not NULL, so fold it into intel_idle_init_cstates_icpu(),
and pass a pointer to the driver object to
intel_idle_cpuidle_driver_init() as an argument instead of
referencing it locally in there.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit a6c86e3362 upstream.
There is no particular reason why intel_idle_probe() needs to be
a separate function and folding it into intel_idle_init() causes
the code to be somewhat easier to follow, so do just that.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
commit cbd2c4c25d upstream.
The __setup_broadcast_timer() static function is only called in one
place and "true" is passed to it as the argument in there, so
effectively it is a wrapper around tick_broadcast_enable().
To simplify the code, call tick_broadcast_enable() directly instead
of __setup_broadcast_timer() and drop the latter.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Chen Zhuo <sagazchen@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
Add the 'sysctl_tcp_wnd_shrink' knob to enable or disable TCP window
shrinking. It is disabled by default.
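A minimal sketch of such a knob, assuming a plain boolean int under
net.ipv4 (the table placement and handler are assumptions about this
out-of-tree change):

int sysctl_tcp_wnd_shrink __read_mostly;	/* 0 = disabled (default) */

static struct ctl_table tcp_wnd_shrink_table[] = {
	{
		.procname	= "tcp_wnd_shrink",
		.data		= &sysctl_tcp_wnd_shrink,
		.maxlen		= sizeof(int),
		.mode		= 0644,
		.proc_handler	= proc_dointvec_minmax,
		.extra1		= SYSCTL_ZERO,
		.extra2		= SYSCTL_ONE,
	},
	{ }
};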
Signed-off-by: Menglong Dong <imagedong@tencent.com>
With GSO enabled, TCP write queues have less overhead, which makes
some applications run faster.
Results of redis-benchmark follow:
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: Mengen Sun <mengensun@tencent.com>
In the original logic, a zero-window probe can be raised not only on
a zero window, but also in other cases, such as when an MTU probe
fails. Therefore, modify tcp_probe0_needed() to keep it compatible
with the original logic.
Signed-off-by: Menglong Dong <imagedong@tencent.com>
upstream commit: c2407cf7d2
Ever since commit 2a9127fcf2 ("mm: rewrite wait_on_page_bit_common()
logic") we've had some very occasional reports of BUG_ON(PageWriteback)
in write_cache_pages(), which we thought we already fixed in commit
073861ed77 ("mm: fix VM_BUG_ON(PageTail) and BUG_ON(PageWriteback)").
But syzbot just reported another one, even with that commit in place.
And it turns out that there's a simpler way to trigger the BUG_ON() than
the one Hugh found with page re-use. It all boils down to the fact that
the page writeback is ostensibly serialized by the page lock, but that
isn't actually really true.
Yes, the people _setting_ writeback all do so under the page lock, but
the actual clearing of the bit - and waking up any waiters - happens
without any page lock.
This gives us this fairly simple race condition:
CPU1 = end previous writeback
CPU2 = start new writeback under page lock
CPU3 = write_cache_pages()
      CPU1            CPU2            CPU3
      ----            ----            ----

end_page_writeback()
  test_clear_page_writeback(page)
  ... delayed...

                lock_page();
                set_page_writeback()
                unlock_page()

                                lock_page()
                                wait_on_page_writeback();

  wake_up_page(page, PG_writeback);
  .. wakes up CPU3 ..

                                BUG_ON(PageWriteback(page));
where the BUG_ON() happens because we woke up the PG_writeback bit
because of the _previous_ writeback, but a new one had already been
started because the clearing of the bit wasn't actually atomic wrt the
actual wakeup or serialized by the page lock.
The reason this didn't use to happen was that the old logic in waiting
on a page bit would just loop if it ever saw the bit set again.
The nice proper fix would probably be to get rid of the whole "wait for
writeback to clear, and then set it" logic in the writeback path, and
replace it with an atomic "wait-to-set" (ie the same as we have for page
locking: we set the page lock bit with a single "lock_page()", not with
"wait for lock bit to clear and then set it").
However, our current model for writeback is that the waiting for the
writeback bit is done by the generic VFS code (ie write_cache_pages()),
but the actual setting of the writeback bit is done much later by the
filesystem ".writepages()" function.
IOW, to make the writeback bit have that same kind of "wait-to-set"
behavior as we have for page locking, we'd have to change our roughly
~50 different writeback functions. Painful.
Instead, just make "wait_on_page_writeback()" loop on the very unlikely
situation that the PG_writeback bit is still set, basically re-instating
the old behavior. This is very non-optimal in case of contention, but
since we only ever set the bit under the page lock, that situation is
controlled.
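The change is essentially the following loop in mm/page-writeback.c
(sketch based on the upstream fix):

void wait_on_page_writeback(struct page *page)
{
	/* Re-check after every wakeup: a new writeback may have started. */
	while (PageWriteback(page)) {
		trace_wait_on_page_writeback(page, page_mapping(page));
		wait_on_page_bit(page, PG_writeback);
	}
}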
Reported-by: syzbot+2fc0712f8f8b8b8fa0ef@syzkaller.appspotmail.com
Fixes: 2a9127fcf2 ("mm: rewrite wait_on_page_bit_common() logic")
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Bin Lai <robinlai@tencent.com>
commit 85cd39af14 upstream.
KVM creates a debugfs directory for each VM in order to store statistics
about the virtual machine. The directory name is built from the process
pid and a VM fd. While generally unique, it is possible to keep a
file descriptor alive in a way that causes duplicate directories, which
manifests as these messages:
[ 471.846235] debugfs: Directory '20245-4' with parent 'kvm' already present!
Even though this should not happen in practice, it is more or less
expected in the case of KVM for testcases that call KVM_CREATE_VM and
close the resulting file descriptor repeatedly and in parallel.
When this happens, debugfs_create_dir() returns an error but
kvm_create_vm_debugfs() goes on to allocate stat data structs which are
later leaked. The slow memory leak was spotted by syzkaller, where it
caused OOM reports.
Since the issue only affects debugfs, do a lookup before calling
debugfs_create_dir, so that the message is downgraded and rate-limited.
While at it, ensure kvm->debugfs_dentry is NULL rather than an error
if it is not created. This fixes kvm_destroy_vm_debugfs, which was not
checking IS_ERR_OR_NULL correctly.
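A condensed sketch of the approach in kvm_create_vm_debugfs() (error
handling and stat allocation omitted; details differ by kernel
version):

	char dir_name[ITOA_MAX_LEN * 2];
	struct dentry *dent;

	snprintf(dir_name, sizeof(dir_name), "%d-%d", task_pid_nr(current), fd);
	dent = debugfs_lookup(dir_name, kvm_debugfs_dir);
	if (dent) {
		pr_warn_ratelimited("KVM: debugfs: duplicate directory %s\n",
				    dir_name);
		dput(dent);
		dent = ERR_PTR(-EEXIST);
	} else {
		dent = debugfs_create_dir(dir_name, kvm_debugfs_dir);
	}
	if (IS_ERR(dent))
		return 0;	/* debugfs failures are not fatal */
	kvm->debugfs_dentry = dent;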
Cc: stable@vger.kernel.org
Fixes: 536a6f88c4 ("KVM: Create debugfs dir and stat files for each VM")
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Bin Lai <robinlai@tencent.com>
[upstream commit 35dfb01314]
When expire_nodest_conn=1 and a destination is deleted, IPVS does not
expire the existing connections until the next matching incoming packet.
If there are many connection entries from a single client to a single
destination, many packets may get dropped before all the connections are
expired (more likely with lots of UDP traffic). An optimization can be
made where upon deletion of a destination, IPVS queues up delayed work
to immediately expire any connections with a deleted destination. This
ensures any reused source ports from a client (within the IPVS timeouts)
are scheduled to new real servers instead of silently dropped.
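A sketch of the idea on the deletion path (names follow the upstream
patch loosely; the delayed-work field on struct netns_ipvs is added by
the patch):

	/* In __ip_vs_del_dest(), after the destination is unlinked: */
	if (sysctl_expire_nodest_conn(ipvs)) {
		/* Expire connections to deleted destinations shortly. */
		queue_delayed_work(system_long_wq,
				   &ipvs->expire_nodest_conn_work, 1);
	}

The work handler then walks the connection table and expires every
connection whose destination is no longer available.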
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Window shrinking is not allowed and also not handled for now, but it
is needed in some cases.
In the original logic, a zero-window probe is triggered only when
there is no data at all in the retransmit queue and the receive window
can't hold the data of the first packet in the send queue.
Now, change it and trigger the zero-window probe in these cases (see
the sketch after this list):
- the retransmit queue has data and its first packet is not within
the receive window
- there is no data in the retransmit queue and the first packet in the
send queue is out of the end of the receive window
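A sketch of tcp_probe0_needed() covering the two cases above (the body
is illustrative; the real patch also keeps the original zero-window
condition):

static bool tcp_probe0_needed(const struct sock *sk)
{
	const struct tcp_sock *tp = tcp_sk(sk);
	const struct sk_buff *skb = tcp_rtx_queue_head(sk);

	/* Case 1: the retransmit queue has data and its first packet
	 * does not fit within the receive window.
	 */
	if (skb)
		return after(TCP_SKB_CB(skb)->end_seq, tcp_wnd_end(tp));

	/* Case 2: the retransmit queue is empty and the first packet
	 * in the send queue ends beyond the end of the receive window.
	 */
	skb = tcp_send_head(sk);
	return skb && after(TCP_SKB_CB(skb)->end_seq, tcp_wnd_end(tp));
}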
Signed-off-by: Menglong Dong <imagedong@tencent.com>
For now, the skb is dropped when there is no memory, which makes the
client keep retransmitting until it times out, and that is not
friendly to users.
Therefore, force the current socket to receive one packet when
protocol memory is over the limit. The socket then stays in 'no mem'
status until protocol memory becomes available again.
While a socket is in 'no mem' status, its receive window becomes 0,
which means the window shrinks. The sender needs to handle such window
shrinking properly, which is done in the next commit.
On arm64 systems with little memory below 4G, the capture kernel
cannot use DMA memory. Therefore, it is necessary to enable
CONFIG_KEXEC_FILE and fix the reserved-memory handling so that low
memory is passed to the kdump kernel.
Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
Signed-off-by: Liu Chun <kaicliu@tencent.com>
Upstream: 3751e728ce
Link: 40e94ab32e
commit 3751e728ce
Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
Date: Mon Dec 16 11:12:47 2019 +0900
arm64: kexec_file: add crash dump support
Enabling crash dump (kdump) includes
* prepare contents of ELF header of a core dump file, /proc/vmcore,
using crash_prepare_elf64_headers(), and
* add two device tree properties, "linux,usable-memory-range" and
"linux,elfcorehdr", which represent respectively a memory range
to be used by crash dump kernel and the header's location
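For illustration, the two properties end up being appended to the
kdump kernel's device tree roughly like this (buffer handling elided;
the fields are the ones used by the arm64 kexec_file code):

	int off, ret;

	off = fdt_path_offset(dtb, "/chosen");
	ret = fdt_appendprop_addrrange(dtb, 0, off,
				       "linux,usable-memory-range",
				       crashk_res.start,
				       crashk_res.end - crashk_res.start + 1);
	if (!ret)
		ret = fdt_appendprop_addrrange(dtb, 0, off,
					       "linux,elfcorehdr",
					       image->arch.elf_headers_mem,
					       image->arch.elf_headers_sz);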
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Tested-and-reviewed-by: Bhupesh Sharma <bhsharma@redhat.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
Upstream: c273a2bd8a
Link: 887436bdb7
commit c273a2bd8a
Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
Date: Mon Dec 9 12:03:44 2019 +0900
libfdt: include fdt_addresses.c
In the implementation of kexec_file_load()-based kdump for arm64,
fdt_appendprop_addrrange() will be needed.
So include fdt_addresses.c in making libfdt.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
From: Yi Li <adamliyi@msn.com>
Link: 696027f109
The patch b2da6ad294
(arm64: kdump: reimplement crashkernel=X) depends on commit 1a8e1cef76
("arm64: use both ZONE_DMA and ZONE_DMA32").
Commit 1a8e1cef76 is not ported to the 5.4 kernel, so use
arm64_dma_phys_limit instead.
Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
From: Chen Zhou <chenzhou10@huawei.com>
Link: https://lkml.org/lkml/2021/1/30/53
Link: 023deaec32
For arm64, the behavior of crashkernel=X has been changed: it tries a
low allocation in the DMA zone (or the DMA32 zone if CONFIG_ZONE_DMA
is disabled) and falls back to a high allocation if that fails.
We can also use "crashkernel=X,high" to select a high region above the
DMA zone, which also tries to allocate at least 256M of low memory in
the DMA zone automatically (or in the DMA32 zone if CONFIG_ZONE_DMA is
disabled).
"crashkernel=Y,low" can be used to allocate a specified amount of low
memory.
So update the Documentation.
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Tested-by: John Donnelly <John.p.donnelly@oracle.com>
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
From: Chen Zhou <chenzhou10@huawei.com>
Link: https://lkml.org/lkml/2021/1/30/53
Link: 2012a3b392
When reserving crashkernel in high memory, some low memory is reserved
for crash dump kernel devices and never mapped by the first kernel.
This memory range is advertised to crash dump kernel via DT property
under /chosen,
linux,usable-memory-range = <BASE1 SIZE1 [BASE2 SIZE2]>
We reuse the DT property linux,usable-memory-range and pass the low
memory region as the second range "BASE2 SIZE2", which keeps
compatibility with existing user space and older kdump kernels.
The crash dump kernel reads this property at boot time and calls
memblock_add() to add the low memory region after
memblock_cap_memory_range() has been called.
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Tested-by: John Donnelly <John.p.donnelly@oracle.com>
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
From: Chen Zhou <chenzhou10@huawei.com>
Link: https://lkml.org/lkml/2021/1/30/53
Link: c8013ee6cd
Make the functions reserve_crashkernel[_low]() generic for x86 and
arm64. Since the reserve_crashkernel[_low]() implementations are quite
similar on other architectures as well, more users can adopt this
later.
So add CONFIG_ARCH_WANT_RESERVE_CRASH_KERNEL to arch/Kconfig and
select it from X86 and ARM64.
Suggested-by: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
From: Chen Zhou <chenzhou10@huawei.com>
Link: https://lkml.org/lkml/2021/1/30/53
Link: 70e586365f
There are the following issues in arm64 kdump:
1. We use crashkernel=X to reserve the crash kernel below 4G, which
fails when there is not enough low memory.
2. If the crash kernel is reserved above 4G, the crash dump kernel
fails to boot because there is no low memory available for
allocation.
3. Since commit 1a8e1cef76 ("arm64: use both ZONE_DMA and ZONE_DMA32"),
if the memory reserved for the crash dump kernel falls in ZONE_DMA32,
devices in the crash dump kernel that need ZONE_DMA will fail to
allocate.
To solve these issues, change the behavior of crashkernel=X and
introduce crashkernel=X,[high,low]. crashkernel=X tries a low
allocation in the DMA zone (or the DMA32 zone if CONFIG_ZONE_DMA is
disabled) and falls back to a high allocation if it fails.
We can also use "crashkernel=X,high" to select a region above the DMA
zone, which also tries to allocate at least 256M in the DMA zone
automatically (or in the DMA32 zone if CONFIG_ZONE_DMA is disabled).
"crashkernel=Y,low" can be used to allocate a specified amount of low
memory.
Another minor change: there may now be two regions reserved for the
crash dump kernel. To distinguish the low region from the high one
without affecting existing kexec-tools, rename the low region to
"Crash kernel (low)".
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Tested-by: John Donnelly <John.p.donnelly@oracle.com>
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
From: Chen Zhou <chenzhou10@huawei.com>
Link: https://lkml.org/lkml/2021/1/30/53
Link: 667118f8c1
Introduce the macro CRASH_ALIGN for alignment, CRASH_ADDR_LOW_MAX for
the upper bound of low crash memory, and CRASH_ADDR_HIGH_MAX for the
upper bound of high crash memory, and use the macros instead of
hard-coded values.
Besides, to keep consistent with x86, use CRASH_ALIGN as the lower
bound of the crash kernel reservation.
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Tested-by: John Donnelly <John.p.donnelly@oracle.com>
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
From: Chen Zhou <chenzhou10@huawei.com>
Link: https://lkml.org/lkml/2021/1/30/53
Link: b332ab8970
Move macro vmcore_elf_check_arch_cross from arch/x86/include/asm/kexec.h
to arch/x86/include/asm/elf.h to fix the following compiling warning:
In file included from arch/x86/kernel/setup.c:39:0:
./arch/x86/include/asm/kexec.h:77:0: warning: "vmcore_elf_check_arch_cross" redefined
# define vmcore_elf_check_arch_cross(x) ((x)->e_machine == EM_X86_64)
In file included from arch/x86/kernel/setup.c:9:0:
./include/linux/crash_dump.h:39:0: note: this is the location of the previous definition
#define vmcore_elf_check_arch_cross(x) 0
The root cause is that vmcore_elf_check_arch_cross under CONFIG_CRASH_CORE
depends on CONFIG_KEXEC_CORE. Commit 532b66d2279d ("x86: kdump: move
reserve_crashkernel[_low]() into crash_core.c") triggered the issue.
As suggested by Mike, simply move vmcore_elf_check_arch_cross from
arch/x86/include/asm/kexec.h to arch/x86/include/asm/elf.h to fix
the warning.
Fixes: 532b66d2279d ("x86: kdump: move reserve_crashkernel[_low]() into crash_core.c")
Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
From: Chen Zhou <chenzhou10@huawei.com>
Link: https://lkml.org/lkml/2021/1/30/53
Link: 8cb8686864
Make the functions reserve_crashkernel[_low]() generic.
Arm64 will use them to reimplement crashkernel=X.
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Tested-by: John Donnelly <John.p.donnelly@oracle.com>
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
From: Chen Zhou <chenzhou10@huawei.com>
Link: https://lkml.org/lkml/2021/1/30/53
Link: 8ec4a816f2
We will make the function reserve_crashkernel() generic. The
xen_pv_domain() check in reserve_crashkernel() is relevant only to
x86, as is insert_resource() in reserve_crashkernel[_low]().
So move the xen_pv_domain() check and the insert_resource() calls to
setup_arch() to keep them x86-specific.
Suggested-by: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Tested-by: John Donnelly <John.p.donnelly@oracle.com>
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
From: Chen Zhou <chenzhou10@huawei.com>
Link: https://lkml.org/lkml/2021/1/30/53
Link: a2e0b4351d
To make the function reserve_crashkernel() generic, replace some
hard-coded numbers with the macro CRASH_ADDR_LOW_MAX.
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Tested-by: John Donnelly <John.p.donnelly@oracle.com>
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
From: Chen Zhou <chenzhou10@huawei.com>
Link: https://lkml.org/lkml/2021/1/30/53
Link: 8882ba540e
The lower bounds of the crash kernel reservation and the crash kernel
low reservation are different; use the consistent value CRASH_ALIGN.
Suggested-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Tested-by: John Donnelly <John.p.donnelly@oracle.com>
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
From: Chen Zhou <chenzhou10@huawei.com>
Link: https://lkml.org/lkml/2021/1/30/53
Link: 873384fe79
Move CRASH_ALIGN to the header asm/kexec.h for later use. Besides, the
alignment of crash kernel regions on x86 is 16M (CRASH_ALIGN), but the
function reserve_crashkernel() also used a 1M alignment. So replace
the hard-coded 1M alignment with the macro CRASH_ALIGN.
Suggested-by: Dave Young <dyoung@redhat.com>
Suggested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Tested-by: John Donnelly <John.p.donnelly@oracle.com>
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
This conflicts with upstream's kdump high reservation support, and we
already have CONFIG_ZONE_DMA32 set, so we have:
ARCH_LOW_ADDRESS_LIMIT = min(offset + (1ULL << 32), memblock_end_of_DRAM());
which already limits the address below 4G, so this hard-coded limit is
redundant.
Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Bin Lai <robinlai@tencent.com>
This reverts commit 918f50807eccd63d482ef4cf778b1d2b416770a9.
That commit forced COW breaking on write, which caused page usage to
increase a lot. Upstream, commit 376a34efa ("mm/gup: refactor and
de-duplicate gup_fast() code") provides another way to fix the fork
security issue of COW, after which the buggy commit was reverted by
commit a308c71bf1 ("mm/gup: Remove enfornced COW mechanism").
Signed-off-by: Alex Shi <alexsshi@tencent.com>
Remove the function as the last reference has gone away with the do_wp_page()
changes.
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 1a0cf26323)
Signed-off-by: Alex Shi <alexsshi@tencent.com>
commit 09854ba94c upstream.
How about we just make sure we're the only possible valid user of the
page before we bother to reuse it?
Simplify, simplify, simplify.
And get rid of the nasty serialization on the page lock at the same time.
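A condensed sketch of the new reuse test in do_wp_page() for anonymous
pages (the copy path and surrounding code are elided):

	struct page *page = vmf->page;

	/* Reuse in place only if we hold the sole reference. */
	if (PageKsm(page) || page_count(page) != 1)
		goto copy;
	if (!trylock_page(page))
		goto copy;
	if (PageKsm(page) || page_mapcount(page) != 1 ||
	    page_count(page) != 1) {
		unlock_page(page);
		goto copy;
	}
	/* We have the only reference: reuse the page for the write. */
	unlock_page(page);
	wp_page_reuse(vmf);
	return VM_FAULT_WRITE;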
[peterx: add subject prefix]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 09854ba94c)
Signed-off-by: Alex Shi <alexsshi@tencent.com>
Conflicts:
mm/memory.c
Enable CONFIG_IOSCHED_BFQ and CONFIG_BFQ_GROUP_IOSCHED for ARM to
support the BFQ I/O scheduler.
Signed-off-by: Yuehong Wu <yuehongwu@tencent.com>
Signed-off-by: Bin Lai <robinlai@tencent.com>
[upstream commit 0550cfe8c2]
secid_to_secctx is not stackable, and since the BPF LSM registers this
hook by default, the call_int_hook logic, which "bails on fail", is
not suitable; it causes issues when other LSMs register this hook and
eventually breaks Audit.
In order to fix this, directly iterate over the security hooks instead
of using call_int_hook, as suggested in:
https://lore.kernel.org/bpf/9d0eb6c6-803a-ff3a-5603-9ad6d9edfc00@schaufler-ca.com/#t
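The resulting security_secid_to_secctx() iterates the hook list
directly, roughly as follows (sketch of the upstream change):

int security_secid_to_secctx(u32 secid, char **secdata, u32 *seclen)
{
	struct security_hook_list *hp;
	int rc;

	/*
	 * Currently, only one LSM can implement secid_to_secctx (i.e.
	 * the hook is not "stackable"); return the first non-default
	 * result instead of bailing on the first failure.
	 */
	hlist_for_each_entry(hp, &security_hook_heads.secid_to_secctx, list) {
		rc = hp->hook.secid_to_secctx(secid, secdata, seclen);
		if (rc != LSM_RET_DEFAULT(secid_to_secctx))
			return rc;
	}

	return LSM_RET_DEFAULT(secid_to_secctx);
}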
Fixes: 98e828a065 ("security: Refactor declaration of LSM hooks")
Fixes: 625236ba38 ("security: Fix the default value of secid_to_secctx hook")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: James Morris <jamorris@linux.microsoft.com>
Link: https://lore.kernel.org/bpf/20200520125616.193765-1-kpsingh@chromium.org
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Add a fixup in fast_copy_page. This feature is disabled by default;
set vm.fast_copy_page_enabled to enable it.
Signed-off-by: soonflywang <soonflywang@tencent.com>
Signed-off-by: caelli <caelli@tencent.com>
Reviewed-by: robinlai <robinlai@tencent.com>
When running on an Arm server, there is usually a NEON/VFP extension
on the server CPU. This patch leverages SIMD instructions to speed up
the current copy_page().
Signed-off-by: soonflywang <soonflywang@tencent.com>
Signed-off-by: Chengdong Li <chengdongli@tencent.com>
Reviewed-by: robinlai <robinlai@tencent.com>
There could be a use-after-free issue in dmi_sysfs_register_handle.
While handling specializations, entry->child can be freed if an error
occurs, but kobject_put() is then called on it after the free.
So set entry->child to NULL to avoid the case above.
Reported-by: loydlv <loydlv@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
The data could be freed before the transmit completes if the operation
is nonblocking. In this case, the regular free could lead to a double
free. So add the return value '-EPERM' to mark the case above.
Reported-by: loydlv <loydlv@tencent.com>
Signed-off-by: Xinghui Li <korantli@tencent.com>
Reviewed-by: Alex Shi <alexsshi@tencent.com>
commit 52762efa2b upstream.
The function displback_changed() has the call chain
displback_connect(front_info) -> xen_drm_drv_init(front_info).
We can see that drm_info is assigned to front_info->drm_info and that
drm_info is freed in the failure branch of xen_drm_drv_init().
Later displback_disconnect(front_info) is called, and it calls
xen_drm_drv_fini(front_info), which causes a use-after-free via the
drm_info = front_info->drm_info statement.
This patch does two things. First, it fixes the fail label taken when
drm_info = kzalloc() fails, which still freed drm_info.
Second, it sets front_info->drm_info to NULL to avoid the UAF.
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Reviewed-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323014656.10068-1-lyl2019@mail.ustc.edu.cn
Signed-off-by: Xinghui Li <korantli@tencent.com>
Reviewed-by: Robinlai <robinlai@tencent.com>