Commit 3777ea7bac upstream.
Letting xenbus_grant_ring() tear down grants in the error case is
problematic, as the other side could already have used these grants.
Calling gnttab_end_foreign_access_ref() without checking success results
in an unclear situation for any caller of xenbus_grant_ring(), as in the
error case the memory pages of the ring might be left partially mapped.
Freeing them would risk unwanted foreign access to them, while not
freeing them would leak memory.
In order to remove the need to undo any gnttab_grant_foreign_access()
calls, use gnttab_alloc_grant_references() to make sure no further
error can occur in the loop granting access to the ring pages.
It should be noted that this way of handling also removes the leaking
of grant entries in the error case.
This is CVE-2022-23040 / part of XSA-396.
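As a sketch, the resulting flow looks roughly like this (simplified;
xenbus_grant_ring_sketch and the gfn handling are illustrative, not the
exact upstream code):

int xenbus_grant_ring_sketch(struct xenbus_device *dev, void *vaddr,
			     unsigned int nr_pages, grant_ref_t *grefs)
{
	grant_ref_t gref_head;
	unsigned int i;
	int err;

	/* Reserve every reference up front: after this point the
	 * granting loop below can no longer fail, so nothing ever
	 * needs to be undone. */
	err = gnttab_alloc_grant_references(nr_pages, &gref_head);
	if (err) {
		xenbus_dev_fatal(dev, err, "granting access to ring page");
		return err;
	}

	for (i = 0; i < nr_pages; i++) {
		grefs[i] = gnttab_claim_grant_reference(&gref_head);
		gnttab_grant_foreign_access_ref(grefs[i], dev->otherend_id,
						virt_to_gfn(vaddr), 0);
		vaddr += XEN_PAGE_SIZE;
	}

	return 0;
}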
Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: samuelliao <samuelliao@tencent.com>
Commit d3b6372c58 upstream.
Using gnttab_query_foreign_access() is unsafe, as it is racy by design.
The use case in the gntalloc driver is not needed at all. While at it
replace the call of gnttab_end_foreign_access_ref() with a call of
gnttab_end_foreign_access(), which is what is really wanted there. In
case the grant wasn't used due to an allocation failure, just free the
grant via gnttab_free_grant_reference().
This is CVE-2022-23039 / part of XSA-396.
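A sketch of the resulting teardown (field names modeled on the gntalloc
driver; simplified):

	if (gref->gref_id) {
		if (gref->page) {
			/* The grant may have been used: end foreign
			 * access and let the core free the page once
			 * that is safe. */
			gnttab_end_foreign_access(gref->gref_id, 0,
				(unsigned long)page_address(gref->page));
		} else {
			/* Allocation failed before the grant was used:
			 * just return the unused reference. */
			gnttab_free_grant_reference(gref->gref_id);
		}
	}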
Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: samuelliao <samuelliao@tencent.com>
Commit 6b1775f26a upstream.
Add a new grant table function gnttab_try_end_foreign_access(), which
will remove and free a grant if it is not in use.
Its main use case is to either free a grant if it is no longer in use,
or to take some other action if it is still in use. This other action
can be an error exit, or (e.g. in the case of blkfront persistent grant
feature) some special handling.
This is CVE-2022-23036, CVE-2022-23038 / part of XSA-396.
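The intended usage pattern, sketched (caller-side names are
illustrative):

static void example_teardown(grant_ref_t ref, unsigned long page_addr)
{
	if (gnttab_try_end_foreign_access(ref)) {
		/* The grant was not in use: it has been removed and
		 * freed, so the backing page can be reclaimed. */
		free_page(page_addr);
	} else {
		/* Still in use by the other side: do NOT free the
		 * page; take the alternative action instead (error
		 * exit, or e.g. blkfront's persistent-grant
		 * handling). */
		pr_warn("grant still in use, deferring page release\n");
	}
}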
Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: samuelliao <samuelliao@tencent.com>
commit fe415186b4 upstream.
The Xen console driver is still vulnerable to an attack via an excessive
number of events sent by the backend. Fix that by using a lateeoi event
channel.
For the normal domU initial console this requires the introduction of
bind_evtchn_to_irq_lateeoi() as there is no xenbus device available
at the time the event channel is bound to the irq.
As deciding whether an interrupt was spurious requires testing whether
bytes have been read from the backend, move sending the event into the
if statement: sending an event without having found any bytes to read
makes no sense at all.
This is part of XSA-391
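A sketch of the resulting pattern (the handler shape and helper names
are illustrative, not the exact driver code):

	/* Binding, where no xenbus device exists yet: */
	irq = bind_evtchn_to_irq_lateeoi(evtchn);

static irqreturn_t xencons_irq_sketch(int irq, void *dev_id)
{
	unsigned int eoiflag = XEN_EOI_FLAG_SPURIOUS;

	if (hvc_poll(dev_id)) {
		/* Bytes were actually read: send the event here and
		 * treat the interrupt as genuine. */
		notify_backend(dev_id);		/* hypothetical helper */
		eoiflag = 0;
	}

	/* Unmask further events only now; event storms from the
	 * backend are throttled by the lateeoi machinery. */
	xen_irq_lateeoi(irq, eoiflag);
	return IRQ_HANDLED;
}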
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: samuelliao <samuelliao@tencent.com>
commit 38ea1eac7d upstream.
Check the size of the RNDIS_MSG_SET command given to us before
attempting to respond, so that we never act on an invalid message size.
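The added validation is along these lines (a sketch; field and limit
names follow the rndis function driver, exact bounds hedged):

	BufLength = le32_to_cpu(buf->InformationBufferLength);
	BufOffset = le32_to_cpu(buf->InformationBufferOffset);
	if ((BufLength > RNDIS_MAX_TOTAL_SIZE) ||
	    (BufOffset > RNDIS_MAX_TOTAL_SIZE) ||
	    (BufOffset + 8 >= RNDIS_MAX_TOTAL_SIZE))
		return -EINVAL;	/* reject before building any response */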
Reported-by: Szymon Heidrich <szymon.heidrich@gmail.com>
Cc: stable@kernel.org
Tested-by: Szymon Heidrich <szymon.heidrich@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: samuelliao <samuelliao@tencent.com>
commit 75e5b4849b upstream.
Stall the control endpoint in case the provided index exceeds the
MAX_CONFIG_INTERFACES array size or the retrieved function pointer is
null.
Signed-off-by: Szymon Heidrich <szymon.heidrich@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: samuelliao <samuelliao@tencent.com>
commit 89f3594d0d upstream.
dev->buf does not need to be released if it already exists before
dev_config() executes.
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Link: https://lore.kernel.org/r/20211231172138.7993-2-hbh25y@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: samuelliao <samuelliao@tencent.com>
commit c4f0126d48 upstream.
Since commit 4bc43a4212 ("HID: asus: Add
hid_is_using_ll_driver(usb_hid_driver) check"), hid-asus.c depends
on the usb_hid_driver symbol. Add a 'depends on USB_HID' to Kconfig to
fix missing-symbol errors in hid-asus when USB_HID is not enabled.
Fixes: 4bc43a4212 ("HID: asus: Add hid_is_using_ll_driver(usb_hid_driver) check")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Cc: Jason Self <jason@bluehome.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: samuelliao <samuelliao@tencent.com>
commit 93020953d0 upstream.
Many HID drivers assume that the HID device assigned to them is a USB
device, as that used to be the only way HID devices could be created in
Linux. However, with the additional ways that HID devices can now be
created for many different bus types, that is no longer true, so
properly check that we have a USB device associated with the HID device
before allowing a driver that makes this assumption to claim it.
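The shape of the added check, sketched for a hypothetical driver:

static int example_hid_probe(struct hid_device *hdev,
			     const struct hid_device_id *id)
{
	struct usb_interface *intf;

	/* Bail out early for non-USB transports (e.g. a virtual
	 * device created through uhid). */
	if (!hid_is_usb(hdev))
		return -EINVAL;

	/* Only now is the parent guaranteed to be a USB interface. */
	intf = to_usb_interface(hdev->dev.parent);
	dev_info(&intf->dev, "probing a real USB HID device\n");

	return hid_parse(hdev);
}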
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: Michael Zaidman <michael.zaidman@gmail.com>
Cc: Stefan Achatz <erazor_de@users.sourceforge.net>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: linux-input@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
[bentiss: amended for thrustmater.c hunk to apply]
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20211201183503.2373082-3-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: samuelliao <samuelliao@tencent.com>
commit 720ac46720 upstream.
The wacom driver accepts devices of more than just USB types, but some
code paths can cause problems if the device being controlled is not a
USB device due to a lack of checking. Add the needed checks to ensure
that the USB device accesses are only happening on a "real" USB device,
and not one on some other bus.
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: linux-input@vger.kernel.org
Cc: stable@vger.kernel.org
Tested-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20211201183503.2373082-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: samuelliao <samuelliao@tencent.com>
commit 918aa1ef10 upstream.
When emulating the device through uhid, there is a chance we don't have
output reports and so report_field is null.
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20211202095334.14399-3-benjamin.tissoires@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: samuelliao <samuelliao@tencent.com>
commit f237d9028f upstream.
Some HID drivers are only for USB devices, yet did not depend on
CONFIG_USB_HID. This was hidden by the fact that the USB functions were
stubbed out in the past, but now that drivers are checking for USB
devices properly, build errors can occur with some random
configurations.
Reported-by: kernel test robot <lkp@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20211202114819.2511954-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: samuelliao <samuelliao@tencent.com>
commit d080811f27 upstream.
The chicony HID driver only controls USB devices, yet did not have a
dependency on USB_HID. This causes build errors on some configurations,
like sparc, due to new changes to the chicony driver.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: stable@vger.kernel.org
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20211203075927.2829218-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: samuelliao <samuelliao@tencent.com>
commit 30cb3c2ad2 upstream.
The prodikeys HID driver only controls USB devices, yet did not have a
dependency on USB_HID. This causes build errors on some configurations,
like nios2, due to new changes to the prodikeys driver.
Reported-by: kernel test robot <lkp@intel.com>
Cc: stable@vger.kernel.org
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20211203081231.2856936-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: samuelliao <samuelliao@tencent.com>
commit f83baa0cb6 upstream.
A number of HID drivers already call hid_is_using_ll_driver(), but only
to detect whether the device is a USB device. Make this more obvious by
creating hid_is_usb() and calling it instead. Also convert the existing
hid_is_using_ll_driver() callers to use the new helper.
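Per the commit description, the helper is a thin, clearly named
wrapper, essentially:

static inline bool hid_is_usb(struct hid_device *hdev)
{
	return hid_is_using_ll_driver(hdev, &usb_hid_driver);
}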
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: linux-input@vger.kernel.org
Cc: stable@vger.kernel.org
Tested-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20211201183503.2373082-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: samuelliao <samuelliao@tencent.com>
commit 153a2d7e33 upstream.
Sometimes USB hosts can ask for buffers that are too large from endpoint
0, which should not be allowed. If this happens for OUT requests, stall
the endpoint, but for IN requests, trim the request size to the endpoint
buffer size.
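A sketch of the clamping logic in the setup path (simplified; the
buffer-size constant follows the commit's naming):

	if (w_length > USB_COMP_EP0_BUFSIZ) {
		if (ctrl->bRequestType & USB_DIR_IN) {
			/* IN transfer: answer with at most what our
			 * ep0 buffer can hold. */
			w_length = USB_COMP_EP0_BUFSIZ;
		} else {
			/* OUT transfer: we cannot receive that much;
			 * fall through to stall the control endpoint. */
			goto done;
		}
	}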
Co-developed-by: Szymon Heidrich <szymon.heidrich@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: samuelliao <samuelliao@tencent.com>
commit 92ee3c60ec upstream.
Currently we have neither a proper check nor protection against
concurrent calls of the PCM hw_params and hw_free ioctls, which may
result in a use-after-free. Since the existing PCM stream lock can't be
used for protecting the whole ioctl operations, we need a new mutex to
protect those racy calls.
This patch introduces a new mutex, runtime->buffer_mutex, and applies it
to both the hw_params and hw_free ioctl code paths. Along with it, both
functions are slightly modified (the mmap_count check is moved into the
state-check block) for code simplicity.
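A sketch of the locking pattern (the ioctl internals are elided;
do_hw_params() is a hypothetical stand-in):

static int do_hw_params(struct snd_pcm_substream *s,
			struct snd_pcm_hw_params *p);	/* hypothetical */

static int snd_pcm_hw_params_locked(struct snd_pcm_substream *substream,
				    struct snd_pcm_hw_params *params)
{
	struct snd_pcm_runtime *runtime = substream->runtime;
	int err;

	/* The PCM stream lock is a spinlock and cannot be held across
	 * the whole (sleeping) ioctl; the new mutex serializes
	 * hw_params against hw_free instead. */
	mutex_lock(&runtime->buffer_mutex);
	err = do_hw_params(substream, params);
	mutex_unlock(&runtime->buffer_mutex);

	return err;
}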
Reported-by: Hu Jiahui <kirin.say@gmail.com>
Cc: <stable@vger.kernel.org>
Reviewed-by: Jaroslav Kysela <perex@perex.cz>
Link: https://lore.kernel.org/r/20220322170720.3529-2-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
[OP: backport to 5.4: adjusted context]
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: samuelliao <samuelliao@tencent.com>
commit 54309fde1a upstream.
On reads with MMC_READ_MULTIPLE_BLOCK that fail, the recovery handler
will use MMC_READ_SINGLE_BLOCK for each of the blocks, up to
MMC_READ_SINGLE_RETRIES times each. The logic for this is fixed to never
report unsuccessful reads as success to the block layer.
On a command error with retries remaining, blk_update_request was called
with whatever value 'error' was last set to. In case it was last set to
BLK_STS_OK (the default), the read would be reported as success even
though no data was read from the device. This could happen on a CRC
mismatch for the response, or on the card rejecting the command (e.g.,
again due to a CRC mismatch). In case it was last set to BLK_STS_IOERR,
the error is reported correctly, but no retries will be attempted.
Fixes: 81196976ed ("mmc: block: Add blk-mq support")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Loehle <cloehle@hyperstone.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/bc706a6ab08c4fe2834ba0c05a804672@hyperstone.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: samuelliao <samuelliao@tencent.com>
commit dfd0743f1d upstream.
Since the tee subsystem does not keep a strong reference to its idle
shared memory buffers, it races with other threads that try to destroy a
shared memory through a close of its dma-buf fd or by unmapping the
memory.
In tee_shm_get_from_id() when a lookup in teedev->idr has been
successful, it is possible that the tee_shm is in the dma-buf teardown
path, but that path is blocked by the teedev mutex. Since we don't have
an API to tell if the tee_shm is in the dma-buf teardown path or not we
must find another way of detecting this condition.
Fix this by doing the reference counting directly on the tee_shm using a
new refcount_t refcount field. dma-buf is replaced by using
anon_inode_getfd() instead, this separates the life-cycle of the
underlying file from the tee_shm. tee_shm_put() is updated to hold the
mutex when decreasing the refcount to 0 and then remove the tee_shm from
teedev->idr before releasing the mutex. This means that the tee_shm can
never be found unless it has a refcount larger than 0.
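A sketch of the resulting put path (closely following the description
above; details hedged against the actual backport):

void tee_shm_put_sketch(struct tee_shm *shm)
{
	struct tee_device *teedev = shm->teedev;
	bool do_release = false;

	mutex_lock(&teedev->mutex);
	if (refcount_dec_and_test(&shm->refcount)) {
		/* Remove from the IDR while still holding the mutex,
		 * so a concurrent tee_shm_get_from_id() can never see
		 * an shm whose refcount already dropped to zero. */
		idr_remove(&teedev->idr, shm->id);
		do_release = true;
	}
	mutex_unlock(&teedev->mutex);

	if (do_release)
		tee_shm_release(shm);	/* frees memory and bookkeeping */
}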
Fixes: 967c9cca2c ("tee: generic TEE subsystem")
Cc: stable@vger.kernel.org
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Lars Persson <larper@axis.com>
Reviewed-by: Sumit Garg <sumit.garg@linaro.org>
Reported-by: Patrik Lantz <patrik.lantz@axis.com>
[JW: backport to 5.4-stable]
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: samuelliao <samuelliao@tencent.com>
commit bd2db32e7c upstream.
It was reported that the mmc host structure could be accessed after it
was freed in moxart_remove(), so fix this by saving the base register of
the device and using it instead of the pointer dereference.
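The fix amounts to snapshotting the base register first (a sketch;
the register offset is illustrative):

static int moxart_remove_sketch(struct platform_device *pdev)
{
	struct mmc_host *mmc = platform_get_drvdata(pdev);
	struct moxart_host *host = mmc_priv(mmc);
	void __iomem *base = host->base;	/* save before teardown */

	mmc_remove_host(mmc);
	mmc_free_host(mmc);	/* 'host' must not be touched after this */

	/* Quiesce the hardware through the saved pointer only. */
	writel(0, base + REG_INTERRUPT_MASK);

	return 0;
}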
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: Xin Xiong <xiongx18@fudan.edu.cn>
Cc: Xin Tan <tanxin.ctf@gmail.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Cc: linux-mmc@vger.kernel.org
Cc: stable <stable@vger.kernel.org>
Reported-by: whitehat002 <hackyzh002@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220127071638.4057899-1-gregkh@linuxfoundation.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: samuelliao <samuelliao@tencent.com>
commit a0f90c8815 upstream.
A failing usercopy of the fence_rep object will lead to a stale entry in
the file descriptor table as put_unused_fd() won't release it. This
enables userland to refer to a dangling 'file' object through that still
valid file descriptor, leading to all kinds of use-after-free
exploitation scenarios.
Fix this by deferring the call to fd_install() until after the usercopy
has succeeded.
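A sketch of the corrected ordering (based on the description above;
surrounding details elided, 'sync_file' is illustrative):

	fd = get_unused_fd_flags(O_CLOEXEC);
	if (fd < 0)
		return fd;

	/* Copy the fence_rep to userspace first; only publish the fd
	 * once nothing can fail anymore. */
	if (copy_to_user(user_fence_rep, &fence_rep, sizeof(fence_rep))) {
		put_unused_fd(fd);	/* releases the reserved slot */
		fput(sync_file);	/* drop the unpublished file */
		return -EFAULT;
	}

	fd_install(fd, sync_file);	/* the fd can no longer dangle */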
Fixes: c906965dee ("drm/vmwgfx: Add export fence to file descriptor support")
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
Signed-off-by: Zack Rusin <zackr@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: samuelliao <samuelliao@tencent.com>
commit 7938d61591 upstream.
We need to flush TLBs before releasing backing store, otherwise userspace
is able to encounter stale entries if a) it is not declaring access to
certain buffers and b) it races with the backing store release from such
undeclared execution already running on the GPU in parallel.
The approach taken is to mark any buffer objects which were ever bound
to the GPU and to trigger a serialized TLB flush when their backing
store is released.
Alternatively the flushing could be done on VMA unbind, at which point
we would be able to ascertain whether there is potentially a parallel
GPU execution (which could race), but essentially it boils down to
paying the cost of TLB flushes potentially needlessly at VMA unbind time
(when the backing store is not known to be going away, so not needed for
safety), versus potentially needlessly at backing store release time
(since at that point we cannot tell whether there is anything executing
on the GPU which uses that object).
Therefore simplicity of implementation has been chosen for now, with
scope to benchmark and refine later as required.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reported-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Dave Airlie <airlied@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: samuelliao <samuelliao@tencent.com>
[tapd]
ID877978657
This configuration resulted in a 15% regression in unixbench's
execl testing. After this patch, the additional enhancement can be
turned on with rodata=full.
Signed-off-by: johnnyaiai <johnnyaiai@tencent.com>
This option enables a transparent branch optimization that
makes certain almost-always-true or almost-always-false branch
conditions even cheaper to execute within the kernel.
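This is the static-key (jump label) mechanism; a minimal usage sketch:

#include <linux/jump_label.h>

extern void do_rare_work(void);		/* hypothetical slow-path hook */

static DEFINE_STATIC_KEY_FALSE(fast_path_enabled);

void hot_path(void)
{
	/* With CONFIG_JUMP_LABEL this compiles down to a runtime-patched
	 * nop rather than a load, test and conditional branch. */
	if (static_branch_unlikely(&fast_path_enabled))
		do_rare_work();
}

/* Toggled rarely, e.g. from setup code or a sysctl handler: */
void enable_fast_path(void)
{
	static_branch_enable(&fast_path_enabled);
}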
Signed-off-by: johnnyaiai <johnnyaiai@tencent.com>
Reviewed-by: Alex Shi <alexsshi@tencent.com>
Reviewed-by: robinlai <robinlai@tencent.com>
Performance degradation due to multiple cores contending for a
global spinlock.
Signed-off-by: johnnyaiai <johnnyaiai@tencent.com>
Reviewed-by: Alex Shi <alexsshi@tencent.com>
Reviewed-by: robinlai <robinlai@tencent.com>
Patch series "generic_file_buffered_read() improvements", v2.
upstream commit id: 723ef24b9b
generic_file_buffered_read() has turned into a real monstrosity to work
with. And this series is a major performance improvement, for both small
random and large sequential reads. On my test box, 4k buffered random
reads go from ~150k to ~250k iops, and the improvements to big
sequential reads are even bigger.
This incorporates the fix for IOCB_WAITQ handling that Jens just posted
as well, and also factors out lock_page_for_iocb() to improve handling
of the various iocb flags.
This patch (of 2):
This is prep work for changing generic_file_buffered_read() to use
find_get_pages_contig() to batch up all the pagecache lookups.
This patch should be functionally identical to the existing code and
changes as little of the flow control as possible. More refactoring
could be done; this patch is intended to be relatively minimal.
Link: https://lkml.kernel.org/r/20201025212949.602194-1-kent.overstreet@gmail.com
Link: https://lkml.kernel.org/r/20201025212949.602194-2-kent.overstreet@gmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ni Xun <richardni@tencent.com>
Reviewed-by: Alex Shi <alexsshi@tencent.com>
Reviewed-by: robinlai <robinlai@tencent.com>
upstream commit id: d85dc2e116
Export generic_file_buffered_read() to be used to supplement incomplete
direct reads.
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Ni Xun <richardni@tencent.com>
Reviewed-by: Alex Shi <alexsshi@tencent.com>
Reviewed-by: robinlai <robinlai@tencent.com>
upstream commit id: 0f8e2db4ea
The first argument of shrink_readahead_size_eio() is not used. Hence
remove it from the function definition and from all the callers.
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1583868093-24342-1-git-send-email-jrdr.linux@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ni Xun <richardni@tencent.com>
Reviewed-by: Alex Shi <alexsshi@tencent.com>
Reviewed-by: robinlai <robinlai@tencent.com>
upstream commit id: 5ef64cc898
Commit 2a9127fcf2 ("mm: rewrite wait_on_page_bit_common() logic") made
the page locking entirely fair, in that if a waiter came in while the
lock was held, the lock would be transferred to the lockers strictly in
order.
That was intended to finally get rid of the long-reported watchdog
failures that involved the page lock under extreme load, where a process
could end up waiting essentially forever, as other page lockers stole
the lock from under it.
It also improved some benchmarks, but it ended up causing huge
performance regressions on others, simply because fair lock behavior
doesn't end up giving out the lock as aggressively, causing better
worst-case latency, but potentially much worse average latencies and
throughput.
Instead of reverting that change entirely, this introduces a controlled
amount of unfairness, with a sysctl knob to tune it if somebody needs
to. But the default value should hopefully be good for any normal load,
allowing a few rounds of lock stealing, but enforcing the strict
ordering before the lock has been stolen too many times.
There is also a hint from Matthieu Baerts that the fair page coloring
may end up exposing an ABBA deadlock that is hidden by the usual
optimistic lock stealing, and while the unfairness doesn't fix the
fundamental issue (and I'm still looking at that), it avoids it in
practice.
The amount of unfairness can be modified by writing a new value to the
'sysctl_page_lock_unfairness' variable (default value of 5, exposed
through /proc/sys/vm/page_lock_unfairness), but that is hopefully
something we'd use mainly for debugging rather than being necessary for
any deep system tuning.
This whole issue has exposed just how critical the page lock can be, and
how contended it gets under certain loads. And the main contention
doesn't really seem to be anything related to IO (which was the origin
of this lock), but for things like just verifying that the page file
mapping is stable while faulting in the page into a page table.
Link: https://lore.kernel.org/linux-fsdevel/ed8442fd-6f54-dd84-cd4a-941e8b7ee603@MichaelLarabel.com/
Link: https://www.phoronix.com/scan.php?page=article&item=linux-50-59&num=1
Link: https://lore.kernel.org/linux-fsdevel/c560a38d-8313-51fb-b1ec-e904bd8836bc@tessares.net/
Reported-and-tested-by: Michael Larabel <Michael@michaellarabel.com>
Tested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ni Xun <richardni@tencent.com>
Reviewed-by: Alex Shi <alexsshi@tencent.com>
Reviewed-by: robinlai <robinlai@tencent.com>
upstream commit id: c6fe44d96f
That gives us ordering guarantees around the pair.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ni Xun <richardni@tencent.com>
Reviewed-by: Alex Shi <alexsshi@tencent.com>
Reviewed-by: robinlai <robinlai@tencent.com>
upstream commit id: 2a9127fcf2
It turns out that wait_on_page_bit_common() had several problems,
ranging from unfair behavior due to re-queueing at the end of the wait
queue when retrying, to an outright bug that could result in missed
wakeups (but probably never happened in practice).
This rewrites the whole logic to avoid both issues, by simply moving the
logic to check (and possibly take) the bit lock into the wakeup path
instead.
That makes everything much more straightforward, and means that we never
need to re-queue the wait entry: if we get woken up, we'll be notified
through WQ_FLAG_WOKEN, and the wait queue entry will have been removed,
and everything will have been done for us.
Link: https://lore.kernel.org/lkml/CAHk-=wjJA2Z3kUFb-5s=6+n0qbTs8ELqKFt9B3pH85a8fGD73w@mail.gmail.com/
Link: https://lore.kernel.org/lkml/alpine.LSU.2.11.2007221359450.1017@eggly.anvils/
Reported-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ni Xun <richardni@tencent.com>
Reviewed-by: Alex Shi <alexsshi@tencent.com>
Reviewed-by: robinlai <robinlai@tencent.com>
upstream commit id: 41da51bce3
Add an IOCB_NOIO flag that indicates to generic_file_read_iter that it
shouldn't trigger any filesystem I/O for the actual request or for
readahead. This allows doing tentative reads out of the page cache, as
some filesystems allow, and taking the appropriate locks and retrying
the reads only if the requested pages are not cached.
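A sketch of the intended caller pattern (modeled on how a filesystem
might use the flag; names are hypothetical):

static ssize_t example_read_iter(struct kiocb *iocb, struct iov_iter *to)
{
	ssize_t ret, written = 0;

	/* Tentative, cache-only pass: no readahead, no readpage I/O. */
	iocb->ki_flags |= IOCB_NOIO;
	ret = generic_file_read_iter(iocb, to);
	iocb->ki_flags &= ~IOCB_NOIO;

	if (ret >= 0) {
		if (!iov_iter_count(to))
			return ret;	/* fully served from the page cache */
		written = ret;
	} else if (ret != -EAGAIN) {
		return ret;
	}

	/* ... take the appropriate filesystem locks here ... */

	ret = generic_file_read_iter(iocb, to);	/* may now do I/O */
	if (ret > 0)
		return written + ret;
	return written ? written : ret;
}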
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Ni Xun <richardni@tencent.com>
Reviewed-by: Alex Shi <alexsshi@tencent.com>
Reviewed-by: robinlai <robinlai@tencent.com>
upstream commit id: 13bd691421
Once we've copied some data for an iocb that is marked with IOCB_WAITQ,
we should no longer attempt to async lock a new page. Instead make sure
we return the copied amount, and let the caller retry, instead of
returning -EIOCBQUEUED for a new page.
This should only be possible with read-ahead disabled on the below
device, and multiple threads racing on the same file. Haven't been able
to reproduce on anything else.
Cc: stable@vger.kernel.org # v5.9
Fixes: 1a0a7853b9 ("mm: support async buffered reads in generic_file_buffered_read()")
Reported-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ni Xun <richardni@tencent.com>
Reviewed-by: Alex Shi <alexsshi@tencent.com>
Reviewed-by: robinlai <robinlai@tencent.com>
upstream commit id: 073861ed77
Twice now, when exercising ext4 looped on shmem huge pages, I have crashed
on the PF_ONLY_HEAD check inside PageWaiters(): ext4_finish_bio() calling
end_page_writeback() calling wake_up_page() on tail of a shmem huge page,
no longer an ext4 page at all.
The problem is that PageWriteback is not accompanied by a page reference
(as the NOTE at the end of test_clear_page_writeback() acknowledges): as
soon as TestClearPageWriteback has been done, that page could be removed
from page cache, freed, and reused for something else by the time that
wake_up_page() is reached.
https://lore.kernel.org/linux-mm/20200827122019.GC14765@casper.infradead.org/
Matthew Wilcox suggested avoiding or weakening the PageWaiters() tail
check; but I'm paranoid about even looking at an unreferenced struct page,
lest its memory might itself have already been reused or hotremoved (and
wake_up_page_bit() may modify that memory with its ClearPageWaiters()).
Then on crashing a second time, realized there's a stronger reason against
that approach. If my testing just occasionally crashes on that check,
when the page is reused for part of a compound page, wouldn't it be much
more common for the page to get reused as an order-0 page before reaching
wake_up_page()? And on rare occasions, might that reused page already be
marked PageWriteback by its new user, and already be waited upon? What
would that look like?
It would look like BUG_ON(PageWriteback) after wait_on_page_writeback()
in write_cache_pages() (though I have never seen that crash myself).
Matthew Wilcox explaining this to himself:
"page is allocated, added to page cache, dirtied, writeback starts,
--- thread A ---
filesystem calls end_page_writeback()
test_clear_page_writeback()
--- context switch to thread B ---
truncate_inode_pages_range() finds the page, it doesn't have writeback set,
we delete it from the page cache. Page gets reallocated, dirtied, writeback
starts again. Then we call write_cache_pages(), see
PageWriteback() set, call wait_on_page_writeback()
--- context switch back to thread A ---
wake_up_page(page, PG_writeback);
... thread B is woken, but because the wakeup was for the old use of
the page, PageWriteback is still set.
Devious"
And prior to 2a9127fcf2 ("mm: rewrite wait_on_page_bit_common() logic")
this would have been much less likely: before that, wake_page_function()'s
non-exclusive case would stop walking and not wake if it found Writeback
already set again; whereas now the non-exclusive case proceeds to wake.
I have not thought of a fix that does not add a little overhead: the
simplest fix is for end_page_writeback() to get_page() before calling
test_clear_page_writeback(), then put_page() after wake_up_page().
Was there a chance of missed wakeups before, since a page freed before
reaching wake_up_page() would have PageWaiters cleared? I think not,
because each waiter does hold a reference on the page. This bug comes
when the old use of the page, the one we do TestClearPageWriteback on,
had *no* waiters, so no additional page reference beyond the page cache
(and whoever racily freed it). The reuse of the page has a waiter
holding a reference, and its own PageWriteback set; but the belated
wake_up_page() has woken the reuse to hit that BUG_ON(PageWriteback).
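The shape of that simplest fix, simplified (the real function also has
PG_reclaim handling, elided here):

void end_page_writeback(struct page *page)
{
	/* Writeback does not hold a page reference of its own, so pin
	 * the page across the wakeup: after TestClearPageWriteback the
	 * page could otherwise be freed and reused before
	 * wake_up_page() runs. */
	get_page(page);
	test_clear_page_writeback(page);

	smp_mb__after_atomic();
	wake_up_page(page, PG_writeback);
	put_page(page);
}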
Reported-by: syzbot+3622cea378100f45d59f@syzkaller.appspotmail.com
Reported-by: Qian Cai <cai@lca.pw>
Fixes: 2a9127fcf2 ("mm: rewrite wait_on_page_bit_common() logic")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org # v5.8+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ni Xun <richardni@tencent.com>
Reviewed-by: Alex Shi <alexsshi@tencent.com>
Reviewed-by: robinlai <robinlai@tencent.com>
upstream commit id: 0abed7c69b
We catch the case where we enter generic_file_buffered_read() with data
already transferred, but we also need to be careful not to allow an async
page lock if we're looping transferring data. If not, we could be
returning -EIOCBQUEUED instead of the transferred amount, and it could
result in double waitqueue additions as well.
Cc: stable@vger.kernel.org # v5.9
Fixes: 1a0a7853b9 ("mm: support async buffered reads in generic_file_buffered_read()")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ni Xun <richardni@tencent.com>
Reviewed-by: Alex Shi <alexsshi@tencent.com>
Reviewed-by: robinlai <robinlai@tencent.com>
upstream commit id: c8d317aa18
The async buffered reads feature is not working when readahead is
turned off. There are two things of concern:
- when doing a retry in io_read, not only the IOCB_WAITQ flag but also
the IOCB_NOWAIT flag is still set, which makes it go to the would_block
phase in generic_file_buffered_read() and then return -EAGAIN. After
that, the io-wq thread work is queued, and the async reads are later
done in the old way.
- even if we remove IOCB_NOWAIT when doing the retry, the feature is
still not running properly, since generic_file_buffered_read() goes to
lock_page_killable() after calling mapping->a_ops->readpage() to do
IO, thus causing the process to sleep.
Fixes: 1a0a7853b9 ("mm: support async buffered reads in generic_file_buffered_read()")
Fixes: 3b2a4439e0 ("io_uring: get rid of kiocb_wait_page_queue_init()")
Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ni Xun <richardni@tencent.com>
Reviewed-by: Alex Shi <alexsshi@tencent.com>
Reviewed-by: robinlai <robinlai@tencent.com>
upstream commit id: 1a0a7853b9
Use the async page locking infrastructure, if IOCB_WAITQ is set in the
passed in iocb. The caller must expect an -EIOCBQUEUED return value,
which means that IO is started but not done yet. This is similar to how
O_DIRECT signals the same operation. Once the callback is received by
the caller for IO completion, the caller must retry the operation.
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ni Xun <richardni@tencent.com>
Reviewed-by: Alex Shi <alexsshi@tencent.com>
Reviewed-by: robinlai <robinlai@tencent.com>
upstream commit id: dd3e6d5039
Normally waiting for a page to become unlocked, or locking the page,
requires waiting for IO to complete. Add support for lock_page_async()
and wait_on_page_locked_async(), which are callback based instead. This
allows a caller to get notified when a page becomes unlocked, rather
than wait for it.
We add a new iocb field, ki_waitq, to pass in the necessary data for this
to happen. We can unionize this with ki_cookie, since that is only used
for polled IO. Polled IO can never co-exist with async callbacks, as it is
(by definition) polled completions. struct wait_page_key is made public,
and we define struct wait_page_async as the interface between the caller
and the core.
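The locking helper ends up looking essentially like this (sketch;
__lock_page_async() queues the waiter and returns -EIOCBQUEUED when the
caller will be woken later through ki_waitq):

static inline int lock_page_async(struct page *page,
				  struct wait_page_queue *wait)
{
	/* 0 if the lock was taken immediately, -EIOCBQUEUED if the
	 * wait entry was queued and the caller gets a callback. */
	if (!trylock_page(page))
		return __lock_page_async(page, wait);
	return 0;
}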
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ni Xun <richardni@tencent.com>
Reviewed-by: Alex Shi <alexsshi@tencent.com>
Reviewed-by: robinlai <robinlai@tencent.com>
upstream commit id: c7510ab2cf
No functional changes in this patch, just in preparation for allowing
more callers.
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ni Xun <richardni@tencent.com>
Reviewed-by: Alex Shi <alexsshi@tencent.com>
Reviewed-by: robinlai <robinlai@tencent.com>
mainline inclusion
from mainline-5.15-rc4
commit cc883236b7
category: perf
bugzilla: NA
CVE: NA
---------------------------
After we factor out the inline data write procedure from
ext4_da_write_end(), we don't need to start a journal handle for the
cases of both buffer overwrite and append-write. If we need to update
i_disksize, mark_inode_dirty() will start a handle and update the inode
buffer. So we can just remove all the journal handle code in the
delalloc write procedure.
After this patch, we get a lot of performance improvement. Below is the
Unixbench comparison data from a test on my machine with an 'Intel Xeon
Gold 5120' CPU and an NVMe SSD backend.
Test cmd:
./Run -c 56 -i 3 fstime fsbuffer fsdisk
Before this patch:
System Benchmarks Partial Index BASELINE RESULT INDEX
File Copy 1024 bufsize 2000 maxblocks 3960.0 422965.0 1068.1
File Copy 256 bufsize 500 maxblocks 1655.0 105077.0 634.9
File Copy 4096 bufsize 8000 maxblocks 5800.0 1429092.0 2464.0
======
System Benchmarks Index Score (Partial Only) 1186.6
After this patch:
System Benchmarks Partial Index BASELINE RESULT INDEX
File Copy 1024 bufsize 2000 maxblocks 3960.0 732716.0 1850.3
File Copy 256 bufsize 500 maxblocks 1655.0 184940.0 1117.5
File Copy 4096 bufsize 8000 maxblocks 5800.0 2427152.0 4184.7
======
System Benchmarks Index Score (Partial Only) 2053.0
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210716122024.1105856-5-yi.zhang@huawei.com
Reviewed-by: Yang Erkun <yangerkun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Ni Xun <richardni@tencent.com>
mainline inclusion
from mainline-5.15-rc4
commit 6984aef598
category: perf
bugzilla: NA
CVE: NA
---------------------------
Now that the inline_data file write-end procedure has been folded into
the common write-end functions, it is not clear. Factor it out and do
some cleanup. This patch also drops ext4_da_write_inline_data_end()
and switches to ext4_write_inline_data_end() instead, because we also
need to do the same error processing if we fail to write data into the
inline entry.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210716122024.1105856-4-yi.zhang@huawei.com
Conflicts:
fs/ext4/inline.c
fs/ext4/inode.c
Reviewed-by: Yang Erkun <yangerkun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Ni Xun <richardni@tencent.com>
mainline inclusion
from mainline-5.15-rc4
commit 55ce2f649b
category: perf
bugzilla: NA
CVE: NA
---------------------------
The current error path of ext4_write_inline_data_end() is not correct.
First, it should pass out the error value if ext4_get_inode_loc()
fails, or else it could trigger an infinite loop if we inject an error
here. Second, it's better to add the inode to the orphan list if
ext4_journal_stop() fails, otherwise we could not restore the inline
xattr entry after a power failure. Finally, we need to reset the 'ret'
value if ext4_write_inline_data_end() returns success in
ext4_write_end() and ext4_journalled_write_end(), otherwise we could
not get the error return value of ext4_journal_stop().
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210716122024.1105856-3-yi.zhang@huawei.com
Conflicts:
fs/ext4/inode.c
Reviewed-by: Yang Erkun <yangerkun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Ni Xun <richardni@tencent.com>
upstream commit id: 4df031ff58
After commit 3da40c7b08 ("ext4: only call ext4_truncate when size <=
isize"), i_disksize could always be updated to i_size in ext4_setattr(),
and we can be sure that i_disksize <= i_size while holding the inode
lock, since if i_disksize < i_size there are delalloc writes pending in
the range up to i_size. If the end of the current write is <= i_size,
there's no need to touch i_disksize, since writeback will push
i_disksize up to i_size eventually. So we can switch to checking i_size
instead of i_disksize in ext4_da_write_end() when writing to the end of
the file. We can also remove ext4_mark_inode_dirty() because we defer
inode dirtying to generic_write_end() or
ext4_da_write_inline_data_end().
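The core of the change, sketched (simplified from ext4_da_write_end();
the surrounding inline-data handling is elided):

	loff_t new_i_size = pos + copied;

	/* Compare against i_size instead of i_disksize: if the write
	 * ends within i_size, writeback will push i_disksize up to
	 * i_size eventually, so nothing needs updating here. */
	if (copied && new_i_size > inode->i_size &&
	    ext4_da_should_update_i_disksize(page, end))
		ext4_update_i_disksize(inode, new_i_size);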
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210716122024.1105856-2-yi.zhang@huawei.com
Conflicts:
fs/ext4/inode.c
Reviewed-by: Yang Erkun <yangerkun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Ni Xun <richardni@tencent.com>
Reviewed-by: Alex Shi <alexsshi@tencent.com>
Reviewed-by: robinlai <robinlai@tencent.com>