OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Linus Torvalds	015b3155c4	Misc fixes: - Fix more generic entry code ABI fallout - Fix debug register handling bugs - Fix vmalloc mappings on 32-bit kernels - Fix kprobes instrumentation output on 32-bit kernels - Fix over-eager WARN_ON_ONCE() on !SMAP hardware - Fix NUMA debugging - Fix Clang related crash on !RETPOLINE kernels Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl9UljIRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1g1yA//VecoyJOw4jb43LdkeKDGtUjCsPVZlt4w fw55nT4taqqbgl9mQjrJQlh8thtk7LvAqcsrEGk/SH+1fp/hDvBG0i3etyI1mPJ2 t97MCVtD1bz2zyLpOtGN48tgiRxSazr4S9nZPCLTec+c75I3pmJssj44m/eJi/Z2 hoj/syiO4J0BPa7a1ou++Jeyag6J+PgXdJTOMyjuqi99vqai1aTVKo8GdWMInext +fJNYd0ZQRj1FxVdMusDfzxOk7N7b8nAzvd30iJN67R6QwoEazO12K1F4IYQmHSq 0rhHrwe0lTLtjmYdp/ef14kfzD7DRFN6Nv2gk/zyZsH+tjGflxTZConkFPnfoJEc 33cNHfigh0V9TSVNDDhHnkRyy6dzCHkYHEf33KFuX3amC236TgrCEL7+oWE2rcNp 9PJbPGlXCqNb2feNy2de4cY+KiZ2a1N/T4VcdMK6DEdENFh5T03EZgIChQEd0S99 LNBYHqTWJdQEKfkzfAXlR4Bd2hX1LWLMM6rNcXxInrH7rWDXUCS0X9m3gLZR9DIs 7/nXoK4OkaJdgH/D2CToDgwMNT5hlIiTGtVtB3H6Qz8eQQ4+fwTyboQDqpeG4Upy LfOH2h5Fo33FCgqnrua8IsgUKLwW2yJGdghJpcd9d0qfVUDEJuXGo6xe6SEHdSu/ VEiQtFUf50U= =EhRy -----END PGP SIGNATURE----- Merge tag 'x86-urgent-2020-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: - more generic entry code ABI fallout - debug register handling bugfixes - fix vmalloc mappings on 32-bit kernels - kprobes instrumentation output fix on 32-bit kernels - fix over-eager WARN_ON_ONCE() on !SMAP hardware - NUMA debugging fix - fix Clang related crash on !RETPOLINE kernels * tag 'x86-urgent-2020-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/entry: Unbreak 32bit fast syscall x86/debug: Allow a single level of #DB recursion x86/entry: Fix AC assertion tracing/kprobes, x86/ptrace: Fix regs argument order for i386 x86, fakenuma: Fix invalid starting node ID x86/mm/32: Bring back vmalloc faulting on x86_32 x86/cmdline: Disable jump tables for cmdline.c	2020-09-06 10:28:00 -07:00
Linus Torvalds	68beef5710	xen: branch for v5.9-rc4 -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCX1Rn1wAKCRCAXGG7T9hj vlEjAQC/KGC3wYw5TweWcY48xVzgvued3JLAQ6pcDlOe6osd6AEAzZcZKgL948cx oY0T98dxb/U+lUhbIzhpBr/30g8JbAQ= =Xcxp -----END PGP SIGNATURE----- Merge tag 'for-linus-5.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: "A small series for fixing a problem with Xen PVH guests when running as backends (e.g. as dom0). Mapping other guests' memory is now working via ZONE_DEVICE, thus not requiring to abuse the memory hotplug functionality for that purpose" * tag 'for-linus-5.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: add helpers to allocate unpopulated memory memremap: rename MEMORY_DEVICE_DEVDAX to MEMORY_DEVICE_GENERIC xen/balloon: add header guard	2020-09-06 09:59:27 -07:00
Hans de Goede	f8bd54d219	drm/i915: panel: Use atomic PWM API for devs with an external PWM controller Now that the PWM drivers which we use have been converted to the atomic PWM API, we can move the i915 panel code over to using the atomic PWM API. The removes a long standing FIXME and this removes a flicker where the backlight brightness would jump to 100% when i915 loads even if using the fastset path. Note that this commit also simplifies pwm_disable_backlight(), by dropping the intel_panel_actually_set_backlight(..., 0) call. This call sets the PWM to 0% duty-cycle. I believe that this call was only present as a workaround for a bug in the pwm-crc.c driver where it failed to clear the PWM_OUTPUT_ENABLE bit. This is fixed by an earlier patch in this series. After the dropping of this workaround, the usleep call, which seems unnecessary to begin with, has no useful effect anymore, so drop that too. Acked-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200903112337.4113-18-hdegoede@redhat.com	2020-09-06 15:53:37 +02:00
Hans de Goede	9a6ae5b354	drm/i915: panel: Honor the VBT PWM min setting for devs with an external PWM controller So far for devices using an external PWM controller (devices using pwm_setup_backlight()), we have been hardcoding the minimum allowed PWM level to 0. But several of these devices specify a non 0 minimum setting in their VBT. Change pwm_setup_backlight() to use get_backlight_min_vbt() to get the minimum level. Acked-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200903112337.4113-17-hdegoede@redhat.com	2020-09-06 15:53:24 +02:00
Hans de Goede	6b51e7d23a	drm/i915: panel: Honor the VBT PWM frequency for devs with an external PWM controller So far for devices using an external PWM controller (devices using pwm_setup_backlight()), we have been hardcoding the period-time passed to pwm_config() to 21333 ns. I suspect this was done because many VBTs set the PWM frequency to 200 which corresponds to a period-time of 5000000 ns, which greatly exceeds the PWM_MAX_PERIOD_NS define in the Crystal Cove PMIC PWM driver, which used to be 21333. This PWM_MAX_PERIOD_NS define was actually based on a bug in the PWM driver where its period and duty-cycle times where off by a factor of 256. Due to this bug the hardcoded CRC_PMIC_PWM_PERIOD_NS value of 21333 would result in the PWM driver using its divider of 128, which would result in a PWM output frequency of 6000000 Hz / 256 / 128 = 183 Hz. So actually pretty close to the default VBT value of 200 Hz. Now that this bug in the pwm-crc driver is fixed, we can actually use the VBT defined frequency. This is important because: a) With the pwm-crc driver fixed it will now translate the hardcoded CRC_PMIC_PWM_PERIOD_NS value of 21333 ns / 46 Khz to a PWM output frequency of 23 KHz (the max it can do). b) The pwm-lpss driver used on many models has always honored the 21333 ns / 46 Khz request Some panels do not like such high output frequencies. E.g. on a Terra Pad 1061 tablet, using the LPSS PWM controller, the backlight would go from off to max, when changing the sysfs backlight brightness value from 90-100%, anything under aprox. 90% would turn the backlight fully off. Honoring the VBT specified PWM frequency will also hopefully fix the various bug reports which we have received about users perceiving the backlight to flicker after a suspend/resume cycle. Acked-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200903112337.4113-16-hdegoede@redhat.com	2020-09-06 15:53:08 +02:00
Hans de Goede	27a79cbc17	drm/i915: panel: Add get_vbt_pwm_freq() helper Factor the code which checks and drm_dbg_kms-s the VBT PWM frequency out of get_backlight_max_vbt(). This is a preparation patch for honering the VBT PWM frequency for devices which use an external PWM controller (devices using pwm_setup_backlight()). Acked-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200903112337.4113-15-hdegoede@redhat.com	2020-09-06 15:38:05 +02:00
Hans de Goede	c86b155da7	pwm: crc: Implement get_state() method Implement the pwm_ops.get_state() method to complete the support for the new atomic PWM API. Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200903112337.4113-14-hdegoede@redhat.com	2020-09-06 15:38:05 +02:00
Hans de Goede	9fccec8219	pwm: crc: Implement apply() method to support the new atomic PWM API Replace the enable, disable and config pwm_ops with an apply op, to support the new atomic PWM API. Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200903112337.4113-13-hdegoede@redhat.com	2020-09-06 15:38:04 +02:00
Hans de Goede	6fdefe6089	pwm: crc: Enable/disable PWM output on enable/disable The pwm-crc code is using 2 different enable bits: 1. bit 7 of the PWM0_CLK_DIV (PWM_OUTPUT_ENABLE) 2. bit 0 of the BACKLIGHT_EN register So far we've kept the PWM_OUTPUT_ENABLE bit set when disabling the PWM, this commit makes crc_pwm_disable() clear it on disable and makes crc_pwm_enable() set it again on re-enable. Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200903112337.4113-12-hdegoede@redhat.com	2020-09-06 15:38:04 +02:00
Hans de Goede	6158231a84	pwm: crc: Fix period changes not having any effect The pwm-crc code is using 2 different enable bits: 1. bit 7 of the PWM0_CLK_DIV (PWM_OUTPUT_ENABLE) 2. bit 0 of the BACKLIGHT_EN register The BACKLIGHT_EN register at address 0x51 really controls a separate output-only GPIO which is earmarked to be used as output connected to the backlight-enable pin for LCD panels, this GPO is part of the PMIC's "Display Panel Control Block." . This pin should probably be moved over to a GPIO provider driver (and consumers modified accordingly), but that is something for an(other) patch. Enabling / disabling the actual PWM output is controlled by the PWM_OUTPUT_ENABLE bit of the PWM0_CLK_DIV register. As the comment in the old code already indicates we must disable the PWM before we can change the clock divider. But the crc_pwm_disable() and crc_pwm_enable() calls the old code make for this only change the BACKLIGHT_EN register; and the value of that register does not matter for changing the period / the divider. What does matter is that the PWM_OUTPUT_ENABLE bit must be cleared before a new value can be written. This commit modifies crc_pwm_config() to clear PWM_OUTPUT_ENABLE instead when changing the period, so that period changes actually work. Note this fix will cause a significant behavior change on some devices using the CRC PWM output to drive their backlight. Before the PWM would always run with the output frequency configured by the BIOS at boot, now the period time specified by the i915 driver will actually be honored. Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200903112337.4113-11-hdegoede@redhat.com	2020-09-06 15:38:03 +02:00
Hans de Goede	a05af71f0d	pwm: crc: Fix off-by-one error in the clock-divider calculations The CRC PWM controller has a clock-divider which divides the clock with a value between 1-128. But as can seen from the PWM_DIV_CLK_xxx defines, this range maps to a register value of 0-127. So after calculating the clock-divider we must subtract 1 to get the register value, unless the requested frequency was so high that the calculation has already resulted in a (rounded) divider value of 0. Note that before this fix, setting a period of PWM_MAX_PERIOD_NS which corresponds to the max. divider value of 128 could have resulted in a bug where the code would use 128 as divider-register value which would have resulted in an actual divider value of 0 (and the enable bit being set). A rounding error stopped this bug from actually happen. This same rounding error means that after the subtraction of 1 it is impossible to set the divider to 128. Also bump PWM_MAX_PERIOD_NS by 1 ns to allow setting a divider of 128 (register-value 127). Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200903112337.4113-10-hdegoede@redhat.com	2020-09-06 15:38:02 +02:00
Hans de Goede	79e0899275	pwm: crc: Fix period / duty_cycle times being off by a factor of 256 While looking into adding atomic-pwm support to the pwm-crc driver I noticed something odd, there is a PWM_BASE_CLK define of 6 MHz and there is a clock-divider which divides this with a value between 1-128, and there are 256 duty-cycle steps. The pwm-crc code before this commit assumed that a clock-divider setting of 1 means that the PWM output is running at 6 MHZ, if that is true, where do these 256 duty-cycle steps come from? This would require an internal frequency of 256 * 6 MHz = 1.5 GHz, that seems unlikely for a PMIC which is using a silicon process optimized for power-switching transistors. It is way more likely that there is an 8 bit counter for the duty cycle which acts as an extra fixed divider wrt the PWM output frequency. The main user of the pwm-crc driver is the i915 GPU driver which uses it for backlight control. Lets compare the PWM register values set by the video-BIOS (the GOP), assuming the extra fixed divider is present versus the PWM frequency specified in the Video-BIOS-Tables: Device: PWM Hz set by BIOS PWM Hz specified in VBT Asus T100TA 200 200 Asus T100HA 200 200 Lenovo Miix 2 8 23437 20000 Toshiba WT8-A 23437 20000 So as we can see if we assume the extra division by 256 then the register values set by the GOP are an exact match for the VBT values, where as otherwise the values would be of by a factor of 256. This commit fixes the period / duty_cycle calculations to take the extra division by 256 into account. Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200903112337.4113-9-hdegoede@redhat.com	2020-09-06 15:38:02 +02:00
Hans de Goede	547d9e9261	pwm: lpss: Remove suspend/resume handlers PWM controller drivers should not restore the PWM state on resume. The convention is that PWM consumers do this by calling pwm_apply_state(), so that it can be done at the exact moment when the consumer needs the state to be stored, avoiding e.g. backlight flickering. The only in kernel consumers of the pwm-lpss code, the i915 driver and the pwm-class sysfs interface code both correctly restore the state on resume, so there is no need to do this in the pwm-lpss code. More-over the removed resume handler is buggy, since it blindly restores the ctrl-register contents without setting the update bit, which is necessary to get the controller to actually use/apply the restored base-unit and on-time-div values. Acked-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200903112337.4113-8-hdegoede@redhat.com	2020-09-06 15:38:01 +02:00
Hans de Goede	d6d54bacb1	pwm: lpss: Make pwm_lpss_apply() not rely on existing hardware state Before this commit pwm_lpss_apply() was assuming 2 pre-conditions were met by the existing hardware state: 1. That the base-unit and on-time-div read back from the control register are those actually in use, so that it can skip setting the update bit if the read-back value matches the desired values. 2. That the controller is enabled when the cached pwm_state.enabled says that the controller is enabled. As the long history of fixes for subtle (often suspend/resume) lpss-pwm issues shows, these assumptions are not necessary always true. 1. Specifically is not true on some () Cherry Trail devices with a nasty GFX0._PS3 method which: a. saves the ctrl reg value. b. sets the base-unit to 0 and writes the update bit to apply/commit c. restores the original ctrl value without setting the update bit, so that the 0 base-unit value is still in use. 2. Assumption 2. currently is true, but only because of the code which saves/restores the state on suspend/resume. By convention restoring the PWM state should be done by the PWM consumer and the presence of this code in the pmw-lpss driver is a bug. Therefor the save/restore code will be dropped in the next patch in this series, after which this assumption also is no longer true. This commit changes the pwm_lpss_apply() to not make any assumptions about the state the hardware is in. Instead it makes pwm_lpss_apply() always fully program the PWM controller, making it much less fragile. ) Seen on the Acer One 10 S1003, Lenovo Ideapad Miix 310 and 320 models and various Medion models. Acked-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200903112337.4113-7-hdegoede@redhat.com	2020-09-06 15:38:00 +02:00
Hans de Goede	092d83e3f5	pwm: lpss: Add pwm_lpss_prepare_enable() helper In the not-enabled -> enabled path pwm_lpss_apply() needs to get a runtime-pm reference; and then on any errors it needs to release it again. This leads to somewhat hard to read code. This commit introduces a new pwm_lpss_prepare_enable() helper and moves all the steps necessary for the not-enabled -> enabled transition there, so that we can error check the entire transition in a single place and only have one pm_runtime_put() on failure call site. While working on this I noticed that the enabled -> enabled (update settings) path was quite similar, so I've added an enable parameter to the new pwm_lpss_prepare_enable() helper, which allows using it in that path too. Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200903112337.4113-6-hdegoede@redhat.com	2020-09-06 15:38:00 +02:00
Hans de Goede	ef9f60daab	pwm: lpss: Add range limit check for the base_unit register value When the user requests a high enough period ns value, then the calculations in pwm_lpss_prepare() might result in a base_unit value of 0. But according to the data-sheet the way the PWM controller works is that each input clock-cycle the base_unit gets added to a N bit counter and that counter overflowing determines the PWM output frequency. Adding 0 to the counter is a no-op. The data-sheet even explicitly states that writing 0 to the base_unit bits will result in the PWM outputting a continuous 0 signal. When the user requestes a low enough period ns value, then the calculations in pwm_lpss_prepare() might result in a base_unit value which is bigger then base_unit_range - 1. Currently the codes for this deals with this by applying a mask: base_unit &= (base_unit_range - 1); But this means that we let the value overflow the range, we throw away the higher bits and store whatever value is left in the lower bits into the register leading to a random output frequency, rather then clamping the output frequency to the highest frequency which the hardware can do. This commit fixes both issues by clamping the base_unit value to be between 1 and (base_unit_range - 1). Fixes: `684309e504` ("pwm: lpss: Avoid potential overflow of base_unit") Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200903112337.4113-5-hdegoede@redhat.com	2020-09-06 15:37:59 +02:00
Hans de Goede	181f4d2f44	pwm: lpss: Fix off by one error in base_unit math in pwm_lpss_prepare() According to the data-sheet the way the PWM controller works is that each input clock-cycle the base_unit gets added to a N bit counter and that counter overflowing determines the PWM output frequency. So assuming e.g. a 16 bit counter this means that if base_unit is set to 1, after 65535 input clock-cycles the counter has been increased from 0 to 65535 and it will overflow on the next cycle, so it will overflow after every 65536 clock cycles and thus the calculations done in pwm_lpss_prepare() should use 65536 and not 65535. This commit fixes this. Note this also aligns the calculations in pwm_lpss_prepare() with those in pwm_lpss_get_state(). Note this effectively reverts commit `684309e504` ("pwm: lpss: Avoid potential overflow of base_unit"). The next patch in this series really fixes the potential overflow of the base_unit value. Fixes: `684309e504` ("pwm: lpss: Avoid potential overflow of base_unit") Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200903112337.4113-4-hdegoede@redhat.com	2020-09-06 15:37:58 +02:00
Hans de Goede	15aa5e4c43	ACPI / LPSS: Save Cherry Trail PWM ctx registers only once (at activation) The DSDTs on most Cherry Trail devices have an ugly clutch where the PWM controller gets turned off from the _PS3 method of the graphics-card dev: Method (_PS3, 0, Serialized) // _PS3: Power State 3 { ... PWMB = PWMC /* \_SB_.PCI0.GFX0.PWMC / PSAT \|= 0x03 Local0 = PSAT / \_SB_.PCI0.GFX0.PSAT */ ... } Where PSAT is the power-status register of the PWM controller. Since the i915 driver will do a pwm_get on the pwm device as it uses it to control the LCD panel backlight, there is a device-link marking the i915 device as a consumer of the pwm device. So that the PWM controller will always be suspended after the i915 driver suspends (which is the right thing to do). This causes the above GFX0 PS3 AML code to run before acpi_lpss.c calls acpi_lpss_save_ctx(). So on these devices the PWM controller will already be off when acpi_lpss_save_ctx() runs. This causes it to read/save all 1-s (0xffffffff) as ctx register values. When these bogus values get restored on resume the PWM controller actually keeps working, since most bits are reserved, but this does set bit 3 of the LPSS General purpose register, which for the PWM controller has the following function: "This bit is re-used to support 32kHz slow mode. Default is 19.2MHz as PWM source clock". This causes the clock of the PWM controller to switch from 19.2MHz to 32KHz, which is a slow-down of a factor 600. Surprisingly enough so far there have been few bug reports about this. This is likely because the i915 driver was hardcoding the PWM frequency to 46 KHz, which divided by 600 would result in a PWM frequency of approx. 78 Hz, which mostly still works fine. There are some bug reports about the LCD backlight flickering after suspend/resume which are likely caused by this issue. But with the upcoming patch-series to finally switch the i915 drivers code for external PWM controllers to use the atomic API and to honor the PWM frequency specified in the video BIOS (VBT), this becomes a much bigger problem. On most cases the VBT specifies either 200 Hz or 20 KHz as PWM frequency, which with the mentioned issue ends up being either 1/3 Hz, where the backlight actually visible blinks on and off every 3s, or in 33 Hz and horrible flickering of the backlight. There are a number of possible solutions to this problem: 1. Make acpi_lpss_save_ctx() run before GFX0._PS3 Pro: Clean solution from pov of not medling with save/restore ctx code Con: As mentioned the current ordering is the right thing to do Con: Requires assymmetry in at what suspend/resume phase we do the save vs restore, requiring more suspend/resume ordering hacks in already convoluted acpi_lpss.c suspend/resume code. 2. Do some sort of save once mode for the LPSS ctx Pro: Reasonably clean Con: Needs a new LPSS flag + code changes to handle the flag 3. Detect we have failed to save the ctx registers and do not restore them Pro: Not PWM specific, might help with issues on other LPSS devices too Con: If we can get away with not restoring the ctx why bother with it at all? 4. Do not save the ctx for CHT PWM controllers Pro: Clean, as simple as dropping a flag? Con: Not so simple as dropping a flag, needs a new flag to ensure that we still do lpss_deassert_reset() on device activation. 5. Make the pwm-lpss code fixup the LPSS-context registers Pro: Keeps acpi_lpss.c code clean Con: Moves knowledge of LPSS-context into the pwm-lpss.c code 1 and 5 both do not seem to be a desirable way forward. 3 and 4 seem ok, but they both assume that restoring the LPSS-context registers is not necessary. I have done a couple of test and those do show that restoring the LPSS-context indeed does not seem to be necessary on devices using s2idle suspend (and successfully reaching S0i3). But I have no hardware to test deep / S3 suspend. So I'm not sure that not restoring the context is safe. That leaves solution 2, which is about as simple / clean as 3 and 4, so this commit fixes the described problem by implementing a new LPSS_SAVE_CTX_ONCE flag and setting that for the CHT PWM controllers. Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200903112337.4113-3-hdegoede@redhat.com	2020-09-06 15:37:58 +02:00
Hans de Goede	5e31ee84c0	ACPI / LPSS: Resume Cherry Trail PWM controller in no-irq phase The DSDTs on most Cherry Trail devices have an ugly clutch where the PWM controller gets poked from the _PS0 method of the graphics-card device: Local0 = PSAT /* \_SB_.PCI0.GFX0.PSAT / If (((Local0 & 0x03) == 0x03)) { PSAT &= 0xFFFFFFFC Local1 = PSAT / \_SB_.PCI0.GFX0.PSAT / RSTA = Zero RSTF = Zero RSTA = One RSTF = One PWMB \|= 0xC0000000 PWMC = PWMB / \_SB_.PCI0.GFX0.PWMB */ } Where PSAT is the power-status register of the PWM controller, so if it is in D3 when the GFX0 device's PS0 method runs then it will turn it on and restore the PWM ctrl register value it saved from its PS3 handler. Note not only does it restore it, it ors it with 0xC0000000 turning it on at a time where we may not want it to get turned on at all. The pwm_get call which the i915 driver does to get a reference to the PWM controller, already adds a device-link making the GFX0 device a consumer of the PWM device. So it should already have been resumed when the above AML runs and the AML should thus not do its undesirable poking of the PWM controller register. But the PCI core powers on PCI devices in the no-irq resume phase and thus calls the troublesome PS0 method in the no-irq resume phase. Where as LPSS devices by default are resumed in the early resume phase. This commit sets the resume_from_noirq flag in the bsw_pwm_dev_desc struct, so that Cherry Trail PWM controllers will be resumed in the no-irq phase. Together with the device-link added by the pwm-get this ensures that the PWM controller will be on when the troublesome PS0 method runs, which stops it from poking the PWM controller. Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200903112337.4113-2-hdegoede@redhat.com	2020-09-06 15:37:57 +02:00
Pavel Begunkov	c127a2a1b7	io_uring: fix linked deferred ->files cancellation While looking for ->files in ->defer_list, consider that requests there may actually be links. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-09-05 16:02:42 -06:00
Pavel Begunkov	b7ddce3cbf	io_uring: fix cancel of deferred reqs with ->files While trying to cancel requests with ->files, it also should look for requests in ->defer_list, otherwise it might end up hanging a thread. Cancel all requests in ->defer_list up to the last request there with matching ->files, that's needed to follow drain ordering semantics. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-09-05 15:59:51 -06:00
Linus Torvalds	dd9fb9bb33	A trivial patch for auxdisplay: - Replace HTTP links with HTTPS ones (Alexander A. Klimov) -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAl9TdpwACgkQGXyLc2ht IW0MMxAAkh9gkR1ygJik6Ii9t63jbpG/NG3jJjlxCllh6OZrJ64SJBM7sF80NC1Y MAq+j16yPcyvfI1M378hfwD71c6OYuwfH4z1By8WbAVQ0d7S/Qd9UJMhn8f6U/aA /KLwvq/+YiCT5328MAZQHmmpsGqKBVnARMAf65fNviiYDwjeSd+9G9RtFF2KD2Q6 m+OsOqAr7WvhqZ/1lMOuAXxkuee11WbEi/ZBijJl8ZGglld3eQcW1+aMKHsD/W/Q QXEi5QhdeLv60tu3HSQu6QyNYSPArDv7nZt1Yrxu05Sb+sryMzXgtuy03q6c0iUm zZmFDijcfneWJ5mJda+9301wJOC1NsQnEhJBi0RgjquRw8ydB175rKAzGBIKzaWp toQ9cN/LG/dJU/a4o1lrugIdqv7FJHUdvW2fsaL2twy/ly0sV8yv+dM/e3kItkv4 hzCvGe9RKW2VF2JwhCFDtEkknXkwnaOX+yYB5si1NYSvODvPrln/MADRIB7TONw+ eI8C9B52JQ+66YDb/Yg3OBhncobufYCOgs30taRv7q3sBd1P394RT598mW8JTRIN CAWKYsBfE4aTSL0PZeSfbQrNYxj+BxMrDjg9VQ9lGVzH4lw2vJZ51nSLtxgyuSr6 fwSC6R3Y3WxKKj5M2E0tAzDF0C0V/kVvDzUetFgTHSJN/Kuht6o= =w1IL -----END PGP SIGNATURE----- mergetag object `4e4bb89446` type commit tag clang-format-for-linus-v5.9-rc4 tagger Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> 1599306139 +0200 The usual trivial update: - Update with the latest for_each macro list (Miguel Ojeda) -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAl9TecAACgkQGXyLc2ht IW0D6A/9Guy8Q+fQoZ+jVynb7zRklfq8+D7rp4LrI+TXshGE1kUWzpwrPQX+QDcD H1wce3TMteZxOJe1KiWF5UwsFD2sGKpI1O8NJLn8Af9oImbIwhoJKxtdT0NOJ0cI 08bzCDfonBAMn5DA2abDkmuRpkeyRXn1AN2mHcxnZ33lTZttWf/SCyrrynhHP4Co AAyeOuJtD5duMFvVtnXU/3aRGUinZia6roVG9wDbWoWYD50H3OeVecVRbnM0Vonf DEUbU2yjQRKVHd8P/mhf8tRhiVCnAt6b+c4wwgzgpRcFhu69b0fjJPmZETRrt4Dc aQZT4ltY9H0R6Wz82Pt8peZieZEvq8cgX+YgCkpLuHNiv8+/Fce+Nllpaokg8lYb rirS+u4t5zX3AaVdKk1riPaYinAOuC5PirGr8JDap4T7kg2Zw+xUSSZgH0sqTK0N /UwvoqZK5wOnwotsn16vhQvxOtzbdgh4lkkkkBPbXMUXtN8PfkHiZ/4S5fSDdrTu hz2F4nHkZlvhkZYKd7tp8UytnN1XEJQF70Z17LTEnVyVTjSoAA2LfSoJ0NCIwdhe A12/H7DB1FlPjcoxdjJka5teoGnlV5JkSAklp6Q+FlC/Roytxq5hEpmm77YShW4a /KuaKlohInKCqA6/ZJq3H1jzz+dT14OHekwdSXk5ZsaMkXejEMA= =Lcl0 -----END PGP SIGNATURE----- mergetag object `e5fc436f06` type commit tag compiler-attributes-for-linus-v5.9-rc4 tagger Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> 1599306822 +0200 Two trivial changes and a fix for sparse: - sparse: use static inline for __chk_{user,io}_ptr() (Luc Van Oostenryck) - Compiler Attributes: fix comment concerning GCC 4.6 (Luc Van Oostenryck) - Compiler Attributes: remove comment about sparse not supporting __has_attribute (Luc Van Oostenryck) -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAl9TfFAACgkQGXyLc2ht IW0vfQ//dtC2zaqJo3L0RMyLqAWwJbbML1eyy7tT8qPnw635arJXSqyjPliVoRwA 3O3iQOjnss/vfL5bs9OSFEN5yjALP4wm9YZl2EponWZ8WiIKfC/YPSSH2iM2m2kW AK6bG6HPfoSy3Kx04CuxeqGN2F9/hZGLneLfF+y6gdfeAVBRTf1o+SvuCV/phLyW ek2UQK00+00BOVa62ZLZzqCKCHjTBSXtjDrrwg39g4MJ+gsGcUDoF13InRdgSWTY rgUD09OqOTjwaB2GeTiV7mXNn0yydyH0IoXz1EsfqRJV5mik3LMEpo/RWTBX95K1 82N2mrL4ejOPX8gj+iKvcDjC8b1Fiwnb4pw6SsAOTJ5KG/euMVV2ZqAIS4SVVvSx VcGdErNcDBFrpM2py6rM3UhHBOafdlOYAtLKb596hMh4H8Kjt3LFIZAuyLVAJQfz q3xuBhkKSPw/j3ZVJ4iBFJSMIkxKRLUkOtF4dxAUknHOuYktnsIH0T/NcN0sHUqY c/3bcJFG6jQhzrA3DLlwcKWcgeihEjx5JpJFreN97heSEWsn7G1GAMMWUJ1y4KqJ kheczTljl5RhxUqUH2WBMj4MgdmUhwtXqW6Ht9HqP6nKDDotUjinjrSGOP5alej7 3JYNz11c2pWwQNIzItHspmSnaiSUXbcn7WJoRTrO09kcPfsMw7E= =RrSY -----END PGP SIGNATURE----- Merge tags 'auxdisplay-for-linus-v5.9-rc4', 'clang-format-for-linus-v5.9-rc4' and 'compiler-attributes-for-linus-v5.9-rc4' of git://github.com/ojeda/linux Pull misc fixes from Miguel Ojeda: "A trivial patch for auxdisplay: - Replace HTTP links with HTTPS ones (Alexander A. Klimov) The usual clang-format trivial update: - Update with the latest for_each macro list (Miguel Ojeda) And Luc requested me to pick a sparse fix on my queue, so here it goes along with other two trivial Compiler Attributes ones (also from Luc). - sparse: use static inline for __chk_{user,io}_ptr() (Luc Van Oostenryck) - Compiler Attributes: fix comment concerning GCC 4.6 (Luc Van Oostenryck) - Compiler Attributes: remove comment about sparse not supporting __has_attribute (Luc Van Oostenryck)" * tag 'auxdisplay-for-linus-v5.9-rc4' of git://github.com/ojeda/linux: auxdisplay: Replace HTTP links with HTTPS ones * tag 'clang-format-for-linus-v5.9-rc4' of git://github.com/ojeda/linux: clang-format: Update with the latest for_each macro list * tag 'compiler-attributes-for-linus-v5.9-rc4' of git://github.com/ojeda/linux: sparse: use static inline for __chk_{user,io}_ptr() Compiler Attributes: fix comment concerning GCC 4.6 Compiler Attributes: remove comment about sparse not supporting __has_attribute	2020-09-05 14:22:46 -07:00
Linus Torvalds	70187f7727	ARC fixes for 5.9-rc4 - HSDK-4xd Dev system: perf driver updates for sampling interrupt - HSDK* Dev System : Ethernet broken [Evgeniy Didin] - HIGHMEM broken (2 memory banks) [Mike Rapoport] - show_regs() rewrite once and for all - Other minor fixes -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEOXpuCuR6hedrdLCJadfx3eKKwl4FAl9S5E4ACgkQadfx3eKK wl7y5xAAyExBJlAWH8tyJjKYnRnKUwP5xZxzSLt9k5sUF8+aVMBpSr6PCF9kf1ZP HTFJ3T9NLFOh4cI7Lk9QFkLXbOvuvjgkfqlSmbHjnlTYcuWCqCQfU93bLMhRvrvi oRnWzEXLXwRuYoYz4TVLzxw9SuQr/EtCvR+a4rAyMpUlg+9pWHTokpiv69nDeYSl t3ckCVkdKbqPsUW7fXclyzo5qtsdGgLhYh6pb8ZE/x/Qdk+IjaN4WEmjC5DCy10B VTKEKdK7VKfRGuhIJbPq/UDs0TwMMxic+5sCdQtM54vrQPuVTmpsUopOHQ4/2Y5D IVWsZ3Uy/jnZxS32FP4d4r4Rv+BT9r1c7o148glT9iI0poOZ13BZHNZbLuSxYi4l 1UQ2r/o+E+XKOWzAOc1WIgEJGdxcDu9seJNSSmxs1kBveOIg8S8JjoDyS6ZyTWD+ fbOttxNe3kqGWCdKkgtUGqrg2w0qr3kzJd5aF4ju4fE+KMSVQRylRSWBpM1hcovP rxhcVvxyd1xSS3krrXjBTsjgTlkPDAaLyku/G5VuwpG91LyTiWfSlg+IOfLO+zFm LAwMTlV0gXfV2GH/AlEDJXtvT7GPhXN9sqmqqqOpUmOcKbPE30RmUkUxTJJbGfPN s7ioUV8CJVzRAkxyo79IDQQjT20p13cbJrIeCfgaSKWBf1M64BU= =vOZp -----END PGP SIGNATURE----- Merge tag 'arc-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fixes from Vineet Gupta: - HSDK-4xd Dev system: perf driver updates for sampling interrupt - HSDK* Dev System: Ethernet broken [Evgeniy Didin] - HIGHMEM broken (2 memory banks) [Mike Rapoport] - show_regs() rewrite once and for all - Other minor fixes * tag 'arc-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: [plat-hsdk]: Switch ethernet phy-mode to rgmii-id arc: fix memory initialization for systems with two memory banks irqchip/eznps: Fix build error for !ARC700 builds ARC: show_regs: fix r12 printing and simplify ARC: HSDK: wireup perf irq ARC: perf: don't bail setup if pct irq missing in device-tree ARC: pgalloc.h: delete a duplicated word + other fixes	2020-09-05 13:46:14 -07:00
Linus Torvalds	7514c0362f	Merge branch 'akpm' (patches from Andrew) Merge misc fixes from Andrew Morton: "19 patches. Subsystems affected by this patch series: MAINTAINERS, ipc, fork, checkpatch, lib, and mm (memcg, slub, pagemap, madvise, migration, hugetlb)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: include/linux/log2.h: add missing () around n in roundup_pow_of_two() mm/khugepaged.c: fix khugepaged's request size in collapse_file mm/hugetlb: fix a race between hugetlb sysctl handlers mm/hugetlb: try preferred node first when alloc gigantic page from cma mm/migrate: preserve soft dirty in remove_migration_pte() mm/migrate: remove unnecessary is_zone_device_page() check mm/rmap: fixup copying of soft dirty and uffd ptes mm/migrate: fixup setting UFFD_WP flag mm: madvise: fix vma user-after-free checkpatch: fix the usage of capture group ( ... ) fork: adjust sysctl_max_threads definition to match prototype ipc: adjust proc_ipc_sem_dointvec definition to match prototype mm: track page table modifications in __apply_to_page_range() MAINTAINERS: IA64: mark Status as Odd Fixes only MAINTAINERS: add LLVM maintainers MAINTAINERS: update Cavium/Marvell entries mm: slub: fix conversion of freelist_corrupted() mm: memcg: fix memcg reclaim soft lockup memcg: fix use-after-free in uncharge_batch	2020-09-05 13:28:40 -07:00
Jason Gunthorpe	428fc0aff4	include/linux/log2.h: add missing () around n in roundup_pow_of_two() Otherwise gcc generates warnings if the expression is complicated. Fixes: `312a0c1709` ("[PATCH] LOG2: Alter roundup_pow_of_two() so that it can use a ilog2() on a constant") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/0-v1-8a2697e3c003+41165-log_brackets_jgg@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-09-05 12:14:30 -07:00
David Howells	e5a59d308f	mm/khugepaged.c: fix khugepaged's request size in collapse_file collapse_file() in khugepaged passes PAGE_SIZE as the number of pages to be read to page_cache_sync_readahead(). The intent was probably to read a single page. Fix it to use the number of pages to the end of the window instead. Fixes: `99cb0dbd47` ("mm,thp: add read-only THP support for (non-shmem) FS") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Yang Shi <shy828301@gmail.com> Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Eric Biggers <ebiggers@google.com> Link: https://lkml.kernel.org/r/20200903140844.14194-2-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-09-05 12:14:30 -07:00
Muchun Song	17743798d8	mm/hugetlb: fix a race between hugetlb sysctl handlers There is a race between the assignment of `table->data` and write value to the pointer of `table->data` in the __do_proc_doulongvec_minmax() on the other thread. CPU0: CPU1: proc_sys_write hugetlb_sysctl_handler proc_sys_call_handler hugetlb_sysctl_handler_common hugetlb_sysctl_handler table->data = &tmp; hugetlb_sysctl_handler_common table->data = &tmp; proc_doulongvec_minmax do_proc_doulongvec_minmax sysctl_head_finish __do_proc_doulongvec_minmax unuse_table i = table->data; *i = val; // corrupt CPU1's stack Fix this by duplicating the `table`, and only update the duplicate of it. And introduce a helper of proc_hugetlb_doulongvec_minmax() to simplify the code. The following oops was seen: BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor instruction fetch in kernel mode #PF: error_code(0x0010) - not-present page Code: Bad RIP value. ... Call Trace: ? set_max_huge_pages+0x3da/0x4f0 ? alloc_pool_huge_page+0x150/0x150 ? proc_doulongvec_minmax+0x46/0x60 ? hugetlb_sysctl_handler_common+0x1c7/0x200 ? nr_hugepages_store+0x20/0x20 ? copy_fd_bitmaps+0x170/0x170 ? hugetlb_sysctl_handler+0x1e/0x20 ? proc_sys_call_handler+0x2f1/0x300 ? unregister_sysctl_table+0xb0/0xb0 ? __fd_install+0x78/0x100 ? proc_sys_write+0x14/0x20 ? __vfs_write+0x4d/0x90 ? vfs_write+0xef/0x240 ? ksys_write+0xc0/0x160 ? __ia32_sys_read+0x50/0x50 ? __close_fd+0x129/0x150 ? __x64_sys_write+0x43/0x50 ? do_syscall_64+0x6c/0x200 ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: `e5ff215941` ("hugetlb: multiple hstates for multiple page sizes") Signed-off-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/20200828031146.43035-1-songmuchun@bytedance.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-09-05 12:14:30 -07:00
Li Xinhai	953f064aa6	mm/hugetlb: try preferred node first when alloc gigantic page from cma Since commit `cf11e85fc0` ("mm: hugetlb: optionally allocate gigantic hugepages using cma"), the gigantic page would be allocated from node which is not the preferred node, although there are pages available from that node. The reason is that the nid parameter has been ignored in alloc_gigantic_page(). Besides, the __GFP_THISNODE also need be checked if user required to alloc only from the preferred node. After this patch, the preferred node is tried first before other allowed nodes, and don't try to allocate from other nodes if __GFP_THISNODE is specified. If user don't specify the preferred node, the current node will be used as preferred node, which makes sure consistent behavior of allocating gigantic and non-gigantic hugetlb page. Fixes: `cf11e85fc0` ("mm: hugetlb: optionally allocate gigantic hugepages using cma") Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Roman Gushchin <guro@fb.com> Link: https://lkml.kernel.org/r/20200902025016.697260-1-lixinhai.lxh@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-09-05 12:14:30 -07:00
Ralph Campbell	3d321bf82c	mm/migrate: preserve soft dirty in remove_migration_pte() The code to remove a migration PTE and replace it with a device private PTE was not copying the soft dirty bit from the migration entry. This could lead to page contents not being marked dirty when faulting the page back from device private memory. Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Bharata B Rao <bharata@linux.ibm.com> Link: https://lkml.kernel.org/r/20200831212222.22409-3-rcampbell@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-09-05 12:14:30 -07:00
Ralph Campbell	6128763fc3	mm/migrate: remove unnecessary is_zone_device_page() check Patch series "mm/migrate: preserve soft dirty in remove_migration_pte()". I happened to notice this from code inspection after seeing Alistair Popple's patch ("mm/rmap: Fixup copying of soft dirty and uffd ptes"). This patch (of 2): The check for is_zone_device_page() and is_device_private_page() is unnecessary since the latter is sufficient to determine if the page is a device private page. Simplify the code for easier reading. Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Bharata B Rao <bharata@linux.ibm.com> Link: https://lkml.kernel.org/r/20200831212222.22409-1-rcampbell@nvidia.com Link: https://lkml.kernel.org/r/20200831212222.22409-2-rcampbell@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-09-05 12:14:30 -07:00
Alistair Popple	ad7df764b7	mm/rmap: fixup copying of soft dirty and uffd ptes During memory migration a pte is temporarily replaced with a migration swap pte. Some pte bits from the existing mapping such as the soft-dirty and uffd write-protect bits are preserved by copying these to the temporary migration swap pte. However these bits are not stored at the same location for swap and non-swap ptes. Therefore testing these bits requires using the appropriate helper function for the given pte type. Unfortunately several code locations were found where the wrong helper function is being used to test soft_dirty and uffd_wp bits which leads to them getting incorrectly set or cleared during page-migration. Fix these by using the correct tests based on pte type. Fixes: `a5430dda8a` ("mm/migrate: support un-addressable ZONE_DEVICE page in migration") Fixes: `8c3328f1f3` ("mm/migrate: migrate_vma() unmap page from vma while collecting pages") Fixes: `f45ec5ff16` ("userfaultfd: wp: support swap and page migration") Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Alistair Popple <alistair@popple.id.au> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200825064232.10023-2-alistair@popple.id.au Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-09-05 12:14:30 -07:00
Alistair Popple	ebdf8321ee	mm/migrate: fixup setting UFFD_WP flag Commit `f45ec5ff16` ("userfaultfd: wp: support swap and page migration") introduced support for tracking the uffd wp bit during page migration. However the non-swap PTE variant was used to set the flag for zone device private pages which are a type of swap page. This leads to corruption of the swap offset if the original PTE has the uffd_wp flag set. Fixes: `f45ec5ff16` ("userfaultfd: wp: support swap and page migration") Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Link: https://lkml.kernel.org/r/20200825064232.10023-1-alistair@popple.id.au Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-09-05 12:14:30 -07:00
Yang Shi	7867fd7cc4	mm: madvise: fix vma user-after-free The syzbot reported the below use-after-free: BUG: KASAN: use-after-free in madvise_willneed mm/madvise.c:293 [inline] BUG: KASAN: use-after-free in madvise_vma mm/madvise.c:942 [inline] BUG: KASAN: use-after-free in do_madvise.part.0+0x1c8b/0x1cf0 mm/madvise.c:1145 Read of size 8 at addr ffff8880a6163eb0 by task syz-executor.0/9996 CPU: 0 PID: 9996 Comm: syz-executor.0 Not tainted 5.9.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x18f/0x20d lib/dump_stack.c:118 print_address_description.constprop.0.cold+0xae/0x497 mm/kasan/report.c:383 __kasan_report mm/kasan/report.c:513 [inline] kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530 madvise_willneed mm/madvise.c:293 [inline] madvise_vma mm/madvise.c:942 [inline] do_madvise.part.0+0x1c8b/0x1cf0 mm/madvise.c:1145 do_madvise mm/madvise.c:1169 [inline] __do_sys_madvise mm/madvise.c:1171 [inline] __se_sys_madvise mm/madvise.c:1169 [inline] __x64_sys_madvise+0xd9/0x110 mm/madvise.c:1169 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Allocated by task 9992: kmem_cache_alloc+0x138/0x3a0 mm/slab.c:3482 vm_area_alloc+0x1c/0x110 kernel/fork.c:347 mmap_region+0x8e5/0x1780 mm/mmap.c:1743 do_mmap+0xcf9/0x11d0 mm/mmap.c:1545 vm_mmap_pgoff+0x195/0x200 mm/util.c:506 ksys_mmap_pgoff+0x43a/0x560 mm/mmap.c:1596 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Freed by task 9992: kmem_cache_free.part.0+0x67/0x1f0 mm/slab.c:3693 remove_vma+0x132/0x170 mm/mmap.c:184 remove_vma_list mm/mmap.c:2613 [inline] __do_munmap+0x743/0x1170 mm/mmap.c:2869 do_munmap mm/mmap.c:2877 [inline] mmap_region+0x257/0x1780 mm/mmap.c:1716 do_mmap+0xcf9/0x11d0 mm/mmap.c:1545 vm_mmap_pgoff+0x195/0x200 mm/util.c:506 ksys_mmap_pgoff+0x43a/0x560 mm/mmap.c:1596 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 It is because vma is accessed after releasing mmap_lock, but someone else acquired the mmap_lock and the vma is gone. Releasing mmap_lock after accessing vma should fix the problem. Fixes: `692fe62433` ("mm: Handle MADV_WILLNEED through vfs_fadvise()") Reported-by: syzbot+b90df26038d1d5d85c97@syzkaller.appspotmail.com Signed-off-by: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: <stable@vger.kernel.org> [5.4+] Link: https://lkml.kernel.org/r/20200816141204.162624-1-shy828301@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-09-05 12:14:29 -07:00
Mrinal Pandey	13e45417ce	checkpatch: fix the usage of capture group ( ... ) The usage of "capture group (...)" in the immediate condition after `&&` results in `$1` being uninitialized. This issues a warning "Use of uninitialized value $1 in regexp compilation at ./scripts/checkpatch.pl line 2638". I noticed this bug while running checkpatch on the set of commits from v5.7 to v5.8-rc1 of the kernel on the commits with a diff content in their commit message. This bug was introduced in the script by commit `e518e9a59e` ("checkpatch: emit an error when there's a diff in a changelog"). It has been in the script since then. The author intended to store the match made by capture group in variable `$1`. This should have contained the name of the file as `[\w/]+` matched. However, this couldn't be accomplished due to usage of capture group and `$1` in the same regular expression. Fix this by placing the capture group in the condition before `&&`. Thus, `$1` can be initialized to the text that capture group matches thereby setting it to the desired and required value. Fixes: `e518e9a59e` ("checkpatch: emit an error when there's a diff in a changelog") Signed-off-by: Mrinal Pandey <mrinalmni@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Reviewed-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Cc: Joe Perches <joe@perches.com> Link: https://lkml.kernel.org/r/20200714032352.f476hanaj2dlmiot@mrinalpandey Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-09-05 12:14:29 -07:00
Tobias Klauser	b0daa2c73f	fork: adjust sysctl_max_threads definition to match prototype Commit `32927393dc` ("sysctl: pass kernel pointers to ->proc_handler") changed ctl_table.proc_handler to take a kernel pointer. Adjust the definition of sysctl_max_threads to match its prototype in linux/sysctl.h which fixes the following sparse error/warning: kernel/fork.c:3050:47: warning: incorrect type in argument 3 (different address spaces) kernel/fork.c:3050:47: expected void * kernel/fork.c:3050:47: got void [noderef] __user *buffer kernel/fork.c:3036:5: error: symbol 'sysctl_max_threads' redeclared with different type (incompatible argument 3 (different address spaces)): kernel/fork.c:3036:5: int extern [addressable] [signed] [toplevel] sysctl_max_threads( ... ) kernel/fork.c: note: in included file (through include/linux/key.h, include/linux/cred.h, include/linux/sched/signal.h, include/linux/sched/cputime.h): include/linux/sysctl.h:242:5: note: previously declared as: include/linux/sysctl.h:242:5: int extern [addressable] [signed] [toplevel] sysctl_max_threads( ... ) Fixes: `32927393dc` ("sysctl: pass kernel pointers to ->proc_handler") Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Link: https://lkml.kernel.org/r/20200825093647.24263-1-tklauser@distanz.ch Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-09-05 12:14:29 -07:00
Tobias Klauser	fff1662cc4	ipc: adjust proc_ipc_sem_dointvec definition to match prototype Commit `32927393dc` ("sysctl: pass kernel pointers to ->proc_handler") changed ctl_table.proc_handler to take a kernel pointer. Adjust the signature of proc_ipc_sem_dointvec to match ctl_table.proc_handler which fixes the following sparse error/warning: ipc/ipc_sysctl.c:94:47: warning: incorrect type in argument 3 (different address spaces) ipc/ipc_sysctl.c:94:47: expected void buffer ipc/ipc_sysctl.c:94:47: got void [noderef] __user buffer ipc/ipc_sysctl.c:194:35: warning: incorrect type in initializer (incompatible argument 3 (different address spaces)) ipc/ipc_sysctl.c:194:35: expected int ( [usertype] proc_handler )( ... ) ipc/ipc_sysctl.c:194:35: got int ( )( ... ) Fixes: `32927393dc` ("sysctl: pass kernel pointers to ->proc_handler") Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Link: https://lkml.kernel.org/r/20200825105846.5193-1-tklauser@distanz.ch Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-09-05 12:14:29 -07:00
Joerg Roedel	e80d3909be	mm: track page table modifications in __apply_to_page_range() __apply_to_page_range() is also used to change and/or allocate page-table pages in the vmalloc area of the address space. Make sure these changes get synchronized to other page-tables in the system by calling arch_sync_kernel_mappings() when necessary. The impact appears limited to x86-32, where apply_to_page_range may miss updating the PMD. That leads to explosions in drivers like BUG: unable to handle page fault for address: fe036000 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page *pde = 00000000 Oops: 0002 [#1] SMP CPU: 3 PID: 1300 Comm: gem_concurrent_ Not tainted 5.9.0-rc1+ #16 Hardware name: /NUC6i3SYB, BIOS SYSKLi35.86A.0024.2015.1027.2142 10/27/2015 EIP: __execlists_context_alloc+0x132/0x2d0 [i915] Code: 31 d2 89 f0 e8 2f 55 02 00 89 45 e8 3d 00 f0 ff ff 0f 87 11 01 00 00 8b 4d e8 03 4b 30 b8 5a 5a 5a 5a ba 01 00 00 00 8d 79 04 <c7> 01 5a 5a 5a 5a c7 81 fc 0f 00 00 5a 5a 5a 5a 83 e7 fc 29 f9 81 EAX: 5a5a5a5a EBX: f60ca000 ECX: fe036000 EDX: 00000001 ESI: f43b7340 EDI: fe036004 EBP: f6389cb8 ESP: f6389c9c DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010286 CR0: 80050033 CR2: fe036000 CR3: 2d361000 CR4: 001506d0 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 DR6: fffe0ff0 DR7: 00000400 Call Trace: execlists_context_alloc+0x10/0x20 [i915] intel_context_alloc_state+0x3f/0x70 [i915] __intel_context_do_pin+0x117/0x170 [i915] i915_gem_do_execbuffer+0xcc7/0x2500 [i915] i915_gem_execbuffer2_ioctl+0xcd/0x1f0 [i915] drm_ioctl_kernel+0x8f/0xd0 drm_ioctl+0x223/0x3d0 __ia32_sys_ioctl+0x1ab/0x760 __do_fast_syscall_32+0x3f/0x70 do_fast_syscall_32+0x29/0x60 do_SYSENTER_32+0x15/0x20 entry_SYSENTER_32+0x9f/0xf2 EIP: 0xb7f28559 Code: 03 74 c0 01 10 05 03 74 b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76 EAX: ffffffda EBX: 00000005 ECX: c0406469 EDX: bf95556c ESI: b7e68000 EDI: c0406469 EBP: 00000005 ESP: bf9554d8 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000296 Modules linked in: i915 x86_pkg_temp_thermal intel_powerclamp crc32_pclmul crc32c_intel intel_cstate intel_uncore intel_gtt drm_kms_helper intel_pch_thermal video button autofs4 i2c_i801 i2c_smbus fan CR2: 00000000fe036000 It looks like kasan, xen and i915 are vulnerable. Actual impact is "on thinkpad X60 in 5.9-rc1, screen starts blinking after 30-or-so minutes, and machine is unusable" [sfr@canb.auug.org.au: ARCH_PAGE_TABLE_SYNC_MASK needs vmalloc.h] Link: https://lkml.kernel.org/r/20200825172508.16800a4f@canb.auug.org.au [chris@chris-wilson.co.uk: changelog addition] [pavel@ucw.cz: changelog addition] Fixes: `2ba3e6947a` ("mm/vmalloc: track which page-table levels were modified") Fixes: `86cf69f1d8` ("x86/mm/32: implement arch_sync_kernel_mappings()") Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> [x86-32] Tested-by: Pavel Machek <pavel@ucw.cz> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: <stable@vger.kernel.org> [5.8+] Link: https://lkml.kernel.org/r/20200821123746.16904-1-joro@8bytes.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-09-05 12:14:29 -07:00
Randy Dunlap	9d90dd188d	MAINTAINERS: IA64: mark Status as Odd Fixes only IA64 isn't really being maintained, so mark it as Odd Fixes only. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/7e719139-450f-52c2-59a2-7964a34eda1f@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-09-05 12:14:29 -07:00
Nick Desaulniers	b964428965	MAINTAINERS: add LLVM maintainers Nominate Nathan and myself to be point of contact for clang/LLVM related support, after a poll at the LLVM BoF at Linux Plumbers Conf 2020. While corporate sponsorship is beneficial, its important to not entrust the keys to the nukes with any one entity. Should Nathan and I find ourselves at the same employer, I would gladly step down. Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Sedat Dilek <sedat.dilek@gmail.com> Acked-by: Nathan Chancellor <natechancellor@gmail.com> Acked-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Acked-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lkml.kernel.org/r/20200825143540.2948637-1-ndesaulniers@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-09-05 12:14:29 -07:00
Robert Richter	f548a64570	MAINTAINERS: update Cavium/Marvell entries I am leaving Marvell and already do not have access to my @marvell.com email address. So switching over to my korg mail address or removing my address there another maintainer is already listed. For the entries there no other maintainer is listed I will keep looking into patches for Cavium systems for a while until someone from Marvell takes it over. Since I might have limited access to hardware and also limited time I changed state to 'Odd Fixes' for those entries. Signed-off-by: Robert Richter <rric@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Ganapatrao Kulkarni <gkulkarni@marvell.com> Cc: Sunil Goutham <sgoutham@marvell.com> CC: Borislav Petkov <bp@alien8.de> Cc: Marc Zyngier <maz@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Wolfram Sang <wsa@kernel.org>, Link: https://lkml.kernel.org/r/20200824122050.31164-1-rric@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-09-05 12:14:29 -07:00
Eugeniu Rosca	dc07a728d4	mm: slub: fix conversion of freelist_corrupted() Commit `52f2347808` ("mm/slub.c: fix corrupted freechain in deactivate_slab()") suffered an update when picked up from LKML [1]. Specifically, relocating 'freelist = NULL' into 'freelist_corrupted()' created a no-op statement. Fix it by sticking to the behavior intended in the original patch [1]. In addition, make freelist_corrupted() immune to passing NULL instead of &freelist. The issue has been spotted via static analysis and code review. [1] https://lore.kernel.org/linux-mm/20200331031450.12182-1-dongli.zhang@oracle.com/ Fixes: `52f2347808` ("mm/slub.c: fix corrupted freechain in deactivate_slab()") Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Dongli Zhang <dongli.zhang@oracle.com> Cc: Joe Jin <joe.jin@oracle.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200824130643.10291-1-erosca@de.adit-jv.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-09-05 12:14:29 -07:00
Xunlei Pang	e3336cab25	mm: memcg: fix memcg reclaim soft lockup We've met softlockup with "CONFIG_PREEMPT_NONE=y", when the target memcg doesn't have any reclaimable memory. It can be easily reproduced as below: watchdog: BUG: soft lockup - CPU#0 stuck for 111s![memcg_test:2204] CPU: 0 PID: 2204 Comm: memcg_test Not tainted 5.9.0-rc2+ #12 Call Trace: shrink_lruvec+0x49f/0x640 shrink_node+0x2a6/0x6f0 do_try_to_free_pages+0xe9/0x3e0 try_to_free_mem_cgroup_pages+0xef/0x1f0 try_charge+0x2c1/0x750 mem_cgroup_charge+0xd7/0x240 __add_to_page_cache_locked+0x2fd/0x370 add_to_page_cache_lru+0x4a/0xc0 pagecache_get_page+0x10b/0x2f0 filemap_fault+0x661/0xad0 ext4_filemap_fault+0x2c/0x40 __do_fault+0x4d/0xf9 handle_mm_fault+0x1080/0x1790 It only happens on our 1-vcpu instances, because there's no chance for oom reaper to run to reclaim the to-be-killed process. Add a cond_resched() at the upper shrink_node_memcgs() to solve this issue, this will mean that we will get a scheduling point for each memcg in the reclaimed hierarchy without any dependency on the reclaimable memory in that memcg thus making it more predictable. Suggested-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Xunlei Pang <xlpang@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Chris Down <chris@chrisdown.name> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Link: http://lkml.kernel.org/r/1598495549-67324-1-git-send-email-xlpang@linux.alibaba.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-09-05 12:14:29 -07:00
Michal Hocko	f1796544a0	memcg: fix use-after-free in uncharge_batch syzbot has reported an use-after-free in the uncharge_batch path BUG: KASAN: use-after-free in instrument_atomic_write include/linux/instrumented.h:71 [inline] BUG: KASAN: use-after-free in atomic64_sub_return include/asm-generic/atomic-instrumented.h:970 [inline] BUG: KASAN: use-after-free in atomic_long_sub_return include/asm-generic/atomic-long.h:113 [inline] BUG: KASAN: use-after-free in page_counter_cancel mm/page_counter.c:54 [inline] BUG: KASAN: use-after-free in page_counter_uncharge+0x3d/0xc0 mm/page_counter.c:155 Write of size 8 at addr ffff8880371c0148 by task syz-executor.0/9304 CPU: 0 PID: 9304 Comm: syz-executor.0 Not tainted 5.8.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1f0/0x31e lib/dump_stack.c:118 print_address_description+0x66/0x620 mm/kasan/report.c:383 __kasan_report mm/kasan/report.c:513 [inline] kasan_report+0x132/0x1d0 mm/kasan/report.c:530 check_memory_region_inline mm/kasan/generic.c:183 [inline] check_memory_region+0x2b5/0x2f0 mm/kasan/generic.c:192 instrument_atomic_write include/linux/instrumented.h:71 [inline] atomic64_sub_return include/asm-generic/atomic-instrumented.h:970 [inline] atomic_long_sub_return include/asm-generic/atomic-long.h:113 [inline] page_counter_cancel mm/page_counter.c:54 [inline] page_counter_uncharge+0x3d/0xc0 mm/page_counter.c:155 uncharge_batch+0x6c/0x350 mm/memcontrol.c:6764 uncharge_page+0x115/0x430 mm/memcontrol.c:6796 uncharge_list mm/memcontrol.c:6835 [inline] mem_cgroup_uncharge_list+0x70/0xe0 mm/memcontrol.c:6877 release_pages+0x13a2/0x1550 mm/swap.c:911 tlb_batch_pages_flush mm/mmu_gather.c:49 [inline] tlb_flush_mmu_free mm/mmu_gather.c:242 [inline] tlb_flush_mmu+0x780/0x910 mm/mmu_gather.c:249 tlb_finish_mmu+0xcb/0x200 mm/mmu_gather.c:328 exit_mmap+0x296/0x550 mm/mmap.c:3185 __mmput+0x113/0x370 kernel/fork.c:1076 exit_mm+0x4cd/0x550 kernel/exit.c:483 do_exit+0x576/0x1f20 kernel/exit.c:793 do_group_exit+0x161/0x2d0 kernel/exit.c:903 get_signal+0x139b/0x1d30 kernel/signal.c:2743 arch_do_signal+0x33/0x610 arch/x86/kernel/signal.c:811 exit_to_user_mode_loop kernel/entry/common.c:135 [inline] exit_to_user_mode_prepare+0x8d/0x1b0 kernel/entry/common.c:166 syscall_exit_to_user_mode+0x5e/0x1a0 kernel/entry/common.c:241 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Commit `1a3e1f4096` ("mm: memcontrol: decouple reference counting from page accounting") reworked the memcg lifetime to be bound the the struct page rather than charges. It also removed the css_put_many from uncharge_batch and that is causing the above splat. uncharge_batch() is supposed to uncharge accumulated charges for all pages freed from the same memcg. The queuing is done by uncharge_page which however drops the memcg reference after it adds charges to the batch. If the current page happens to be the last one holding the reference for its memcg then the memcg is OK to go and the next page to be freed will trigger batched uncharge which needs to access the memcg which is gone already. Fix the issue by taking a reference for the memcg in the current batch. Fixes: `1a3e1f4096` ("mm: memcontrol: decouple reference counting from page accounting") Reported-by: syzbot+b305848212deec86eabe@syzkaller.appspotmail.com Reported-by: syzbot+b5ea6fb6f139c8b9482b@syzkaller.appspotmail.com Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Roman Gushchin <guro@fb.com> Cc: Hugh Dickins <hughd@google.com> Link: https://lkml.kernel.org/r/20200820090341.GC5033@dhcp22.suse.cz Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-09-05 12:14:29 -07:00
Linus Torvalds	9322c47b21	Fixes (2) for 5.9: - Fix a broken metadata verifier that would incorrectly validate attr fork extents of a realtime file against the realtime volume. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAl9RDTMACgkQ+H93GTRK tOseqA/8DTnwKRKuTXc6UboIXbgMwjGTJBv7xTD3KIDF3ls+Gmp2/RWHs7lfGRVY LVdvlgAwDLEe1al8aLaibhrsRCYX8MLvMp8dX6xoqMWCY0KF80VCpXXy+FXbwBmZ GQRqrThyjepseERJj15UC249p9ztBj4DAHQD0r2XD7JMKHBHo6cQ3eYY2NTzgqc3 q0kYcP5xPZZ4R6UFUNkJpkeV8PKpkzWTXyMod0+e8h4njZoRAmHimj7Id9DMQ6HB ciHjZjNCd04Nu6JIpNBwlQE4epCPh79zX8hTQZf5nGNAh13CB9Wc2nHDvwCFBLxw HUdU5BpxMNWGujJ+T5X+RVbzA4VXXaL3iypag9EjXadg+vOV6XmPQv6Fa3WqkfBT GjEXB9+TG9ZuyxAObjP6yn1PRvob0l0iccAbtfnX5bE/mwqx1aDxN1QTfJrjG2Fv 2EiUnqI+jxro66HdS5QU+W8ko7dG0tiQPMF3DCv7nfb3ZxEgfXiDrgbHyYbzdLJq pi5LqXBAfRHkDwRUzRU6G6pT7mplW31iTwh2AuiQogXnpPTWylb0XsUEVfYKpK9q Z0HzwRfHmwQSkIEDZ0rZxP3wH5ssniHyLdXh5FtfFcPSBTyDvflMqFpLmYmvDV/6 MPfwCQAnMTrGv2GRJ8hbFiWd987bYTTfOKwJBEsFKDH01ZKYcV8= =5U8o -----END PGP SIGNATURE----- Merge tag 'xfs-5.9-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull xfs fix from Darrick Wong: "Fix a broken metadata verifier that would incorrectly validate attr fork extents of a realtime file against the realtime volume" * tag 'xfs-5.9-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix xfs_bmap_validate_extent_raw when checking attr fork of rt files	2020-09-05 10:04:53 -07:00
Mikulas Patocka	b17164e258	xfs: don't update mtime on COW faults When running in a dax mode, if the user maps a page with MAP_PRIVATE and PROT_WRITE, the xfs filesystem would incorrectly update ctime and mtime when the user hits a COW fault. This breaks building of the Linux kernel. How to reproduce: 1. extract the Linux kernel tree on dax-mounted xfs filesystem 2. run make clean 3. run make -j12 4. run make -j12 at step 4, make would incorrectly rebuild the whole kernel (although it was already built in step 3). The reason for the breakage is that almost all object files depend on objtool. When we run objtool, it takes COW page fault on its .data section, and these faults will incorrectly update the timestamp of the objtool binary. The updated timestamp causes make to rebuild the whole tree. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-09-05 10:00:05 -07:00
Mikulas Patocka	1ef6ea0efe	ext2: don't update mtime on COW faults When running in a dax mode, if the user maps a page with MAP_PRIVATE and PROT_WRITE, the ext2 filesystem would incorrectly update ctime and mtime when the user hits a COW fault. This breaks building of the Linux kernel. How to reproduce: 1. extract the Linux kernel tree on dax-mounted ext2 filesystem 2. run make clean 3. run make -j12 4. run make -j12 at step 4, make would incorrectly rebuild the whole kernel (although it was already built in step 3). The reason for the breakage is that almost all object files depend on objtool. When we run objtool, it takes COW page fault on its .data section, and these faults will incorrectly update the timestamp of the objtool binary. The updated timestamp causes make to rebuild the whole tree. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-09-05 10:00:05 -07:00
Jens Axboe	c183edff33	io_uring: fix explicit async read/write mapping for large segments If we exceed UIO_FASTIOV, we don't handle the transition correctly between an allocated vec for requests that are queued with IOSQE_ASYNC. Store the iovec appropriately and re-set it in the iter iov in case it changed. Fixes: `ff6165b2d7` ("io_uring: retain iov_iter state over io_read/io_write calls") Reported-by: Nick Hill <nick@nickhill.org> Tested-by: Norman Maurer <norman.maurer@googlemail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-09-05 09:02:47 -06:00
Linus Torvalds	c70672d8d3	s390 fixes for 5.9-rc4 - Fix GENERIC_LOCKBREAK dependency on PREEMPTION in Kconfig broken because of a typo. - Update defconfigs. -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAl9SpAcACgkQjYWKoQLX FBg+yAgAki9NzPxJAW8cmo001dnG33Fq0AvA8YsUYhlKD7/quqtA514UuhsJydEq NV6jjNQLIkDTit2m7Joh1ScBdaYA87isnbLI/BbarfiNA8MeDCxCuz7AJaTPqGR+ pKHK3eI9vFJW0cACrW1owsER4RpgdXJC74Zr3MoFwzRwQMWu8YH179hAaD8mwp8N nObNIiY3Yot7CQwtody0NSUh5eE0gmVmz/7yv4+zDQaMocBZa2ubqUU5awAVe6y1 L9HHnmhaUxG1iH6aUNFSelF1C3zs1IPlhr9l1WzNppe5+5fU2pZkh409B62HTf2F O/2w3LMPyTUJ1lo/Lt60kmLuo52vLA== =bore -----END PGP SIGNATURE----- Merge tag 's390-5.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Vasily Gorbik: - Fix GENERIC_LOCKBREAK dependency on PREEMPTION in Kconfig broken because of a typo - Update defconfigs * tag 's390-5.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390: update defconfigs s390: fix GENERIC_LOCKBREAK dependency typo in Kconfig	2020-09-04 13:46:33 -07:00
Linus Torvalds	09274aed90	- Fix the loading of modules built with binutils-2.35. This version produces writable and executable .text.ftrace_trampoline section which is rejected by the kernel. - Remove the exporting of cpu_logical_map() as the Tegra driver has now been fixed and no longer uses this function. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl9SaVMACgkQa9axLQDI XvHX1w//eo8OuGiJolZvyPSCDfFaMSxNniW9g6O6A2b9f0L9wVK1RbAW4hQcb+uO rCN8CfCeCq4IyTEG94CA9qbBev8oCtZ3gtpqSwBcK5rS8zozn8Krw0979aQQH7mt kotzi3ac44BA733ElbYETnNfmZPEokkSl0d6lfbzp8b2/kMtFgmg/1e/RRqr1o2s IdKdjy9LyvBcmVv0V3sfnLLzze1LF7xjlpYt9so4Rxlj6TZXkpGOAXOSHfbd87CJ Nq+LnYjMfnbJI7qTBLiUMb3IsT9O3KVQGtDXeVweXsW31h2f1gkwg69hGoRvKJxE vpwE3oxfpkd3eA2CvC73feTG+fXejezLHnT5LHUcwU9gSIm3cybFyTV2GSjskU5g 2RCmR1xHMLV9ZeQzyzN64VsKcqM/3qlooNkVcSJTL7ayqUCvVLct+ECu9rOQy+sz ROQ0BkSZbNCMIY/ixFXwwEG3yX15M13pbq7MM2mY9MchSAGy1EOOXDp65zX+QlO0 ljDA3JLIK2bxFhtsDYrLymwXFFoRHSK2sQnm6DX8/rPfTKq7MHyMpBFnjfrDnAzJ V0Y/gOaSyVUEqq/DYiX9mHhsEc01jThRqGi3j70Lb4xJUVu+rrxzJeetMD8HPrYY Db0YTovd6po/UwCyv0Ybu674fTBEUxPzIEzr0oLdEMV7w9iyX0E= =jbgA -----END PGP SIGNATURE----- Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Fix the loading of modules built with binutils-2.35. This version produces writable and executable .text.ftrace_trampoline section which is rejected by the kernel. - Remove the exporting of cpu_logical_map() as the Tegra driver has now been fixed and no longer uses this function. * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64/module: set trampoline section flags regardless of CONFIG_DYNAMIC_FTRACE arm64: Remove exporting cpu_logical_map symbol	2020-09-04 13:40:59 -07:00
Linus Torvalds	16bf121b2d	A few MIPS fixes: - fallthrough fallout fix - BMIPS fixes - MSA fix to avoid leaking MSA register contents - Loongson perf and cpu feature fix - SNI interrupt fix -----BEGIN PGP SIGNATURE----- iQJOBAABCAA4FiEEbt46xwy6kEcDOXoUeZbBVTGwZHAFAl9SMkAaHHRzYm9nZW5k QGFscGhhLmZyYW5rZW4uZGUACgkQeZbBVTGwZHA/Lw/8Cp19eTYF0ps60u9i2Gen 5e2yY7TlO1gBerdyA4iX/Jd8wKYu0cCBreHAF9JzdFi1z9s7arguB6DWUOYwa/W/ t0q86CW+EddfwJzjdY+H7cnNQWhaLGeKGXUzevI7v2kd3LAAyPvm7vUeR1zYxVol IJ3XyLGo+NN3xLecc2sSDVAWQR+wfy+1pARbVimkh4wlJjZcrvWgl8+jYv15p8u5 DPywG+wgkRJNWV1hX54qh9bxNOtLajFhAsWIluwAVf/mmQCguY0Gd8bcBDYFqyFt HZPuH5Rhmnm8/alqetzcXCFN8Y9IwzjOlOemwENCs1/0O49mdCF08uOd7wp9Sek2 aXlksJBtNB7jVzYZAiwnQcm/L84gXaxkdMjf47jTzTFkMBn+3/lbcGfKDptDc7U0 sNQmOU1y9y69UP2G7cSGH5cOco7LMReTTDB+35X/wWMnthKh5iw5R2nK0vJzCpWS Vq2PD8vIsO+58rlHFwOv6zZmGOQlCb93Nuzk7zx7GQMAEN05Av9nVC91n47MH352 VVwTv2dsBgNBDyhukz7YFpZLF11K9hYz5+661q0SGmRyKe5a4OD8lVV1kqQ6e8hp dlmcgT/B93cEvs9qnx5E13UO5UyPYn9Trbl2qppMvL9zXyc/jdDnoMfH48LK22YP hUbnUmjJAD55PLFJ+dl/EXM= =HWpo -----END PGP SIGNATURE----- Merge tag 'mips_fixes_5.9_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Thomas Bogendoerfer: "A few MIPS fixes: - fallthrough fallout fix - BMIPS fixes - MSA fix to avoid leaking MSA register contents - Loongson perf and cpu feature fix - SNI interrupt fix" * tag 'mips_fixes_5.9_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: SNI: Fix SCSI interrupt MIPS: add missing MSACSR and upper MSA initialization MIPS: perf: Fix wrong check condition of Loongson event IDs mips/oprofile: Fix fallthrough placement MIPS: Loongson64: Remove unnecessary inclusion of boot_param.h MIPS: BMIPS: Also call bmips_cpu_setup() for secondary cores MIPS: mm: BMIPS5000 has inclusive physical caches MIPS: Loongson64: Do not override watch and ejtag feature	2020-09-04 13:37:19 -07:00

... 3 4 5 6 7 ...

951007 Commits All Branches Search

951007 Commits

All Branches