License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
All files without license information fall under the default license of
the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boilerplate text.
This patch is based on work done by Thomas Gleixner, Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and where references to a
license had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX license identifier should be applied
to a file was done in a spreadsheet of side-by-side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files, created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few thousand files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file-by-file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
should be applied to each file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
  lines of source.
- The file already had some variant of a license header in it (even if <5
  lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, the file was
  considered to have no license information in it, and the top-level
  COPYING file license was applied.
For non */uapi/* files that summary was:

  SPDX license identifier                             # files
  ---------------------------------------------------|-------
  GPL-2.0                                               11139
and resulted in the first patch in this series.
If the file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note"; otherwise it was "GPL-2.0". Results of that were:

  SPDX license identifier                             # files
  ---------------------------------------------------|-------
  GPL-2.0 WITH Linux-syscall-note                         930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
  of the */uapi/* ones, it was denoted with the Linux-syscall-note if
  any GPL family license was found in the file or if it had no licensing
  in it (per the prior point). Results summary:

    SPDX license identifier                              # files
    ---------------------------------------------------|------
    GPL-2.0 WITH Linux-syscall-note                          270
    GPL-2.0+ WITH Linux-syscall-note                         169
    ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)       21
    ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)       17
    LGPL-2.1+ WITH Linux-syscall-note                         15
    GPL-1.0+ WITH Linux-syscall-note                          14
    ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)       5
    LGPL-2.0+ WITH Linux-syscall-note                          4
    LGPL-2.1 WITH Linux-syscall-note                           3
    ((GPL-2.0 WITH Linux-syscall-note) OR MIT)                 3
    ((GPL-2.0 WITH Linux-syscall-note) AND MIT)                1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses), a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
  the file was flagged for further research, to be revisited later.
In total, over 70 hours of logged manual review of the spreadsheet was
done by Kate, Philippe and Thomas to determine the SPDX license
identifiers to apply to the source files, in some cases with
confirmation by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were any new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with the SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15,000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to
have copy/paste license identifier errors; they have been fixed to
reflect the correct identifier.
Additionally, Philippe spent 10 hours doing a detailed manual inspection
and review of the 12,461 patched files from the initial patch version,
with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types). Finally, Greg ran the script using the .csv files to
generate the patches.
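For illustration, this is the shape of the tags the generated patches
add; it is a sketch of the convention rather than an excerpt from the
script, and the uapi line assumes a file covered by the second patch:

  /* SPDX-License-Identifier: GPL-2.0 */            <- first line of a header
  // SPDX-License-Identifier: GPL-2.0               <- first line of a .c file
  /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */  <- uapi header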
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef __CPUHOTPLUG_H
#define __CPUHOTPLUG_H

#include <linux/types.h>

/*
 * CPU-up                       CPU-down
 *
 * BP           AP              BP              AP
 *
 * OFFLINE                      OFFLINE
 *   |                            ^
 *   v                            |
 * BRINGUP_CPU->AP_OFFLINE      BRINGUP_CPU <- AP_IDLE_DEAD (idle thread/play_dead)
 *              |                               AP_OFFLINE
 *              v (IRQ-off)       ,---------------^
 *            AP_ONLINE           | (stop_machine)
 *              |               TEARDOWN_CPU <- AP_ONLINE_IDLE
 *              |                                 ^
 *              v                                 |
 *            AP_ACTIVE                        AP_ACTIVE
 */

/*
 * CPU hotplug states. The state machine invokes the installed state
 * startup callbacks sequentially from CPUHP_OFFLINE + 1 to CPUHP_ONLINE
 * during a CPU online operation. During a CPU offline operation the
 * installed teardown callbacks are invoked in the reverse order from
 * CPUHP_ONLINE - 1 down to CPUHP_OFFLINE.
 *
 * The state space has three sections: PREPARE, STARTING and ONLINE.
 *
 * PREPARE: The callbacks are invoked on a control CPU before the
 * hotplugged CPU is started up or after the hotplugged CPU has died.
 *
 * STARTING: The callbacks are invoked on the hotplugged CPU from the low level
 * hotplug startup/teardown code with interrupts disabled.
 *
 * ONLINE: The callbacks are invoked on the hotplugged CPU from the per CPU
 * hotplug thread with interrupts and preemption enabled.
 *
 * Adding explicit states to this enum is only necessary when:
 *
 * 1) The state is within the STARTING section
 *
 * 2) The state has ordering constraints vs. other states in the
 *    same section.
 *
 * If neither #1 nor #2 apply, please use the dynamic state space when
 * setting up a state by using CPUHP_BP_PREPARE_DYN or CPUHP_AP_ONLINE_DYN
 * for the @state argument of the setup function.
 *
 * See Documentation/core-api/cpu_hotplug.rst for further information and
 * examples.
 */
enum cpuhp_state {
        CPUHP_INVALID = -1,

        /* PREPARE section invoked on a control CPU */
        CPUHP_OFFLINE = 0,
        CPUHP_CREATE_THREADS,
        CPUHP_PERF_PREPARE,
        CPUHP_PERF_X86_PREPARE,
        CPUHP_PERF_X86_AMD_UNCORE_PREP,
        CPUHP_PERF_POWER,
        CPUHP_PERF_SUPERH,
        CPUHP_X86_HPET_DEAD,
        CPUHP_X86_APB_DEAD,
        CPUHP_X86_MCE_DEAD,
        CPUHP_VIRT_NET_DEAD,
        CPUHP_IBMVNIC_DEAD,
        CPUHP_SLUB_DEAD,
        CPUHP_DEBUG_OBJ_DEAD,
        CPUHP_MM_WRITEBACK_DEAD,
        /* Must be after CPUHP_MM_VMSTAT_DEAD */
        CPUHP_MM_DEMOTION_DEAD,
        CPUHP_MM_VMSTAT_DEAD,
        CPUHP_SOFTIRQ_DEAD,
        CPUHP_NET_MVNETA_DEAD,
        CPUHP_CPUIDLE_DEAD,
        CPUHP_ARM64_FPSIMD_DEAD,
        CPUHP_ARM_OMAP_WAKE_DEAD,
        CPUHP_IRQ_POLL_DEAD,
        CPUHP_BLOCK_SOFTIRQ_DEAD,
        CPUHP_BIO_DEAD,
        CPUHP_ACPI_CPUDRV_DEAD,
        CPUHP_S390_PFAULT_DEAD,
        CPUHP_BLK_MQ_DEAD,
        CPUHP_FS_BUFF_DEAD,
        CPUHP_PRINTK_DEAD,
        CPUHP_MM_MEMCQ_DEAD,
        CPUHP_XFS_DEAD,
        CPUHP_PERCPU_CNT_DEAD,
        CPUHP_RADIX_DEAD,
        CPUHP_PAGE_ALLOC,
        CPUHP_NET_DEV_DEAD,
        CPUHP_PCI_XGENE_DEAD,
        CPUHP_IOMMU_IOVA_DEAD,
        CPUHP_LUSTRE_CFS_DEAD,
        CPUHP_AP_ARM_CACHE_B15_RAC_DEAD,
        CPUHP_PADATA_DEAD,
        CPUHP_AP_DTPM_CPU_DEAD,
        CPUHP_RANDOM_PREPARE,
        CPUHP_WORKQUEUE_PREP,
        CPUHP_POWER_NUMA_PREPARE,
        CPUHP_HRTIMERS_PREPARE,
        CPUHP_PROFILE_PREPARE,
        CPUHP_X2APIC_PREPARE,
        CPUHP_SMPCFD_PREPARE,
        CPUHP_RELAY_PREPARE,
        CPUHP_SLAB_PREPARE,
        CPUHP_MD_RAID5_PREPARE,
        CPUHP_RCUTREE_PREP,
        CPUHP_CPUIDLE_COUPLED_PREPARE,
        CPUHP_POWERPC_PMAC_PREPARE,
        CPUHP_POWERPC_MMU_CTX_PREPARE,
        CPUHP_XEN_PREPARE,
        CPUHP_XEN_EVTCHN_PREPARE,
        CPUHP_ARM_SHMOBILE_SCU_PREPARE,
        CPUHP_SH_SH3X_PREPARE,
        CPUHP_NET_FLOW_PREPARE,
        CPUHP_TOPOLOGY_PREPARE,
        CPUHP_NET_IUCV_PREPARE,
        CPUHP_ARM_BL_PREPARE,
        CPUHP_TRACE_RB_PREPARE,
        CPUHP_MM_ZS_PREPARE,
        CPUHP_MM_ZSWP_MEM_PREPARE,
        CPUHP_MM_ZSWP_POOL_PREPARE,
        CPUHP_KVM_PPC_BOOK3S_PREPARE,
        CPUHP_ZCOMP_PREPARE,
        CPUHP_TIMERS_PREPARE,
        CPUHP_MIPS_SOC_PREPARE,
        CPUHP_BP_PREPARE_DYN,
        CPUHP_BP_PREPARE_DYN_END        = CPUHP_BP_PREPARE_DYN + 20,
        CPUHP_BRINGUP_CPU,

        /*
         * STARTING section invoked on the hotplugged CPU in low level
         * bringup and teardown code.
         */
        CPUHP_AP_IDLE_DEAD,
        CPUHP_AP_OFFLINE,
        CPUHP_AP_SCHED_STARTING,
        CPUHP_AP_RCUTREE_DYING,
        CPUHP_AP_CPU_PM_STARTING,
        CPUHP_AP_IRQ_GIC_STARTING,
        CPUHP_AP_IRQ_HIP04_STARTING,
        CPUHP_AP_IRQ_APPLE_AIC_STARTING,
        CPUHP_AP_IRQ_ARMADA_XP_STARTING,
        CPUHP_AP_IRQ_BCM2836_STARTING,
        CPUHP_AP_IRQ_MIPS_GIC_STARTING,
        CPUHP_AP_IRQ_RISCV_STARTING,
        CPUHP_AP_IRQ_LOONGARCH_STARTING,
        CPUHP_AP_IRQ_SIFIVE_PLIC_STARTING,
        CPUHP_AP_ARM_MVEBU_COHERENCY,
        CPUHP_AP_MICROCODE_LOADER,
        CPUHP_AP_PERF_X86_AMD_UNCORE_STARTING,
        CPUHP_AP_PERF_X86_STARTING,
        CPUHP_AP_PERF_X86_AMD_IBS_STARTING,
        CPUHP_AP_PERF_X86_CQM_STARTING,
        CPUHP_AP_PERF_X86_CSTATE_STARTING,
        CPUHP_AP_PERF_XTENSA_STARTING,
        CPUHP_AP_MIPS_OP_LOONGSON3_STARTING,
        CPUHP_AP_ARM_SDEI_STARTING,
        CPUHP_AP_ARM_VFP_STARTING,
        CPUHP_AP_ARM64_DEBUG_MONITORS_STARTING,
        CPUHP_AP_PERF_ARM_HW_BREAKPOINT_STARTING,
        CPUHP_AP_PERF_ARM_ACPI_STARTING,
        CPUHP_AP_PERF_ARM_STARTING,
        CPUHP_AP_PERF_RISCV_STARTING,
        CPUHP_AP_ARM_L2X0_STARTING,
        CPUHP_AP_EXYNOS4_MCT_TIMER_STARTING,
        CPUHP_AP_ARM_ARCH_TIMER_STARTING,
        CPUHP_AP_ARM_GLOBAL_TIMER_STARTING,
        CPUHP_AP_JCORE_TIMER_STARTING,
        CPUHP_AP_ARM_TWD_STARTING,
        CPUHP_AP_QCOM_TIMER_STARTING,
        CPUHP_AP_TEGRA_TIMER_STARTING,
        CPUHP_AP_ARMADA_TIMER_STARTING,
        CPUHP_AP_MARCO_TIMER_STARTING,
        CPUHP_AP_MIPS_GIC_TIMER_STARTING,
        CPUHP_AP_ARC_TIMER_STARTING,
        CPUHP_AP_RISCV_TIMER_STARTING,
        CPUHP_AP_CLINT_TIMER_STARTING,
        CPUHP_AP_CSKY_TIMER_STARTING,
        CPUHP_AP_TI_GP_TIMER_STARTING,
        CPUHP_AP_HYPERV_TIMER_STARTING,
        CPUHP_AP_KVM_STARTING,
        CPUHP_AP_KVM_ARM_VGIC_INIT_STARTING,
        CPUHP_AP_KVM_ARM_VGIC_STARTING,
        CPUHP_AP_KVM_ARM_TIMER_STARTING,
        /* Must be the last timer callback */
        CPUHP_AP_DUMMY_TIMER_STARTING,
        CPUHP_AP_ARM_XEN_STARTING,
        CPUHP_AP_ARM_CORESIGHT_STARTING,
        CPUHP_AP_ARM_CORESIGHT_CTI_STARTING,
        CPUHP_AP_ARM64_ISNDEP_STARTING,
        CPUHP_AP_SMPCFD_DYING,
        CPUHP_AP_X86_TBOOT_DYING,
        CPUHP_AP_ARM_CACHE_B15_RAC_DYING,
        CPUHP_AP_ONLINE,
        CPUHP_TEARDOWN_CPU,

        /* Online section invoked on the hotplugged CPU from the hotplug thread */
        CPUHP_AP_ONLINE_IDLE,
        CPUHP_AP_SCHED_WAIT_EMPTY,
        CPUHP_AP_SMPBOOT_THREADS,
        CPUHP_AP_X86_VDSO_VMA_ONLINE,
        CPUHP_AP_IRQ_AFFINITY_ONLINE,
        CPUHP_AP_BLK_MQ_ONLINE,
        CPUHP_AP_ARM_MVEBU_SYNC_CLOCKS,
        CPUHP_AP_X86_INTEL_EPB_ONLINE,
        CPUHP_AP_PERF_ONLINE,
        CPUHP_AP_PERF_X86_ONLINE,
        CPUHP_AP_PERF_X86_UNCORE_ONLINE,
        CPUHP_AP_PERF_X86_AMD_UNCORE_ONLINE,
        CPUHP_AP_PERF_X86_AMD_POWER_ONLINE,
        CPUHP_AP_PERF_X86_RAPL_ONLINE,
        CPUHP_AP_PERF_X86_CQM_ONLINE,
        CPUHP_AP_PERF_X86_CSTATE_ONLINE,
        CPUHP_AP_PERF_X86_IDXD_ONLINE,
        CPUHP_AP_PERF_S390_CF_ONLINE,
        CPUHP_AP_PERF_S390_SF_ONLINE,
        CPUHP_AP_PERF_ARM_CCI_ONLINE,
        CPUHP_AP_PERF_ARM_CCN_ONLINE,
        CPUHP_AP_PERF_ARM_HISI_CPA_ONLINE,
        CPUHP_AP_PERF_ARM_HISI_DDRC_ONLINE,
        CPUHP_AP_PERF_ARM_HISI_HHA_ONLINE,
        CPUHP_AP_PERF_ARM_HISI_L3_ONLINE,
        CPUHP_AP_PERF_ARM_HISI_PA_ONLINE,
        CPUHP_AP_PERF_ARM_HISI_SLLC_ONLINE,
        CPUHP_AP_PERF_ARM_HISI_PCIE_PMU_ONLINE,
        CPUHP_AP_PERF_ARM_HNS3_PMU_ONLINE,
        CPUHP_AP_PERF_ARM_L2X0_ONLINE,
        CPUHP_AP_PERF_ARM_QCOM_L2_ONLINE,
        CPUHP_AP_PERF_ARM_QCOM_L3_ONLINE,
        CPUHP_AP_PERF_ARM_APM_XGENE_ONLINE,
        CPUHP_AP_PERF_ARM_CAVIUM_TX2_UNCORE_ONLINE,
        CPUHP_AP_PERF_ARM_MARVELL_CN10K_DDR_ONLINE,
        CPUHP_AP_PERF_POWERPC_NEST_IMC_ONLINE,
        CPUHP_AP_PERF_POWERPC_CORE_IMC_ONLINE,
        CPUHP_AP_PERF_POWERPC_THREAD_IMC_ONLINE,
        CPUHP_AP_PERF_POWERPC_TRACE_IMC_ONLINE,
        CPUHP_AP_PERF_POWERPC_HV_24x7_ONLINE,
        CPUHP_AP_PERF_POWERPC_HV_GPCI_ONLINE,
        CPUHP_AP_PERF_CSKY_ONLINE,
        CPUHP_AP_WATCHDOG_ONLINE,
        CPUHP_AP_WORKQUEUE_ONLINE,
        CPUHP_AP_RANDOM_ONLINE,
        CPUHP_AP_RCUTREE_ONLINE,
        CPUHP_AP_BASE_CACHEINFO_ONLINE,
        CPUHP_AP_ONLINE_DYN,
        CPUHP_AP_ONLINE_DYN_END         = CPUHP_AP_ONLINE_DYN + 30,
        /* Must be after CPUHP_AP_ONLINE_DYN for node_states[N_CPU] update */
        CPUHP_AP_MM_DEMOTION_ONLINE,
        CPUHP_AP_X86_HPET_ONLINE,
        CPUHP_AP_X86_KVM_CLK_ONLINE,
        CPUHP_AP_ACTIVE,
        CPUHP_ONLINE,
};

int __cpuhp_setup_state(enum cpuhp_state state, const char *name, bool invoke,
                        int (*startup)(unsigned int cpu),
                        int (*teardown)(unsigned int cpu), bool multi_instance);

int __cpuhp_setup_state_cpuslocked(enum cpuhp_state state, const char *name,
                                   bool invoke,
                                   int (*startup)(unsigned int cpu),
                                   int (*teardown)(unsigned int cpu),
                                   bool multi_instance);

/**
 * cpuhp_setup_state - Setup hotplug state callbacks with calling the @startup
 *                     callback
 * @state:      The state for which the calls are installed
 * @name:       Name of the callback (will be used in debug output)
 * @startup:    startup callback function or NULL if not required
 * @teardown:   teardown callback function or NULL if not required
 *
 * Installs the callback functions and invokes the @startup callback on
 * the online cpus which have already reached the @state.
 */
static inline int cpuhp_setup_state(enum cpuhp_state state,
                                    const char *name,
                                    int (*startup)(unsigned int cpu),
                                    int (*teardown)(unsigned int cpu))
{
        return __cpuhp_setup_state(state, name, true, startup, teardown, false);
}
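
/*
 * Editor's sketch (not part of the original header): typical usage with
 * the dynamic ONLINE state space. The my_drv_* names are hypothetical.
 * For CPUHP_AP_ONLINE_DYN, a successful call returns the dynamically
 * allocated state number (> 0), which can later be passed to
 * cpuhp_remove_state().
 */
#if 0
static int my_drv_online(unsigned int cpu)
{
        /* Runs for each CPU that is or becomes online; set up per-CPU data. */
        return 0;
}

static int my_drv_offline(unsigned int cpu)
{
        /* Runs before a CPU goes away; undo what my_drv_online() did. */
        return 0;
}

static int my_drv_hotplug_init(void)
{
        int ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "mydrv:online",
                                    my_drv_online, my_drv_offline);
        return ret < 0 ? ret : 0;
}
#endif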

/**
 * cpuhp_setup_state_cpuslocked - Setup hotplug state callbacks with calling
 *                                the @startup callback from a cpus_read_lock()
 *                                held region
 * @state:      The state for which the calls are installed
 * @name:       Name of the callback (will be used in debug output)
 * @startup:    startup callback function or NULL if not required
 * @teardown:   teardown callback function or NULL if not required
 *
 * Same as cpuhp_setup_state() except that it must be invoked from within a
 * cpus_read_lock() held region.
 */
static inline int cpuhp_setup_state_cpuslocked(enum cpuhp_state state,
                                               const char *name,
                                               int (*startup)(unsigned int cpu),
                                               int (*teardown)(unsigned int cpu))
{
        return __cpuhp_setup_state_cpuslocked(state, name, true, startup,
                                              teardown, false);
}

/**
 * cpuhp_setup_state_nocalls - Setup hotplug state callbacks without calling the
 *                             @startup callback
 * @state:      The state for which the calls are installed
 * @name:       Name of the callback.
 * @startup:    startup callback function or NULL if not required
 * @teardown:   teardown callback function or NULL if not required
 *
 * Same as cpuhp_setup_state() except that the @startup callback is not
 * invoked during installation. NOP if SMP=n or HOTPLUG_CPU=n.
 */
static inline int cpuhp_setup_state_nocalls(enum cpuhp_state state,
                                            const char *name,
                                            int (*startup)(unsigned int cpu),
                                            int (*teardown)(unsigned int cpu))
{
        return __cpuhp_setup_state(state, name, false, startup, teardown,
                                   false);
}

/**
 * cpuhp_setup_state_nocalls_cpuslocked - Setup hotplug state callbacks without
 *                                        invoking the @startup callback from
 *                                        a cpus_read_lock() held region
 * @state:      The state for which the calls are installed
 * @name:       Name of the callback.
 * @startup:    startup callback function or NULL if not required
 * @teardown:   teardown callback function or NULL if not required
 *
 * Same as cpuhp_setup_state_nocalls() except that it must be invoked from
 * within a cpus_read_lock() held region.
 */
static inline int cpuhp_setup_state_nocalls_cpuslocked(enum cpuhp_state state,
                                                       const char *name,
                                                       int (*startup)(unsigned int cpu),
                                                       int (*teardown)(unsigned int cpu))
{
        return __cpuhp_setup_state_cpuslocked(state, name, false, startup,
                                              teardown, false);
}
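
/*
 * Editor's sketch (not part of the original header): the _cpuslocked
 * variants exist for callers that already hold the CPU hotplug read
 * lock, e.g. to keep the online CPU set stable across registration.
 * The my_drv_* name is hypothetical; cpus_read_lock()/cpus_read_unlock()
 * are declared in linux/cpu.h.
 */
#if 0
static int my_drv_register_locked(void)
{
        int ret;

        cpus_read_lock();
        /* ... other work that needs a stable online CPU mask ... */
        ret = cpuhp_setup_state_nocalls_cpuslocked(CPUHP_AP_ONLINE_DYN,
                                                   "mydrv:online", NULL, NULL);
        cpus_read_unlock();
        return ret < 0 ? ret : 0;
}
#endif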

/**
 * cpuhp_setup_state_multi - Add callbacks for multi state
 * @state:      The state for which the calls are installed
 * @name:       Name of the callback.
 * @startup:    startup callback function or NULL if not required
 * @teardown:   teardown callback function or NULL if not required
 *
 * Sets the internal multi_instance flag and prepares a state to work as a
 * multi instance callback. No callbacks are invoked at this point. The
 * callbacks are invoked once an instance for this state is registered via
 * cpuhp_state_add_instance() or cpuhp_state_add_instance_nocalls().
 */
static inline int cpuhp_setup_state_multi(enum cpuhp_state state,
                                          const char *name,
                                          int (*startup)(unsigned int cpu,
                                                         struct hlist_node *node),
                                          int (*teardown)(unsigned int cpu,
                                                          struct hlist_node *node))
{
        return __cpuhp_setup_state(state, name, false,
                                   (void *) startup,
                                   (void *) teardown, true);
}

int __cpuhp_state_add_instance(enum cpuhp_state state, struct hlist_node *node,
                               bool invoke);
int __cpuhp_state_add_instance_cpuslocked(enum cpuhp_state state,
                                          struct hlist_node *node, bool invoke);

/**
 * cpuhp_state_add_instance - Add an instance for a state and invoke startup
 *                            callback.
 * @state:      The state for which the instance is installed
 * @node:       The node for this individual state.
 *
 * Installs the instance for the @state and invokes the registered startup
 * callback on the online cpus which have already reached the @state. The
 * @state must have been earlier marked as multi-instance by
 * cpuhp_setup_state_multi().
 */
static inline int cpuhp_state_add_instance(enum cpuhp_state state,
                                           struct hlist_node *node)
{
        return __cpuhp_state_add_instance(state, node, true);
}
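
/*
 * Editor's sketch (not part of the original header): the multi-instance
 * flow ties one state to many device instances. The my_pmu_* names are
 * hypothetical; each registered instance carries its own hlist_node and
 * the callback recovers the containing object with container_of().
 */
#if 0
struct my_pmu {
        struct hlist_node node;
        /* per-device state ... */
};

static int my_pmu_online(unsigned int cpu, struct hlist_node *node)
{
        struct my_pmu *pmu = container_of(node, struct my_pmu, node);

        /* React to @cpu coming online for this particular instance. */
        return 0;
}

static enum cpuhp_state my_pmu_state;

static int my_pmu_driver_init(void)
{
        /* One state for the driver ... */
        int ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, "mypmu:online",
                                          my_pmu_online, NULL);
        if (ret < 0)
                return ret;
        my_pmu_state = ret;
        return 0;
}

static int my_pmu_probe_one(struct my_pmu *pmu)
{
        /* ... and one instance per probed device. */
        return cpuhp_state_add_instance(my_pmu_state, &pmu->node);
}
#endif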

/**
 * cpuhp_state_add_instance_nocalls - Add an instance for a state without
 *                                    invoking the startup callback.
 * @state:      The state for which the instance is installed
 * @node:       The node for this individual state.
 *
 * Installs the instance for the @state. The @state must have been earlier
 * marked as multi-instance by cpuhp_setup_state_multi(). NOP if SMP=n or
 * HOTPLUG_CPU=n.
 */
static inline int cpuhp_state_add_instance_nocalls(enum cpuhp_state state,
                                                   struct hlist_node *node)
{
        return __cpuhp_state_add_instance(state, node, false);
}

/**
 * cpuhp_state_add_instance_nocalls_cpuslocked - Add an instance for a state
 *                                               without invoking the startup
 *                                               callback from a cpus_read_lock()
 *                                               held region.
 * @state:      The state for which the instance is installed
 * @node:       The node for this individual state.
 *
 * Same as cpuhp_state_add_instance_nocalls() except that it must be
 * invoked from within a cpus_read_lock() held region.
 */
static inline int
cpuhp_state_add_instance_nocalls_cpuslocked(enum cpuhp_state state,
                                            struct hlist_node *node)
{
        return __cpuhp_state_add_instance_cpuslocked(state, node, false);
}

void __cpuhp_remove_state(enum cpuhp_state state, bool invoke);
void __cpuhp_remove_state_cpuslocked(enum cpuhp_state state, bool invoke);

/**
 * cpuhp_remove_state - Remove hotplug state callbacks and invoke the teardown
 * @state:      The state for which the calls are removed
 *
 * Removes the callback functions and invokes the teardown callback on
 * the online cpus which have already reached the @state.
 */
static inline void cpuhp_remove_state(enum cpuhp_state state)
{
        __cpuhp_remove_state(state, true);
}

/**
 * cpuhp_remove_state_nocalls - Remove hotplug state callbacks without invoking
 *                              the teardown callback
 * @state:      The state for which the calls are removed
 */
static inline void cpuhp_remove_state_nocalls(enum cpuhp_state state)
{
        __cpuhp_remove_state(state, false);
}

/**
 * cpuhp_remove_state_nocalls_cpuslocked - Remove hotplug state callbacks
 *                                         without invoking teardown from a
 *                                         cpus_read_lock() held region.
 * @state:      The state for which the calls are removed
 *
 * Same as cpuhp_remove_state_nocalls() except that it must be invoked
 * from within a cpus_read_lock() held region.
 */
static inline void cpuhp_remove_state_nocalls_cpuslocked(enum cpuhp_state state)
{
        __cpuhp_remove_state_cpuslocked(state, false);
}

/**
 * cpuhp_remove_multi_state - Remove hotplug multi state callback
 * @state:      The state for which the calls are removed
 *
 * Removes the callback functions from a multi state. This is the reverse of
 * cpuhp_setup_state_multi(). All instances should have been removed before
 * invoking this function.
 */
static inline void cpuhp_remove_multi_state(enum cpuhp_state state)
{
        __cpuhp_remove_state(state, false);
}

int __cpuhp_state_remove_instance(enum cpuhp_state state,
                                  struct hlist_node *node, bool invoke);

/**
 * cpuhp_state_remove_instance - Remove hotplug instance from state and invoke
 *                               the teardown callback
 * @state:      The state from which the instance is removed
 * @node:       The node for this individual state.
 *
 * Removes the instance and invokes the teardown callback on the online cpus
 * which have already reached @state.
 */
static inline int cpuhp_state_remove_instance(enum cpuhp_state state,
                                              struct hlist_node *node)
{
        return __cpuhp_state_remove_instance(state, node, true);
}

/**
 * cpuhp_state_remove_instance_nocalls - Remove hotplug instance from state
 *                                       without invoking the teardown callback
 * @state:      The state from which the instance is removed
 * @node:       The node for this individual state.
 *
 * Removes the instance without invoking the teardown callback.
 */
static inline int cpuhp_state_remove_instance_nocalls(enum cpuhp_state state,
                                                      struct hlist_node *node)
{
        return __cpuhp_state_remove_instance(state, node, false);
}
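
/*
 * Editor's sketch (not part of the original header): teardown mirrors
 * setup for a multi-instance state. Remove every instance first, then
 * the state itself. my_pmu_state and struct my_pmu continue the
 * hypothetical example above.
 */
#if 0
static void my_pmu_remove_one(struct my_pmu *pmu)
{
        cpuhp_state_remove_instance(my_pmu_state, &pmu->node);
}

static void my_pmu_driver_exit(void)
{
        /* All instances must be gone before the state is removed. */
        cpuhp_remove_multi_state(my_pmu_state);
}
#endif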

#ifdef CONFIG_SMP
void cpuhp_online_idle(enum cpuhp_state state);
#else
static inline void cpuhp_online_idle(enum cpuhp_state state) { }
#endif

#endif