OpenCloudOS-Kernel

Waiman Long f99fd22e4d x86/hpet: Reduce HPET counter read contention

On a large system with many CPUs, using HPET as the clock source can have a significant impact on the overall system performance because of the following reasons:

 1) There is a single HPET counter shared by all the CPUs.
 2) HPET counter reading is a very slow operation.

Using HPET as the default clock source may happen when, for example, the TSC clock calibration exceeds the allowable tolerance. Sometimes the performance slowdown can be so severe that the system may crash because of an NMI watchdog soft lockup, for example.

During the TSC clock calibration process, the default clock source will be set temporarily to HPET. For systems with many CPUs, it is possible that an NMI watchdog soft lockup may occur occasionally during that short time period where HPET clocking is active, as is shown in the kernel log below:

  [ 71.646504] hpet0: 8 comparators, 64-bit 14.318180 MHz counter
  [ 71.655313] Switching to clocksource hpet
  [ 95.679135] BUG: soft lockup - CPU#144 stuck for 23s! [swapper/144:0]
  [ 95.693363] BUG: soft lockup - CPU#145 stuck for 23s! [swapper/145:0]
  [ 95.695580] BUG: soft lockup - CPU#582 stuck for 23s! [swapper/582:0]
  [ 95.698128] BUG: soft lockup - CPU#357 stuck for 23s! [swapper/357:0]

This patch addresses the above issues by reducing HPET read contention using the fact that if more than one CPU is trying to access HPET at the same time, it will be more efficient when only one CPU in the group reads the HPET counter and shares it with the rest of the group instead of each group member trying to read the HPET counter individually.

This is done by using a combination quadword that contains a 32-bit stored HPET value and a 32-bit spinlock. The CPU that gets the lock will be responsible for reading the HPET counter and storing it in the quadword. The others will monitor the change in HPET value and lock status and grab the latest stored HPET value accordingly. This change is only enabled on 64-bit SMP configuration.

On a 4-socket Haswell-EX box with 144 threads (HT on), running the AIM7 compute workload (1500 users) on a 4.8-rc1 kernel (HZ=1000) with and without the patch has the following performance numbers (with HPET or TSC as clock source):

  TSC             = 1042431 jobs/min
  HPET w/o patch  =  798068 jobs/min
  HPET with patch = 1029445 jobs/min

The perf profile showed a reduction of the %CPU time consumed by read_hpet from 11.19% without patch to 1.24% with patch.

[ tglx: It's really sad that we need to have such hacks just to deal with the fact that cpu vendors have not managed to fix the TSC wreckage within 15+ years. Were They Forgetting? ]

Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
Tested-by: Prarit Bhargava <prarit@redhat.com>
Cc: Scott J Norton <scott.norton@hpe.com>
Cc: Douglas Hatch <doug.hatch@hpe.com>
Cc: Randy Wright <rwright@hpe.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1473182530-29175-1-git-send-email-Waiman.Long@hpe.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

2016-09-09 15:16:19 +02:00
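
The "combination quadword" idea in the commit message above can be illustrated with a small user-space sketch. This is not the code in hpet.c: it uses C11 atomics instead of the kernel's arch spinlock, and every name in it (`slow_counter_read`, `fake_hw_counter`, `lockval`, `read_counter_shared`) is hypothetical. It also assumes a simplification the commit does not describe, namely that the lock holder publishes the new value and releases the lock in a single 64-bit store.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* The scheme relies on the lock and the value fitting in one 64-bit word. */
static_assert(sizeof(uint64_t) == 2 * sizeof(uint32_t),
              "quadword must hold a 32-bit lock and a 32-bit value");

/* Hypothetical stand-in for the slow, shared hardware counter. */
static _Atomic uint32_t fake_hw_counter;

static uint32_t slow_counter_read(void)
{
    /* Each "hardware" read returns the next tick: 1, 2, 3, ... */
    return atomic_fetch_add(&fake_hw_counter, 1) + 1;
}

/* Low 32 bits: lock word (0 = free, 1 = held).
 * High 32 bits: last counter value stored by a lock holder. */
static _Atomic uint64_t lockval;

#define LOCK_BIT 1ULL

static inline uint32_t lv_value(uint64_t lv) { return (uint32_t)(lv >> 32); }
static inline int lv_locked(uint64_t lv) { return (lv & LOCK_BIT) != 0; }

uint32_t read_counter_shared(void)
{
    /* Snapshot lock state and stored value together. */
    uint64_t old = atomic_load(&lockval);

    if (!lv_locked(old)) {
        /* Try to become the one reader for this round. */
        if (atomic_compare_exchange_strong(&lockval, &old, old | LOCK_BIT)) {
            uint32_t v = slow_counter_read();
            /* Publish the fresh value and drop the lock in one store. */
            atomic_store(&lockval, (uint64_t)v << 32);
            return v;
        }
        /* CAS failed: 'old' now holds the word another thread installed. */
    }

    /* Contended path: wait until the stored value changes or the lock
     * is released, then reuse the stored value without a hardware read. */
    uint64_t cur;
    do {
        cur = atomic_load(&lockval);
    } while (lv_value(cur) == lv_value(old) && lv_locked(cur));

    return lv_value(cur);
}
```

Called from a single thread, `read_counter_shared()` always takes the uncontended path and performs one hardware read per call. With many threads calling it at once, only the thread that wins the compare-and-swap touches the slow counter, and the rest return the value it stored.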
..
acpi	Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2016-08-01 14:23:42 -04:00
apic	x86/apic: Do not init irq remapping if ioapic is disabled	2016-08-24 09:45:40 +02:00
cpu	x86/AMD: Apply erratum 665 on machines without a BIOS fix	2016-09-02 20:42:28 +02:00
fpu	x86/mm/pkeys: Fix compact mode by removing protection keys' XSAVE buffer manipulation	2016-08-10 16:12:26 +02:00
kprobes	kprobes/x86: Clear TF bit in fault on single-stepping	2016-06-14 12:00:54 +02:00
.gitignore	…
Makefile	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching	2016-05-17 17:11:27 -07:00
alternative.c	x86/asm: Stop depending on ptrace.h in alternative.h	2016-04-29 11:56:40 +02:00
amd_gart_64.c	dma-mapping: use unsigned long for dma_attrs	2016-08-04 08:50:07 -04:00
amd_nb.c	Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2016-08-01 14:23:42 -04:00
apb_timer.c	x86/apb_timer: Convert to hotplug state machine	2016-07-15 10:40:22 +02:00
aperture_64.c	param: convert some "on"/"off" users to strtobool	2016-03-17 15:09:34 -07:00
apm_32.c	x86/apm32: Remove paravirt_enabled() use	2016-04-22 10:29:03 +02:00
asm-offsets.c	x86/uaccess: Move thread_info::addr_limit to thread_struct	2016-07-15 10:26:30 +02:00
asm-offsets_32.c	x86/entry/32: Simplify and fix up the SYSENTER stack #DB/NMI fixup	2016-03-10 09:48:14 +01:00
asm-offsets_64.c	x86/syscalls: Add syscall entry qualifiers	2016-01-29 09:46:38 +01:00
audit_64.c	…
bootflag.c	x86: don't use module_init for non-modular core bootflag code	2015-06-16 14:12:34 -04:00
check.c	Linux 4.2-rc8	2015-08-25 09:59:19 +02:00
cpuid.c	new helpers: no_seek_end_llseek{,_size}()	2015-12-23 10:41:31 -05:00
crash.c	x86/kernel: Audit and remove any unnecessary uses of module.h	2016-07-14 15:06:41 +02:00
crash_dump_32.c	…
crash_dump_64.c	…
devicetree.c	x86/cpufeature: Replace cpu_has_apic with boot_cpu_has() usage	2016-04-13 11:37:41 +02:00
doublefault.c	…
dumpstack.c	Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2016-07-25 18:18:04 -07:00
dumpstack_32.c	Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2016-08-01 14:23:42 -04:00
dumpstack_64.c	Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2016-08-01 14:23:42 -04:00
e820.c	Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2016-03-15 09:32:27 -07:00
early-quirks.c	Linux 4.7	2016-07-26 17:26:29 +10:00
early_printk.c	x86: Fix misspellings in comments	2016-02-24 08:44:58 +01:00
ebda.c	x86/boot: Simplify EBDA-vs-BIOS reservation logic	2016-07-22 11:46:01 +02:00
espfix_64.c	x86: get rid of superfluous __GFP_REPEAT	2016-06-24 17:23:52 -07:00
ftrace.c	Nothing major this round. Mostly small clean ups and fixes.	2016-03-24 10:52:25 -07:00
head32.c	x86/boot: Run reserve_bios_regions() after we initialize the memory map	2016-08-11 11:14:59 +02:00
head64.c	x86/boot: Run reserve_bios_regions() after we initialize the memory map	2016-08-11 11:14:59 +02:00
head_32.S	Merge branch 'x86/urgent' into x86/asm, to refresh the tree	2016-04-29 11:55:04 +02:00
head_64.S	x86/mm: Enable KASLR for physical mapping memory regions	2016-07-08 17:35:15 +02:00
hpet.c	x86/hpet: Reduce HPET counter read contention	2016-09-09 15:16:19 +02:00
hw_breakpoint.c	x86/kernel: Audit and remove any unnecessary uses of module.h	2016-07-14 15:06:41 +02:00
i386_ksyms_32.c	Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2016-08-01 14:23:42 -04:00
i8237.c	…
i8253.c	x86/kernel: Audit and remove any unnecessary uses of module.h	2016-07-14 15:06:41 +02:00
i8259.c	x86/irq: Probe for PIC presence before allocating descs for legacy IRQs	2015-11-07 10:37:37 +01:00
io_delay.c	x86/kernel: Audit and remove any unnecessary uses of module.h	2016-07-14 15:06:41 +02:00
ioport.c	x86/iopl: Fix iopl capability check on Xen PV	2016-03-17 09:49:27 +01:00
irq.c	x86/irq: Do not substract irq_tlb_count from irq_call_count	2016-08-11 11:14:59 +02:00
irq_32.c	x86/kernel: Audit and remove any unnecessary uses of module.h	2016-07-14 15:06:41 +02:00
irq_64.c	x86/kernel: Audit and remove any unnecessary uses of module.h	2016-07-14 15:06:41 +02:00
irq_work.c	treewide: Remove old email address	2015-11-23 09:44:58 +01:00
irqinit.c	x86/irq: Store irq descriptor in vector array	2015-08-06 00:14:59 +02:00
jump_label.c	x86/asm: Stop depending on ptrace.h in alternative.h	2016-04-29 11:56:40 +02:00
kdebugfs.c	x86/kernel: Audit and remove any unnecessary uses of module.h	2016-07-14 15:06:41 +02:00
kexec-bzimage64.c	KEYS: Generalise system_verify_data() to provide access to internal content	2016-04-06 16:14:24 +01:00
kgdb.c	x86/asm: Stop depending on ptrace.h in alternative.h	2016-04-29 11:56:40 +02:00
ksysfs.c	…
kvm.c	Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2016-08-01 14:23:42 -04:00
kvmclock.c	x86: Fix misspellings in comments	2016-02-24 08:44:58 +01:00
ldt.c	x86/mm: Factor out LDT init from context init	2016-02-18 19:46:31 +01:00
machine_kexec_32.c	…
machine_kexec_64.c	kexec: provide arch_kexec_protect(unprotect)_crashkres()	2016-05-23 17:04:14 -07:00
mcount_64.S	ftrace/x86: Set ftrace_stub to weak to prevent gcc from using short jumps to it	2016-05-20 13:28:40 -04:00
mmconf-fam10h_64.c	…
module.c	x86/asm: Stop depending on ptrace.h in alternative.h	2016-04-29 11:56:40 +02:00
mpparse.c	x86/kernel: Audit and remove any unnecessary uses of module.h	2016-07-14 15:06:41 +02:00
msr.c	x86/cpufeature: Carve out X86_FEATURE_*	2016-01-30 11:22:17 +01:00
nmi.c	x86: include linux/ratelimit.h in nmi.c	2016-06-06 17:10:15 +02:00
nmi_selftest.c	…
paravirt-spinlocks.c	x86/kernel: Audit and remove any unnecessary uses of module.h	2016-07-14 15:06:41 +02:00
paravirt.c	x86/paravirt: Do not trace _paravirt_ident_*() functions	2016-09-02 09:40:47 -07:00
paravirt_patch_32.c	x86/paravirt: Remove the unused irq_enable_sysexit pv op	2015-11-23 10:48:16 +01:00
paravirt_patch_64.c	x86/entry, x86/paravirt: Remove the unused usergs_sysret32 PV op	2015-11-23 10:48:16 +01:00
pci-calgary_64.c	dma-mapping: use unsigned long for dma_attrs	2016-08-04 08:50:07 -04:00
pci-dma.c	dma-mapping: use unsigned long for dma_attrs	2016-08-04 08:50:07 -04:00
pci-iommu_table.c	x86: Fix non-static inlines	2016-04-16 13:21:40 +02:00
pci-nommu.c	dma-mapping: use unsigned long for dma_attrs	2016-08-04 08:50:07 -04:00
pci-swiotlb.c	dma-mapping: use unsigned long for dma_attrs	2016-08-04 08:50:07 -04:00
pcspeaker.c	…
perf_regs.c	perf/x86/64: Report regs_user->ax too in get_regs_user()	2015-04-11 13:08:53 +02:00
platform-quirks.c	x86/boot: Reorganize and clean up the BIOS area reservation code	2016-07-21 10:11:57 +02:00
pmem.c	x86/kernel: Audit and remove any unnecessary uses of module.h	2016-07-14 15:06:41 +02:00
probe_roms.c	…
process.c	Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2016-08-01 14:23:42 -04:00
process_32.c	x86/kernel: Audit and remove any unnecessary uses of module.h	2016-07-14 15:06:41 +02:00
process_64.c	x86/kernel: Audit and remove any unnecessary uses of module.h	2016-07-14 15:06:41 +02:00
ptrace.c	x86/ptrace: Stop setting TS_COMPAT in ptrace code	2016-07-27 11:09:43 +02:00
pvclock.c	pvclock: introduce seqcount-like API	2016-08-04 13:52:21 +02:00
quirks.c	timers/x86/hpet: Type adjustments	2015-10-21 11:17:32 +02:00
reboot.c	Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2016-08-01 14:23:42 -04:00
reboot_fixups_32.c	…
relocate_kernel_32.S	…
relocate_kernel_64.S	x86/asm: Replace "MOVQ $imm, %reg" with MOVL	2015-04-01 13:17:39 +02:00
resource.c	…
rtc.c	char/genrtc: x86: remove remnants of asm/rtc.h	2016-06-04 00:20:07 +02:00
setup.c	x86/boot: Defer setup_real_mode() to early_initcall time	2016-08-11 11:15:00 +02:00
setup_percpu.c	x86/acpi: store ACPI ids from MADT for future usage	2016-07-25 13:30:53 +01:00
signal.c	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2016-08-06 09:04:35 -04:00
signal_compat.c	x86/signals: Add build-time checks to the siginfo compat code	2016-06-14 12:19:24 +02:00
smp.c	x86/smp: Remove single IPI wrapper	2015-11-05 13:07:54 +01:00
smpboot.c	x86/smp: Fix __max_logical_packages value setup	2016-08-18 10:14:48 +02:00
stacktrace.c	x86/kernel: Audit and remove any unnecessary uses of module.h	2016-07-14 15:06:41 +02:00
step.c	Merge branch 'x86/urgent' into x86/asm to fix up conflicts and to pick up fixes	2015-08-18 09:39:47 +02:00
sys_x86_64.c	x86/mm: Improve AMD Bulldozer ASLR workaround	2015-03-31 10:01:17 +02:00
sysfb.c	…
sysfb_efi.c	Merge branch 'linus' into efi/core, to pick up fixes	2016-05-07 07:00:07 +02:00
sysfb_simplefb.c	…
tboot.c	x86/tboot: Convert to hotplug state machine	2016-07-15 10:40:30 +02:00
tce_64.c	x86/cpufeature: Remove cpu_has_clflush	2016-03-31 13:35:09 +02:00
test_nx.c	x86/mm: Always enable CONFIG_DEBUG_RODATA and remove the Kconfig option	2016-02-22 08:51:38 +01:00
test_rodata.c	x86: Don't use module.h just for AUTHOR / LICENSE tags	2016-07-14 13:04:20 +02:00
time.c	…
tls.c	x86/tls: Synchronize segment registers in set_thread_area()	2016-04-29 11:56:42 +02:00
tls.h	…
topology.c	x86: Drop bogus __ref / __refdata annotations	2015-07-20 18:57:20 +02:00
trace_clock.c	x86/asm/tsc: Add rdtsc_ordered() and use it in trivial call sites	2015-07-06 15:23:29 +02:00
tracepoint.c	…
traps.c	x86/kernel: Audit and remove any unnecessary uses of module.h	2016-07-14 15:06:41 +02:00
tsc.c	Merge branch 'linus' into timers/urgent, to pick up fixes	2016-08-10 14:36:23 +02:00
tsc_msr.c	x86/tsc_msr: Remove irqoff around MSR-based TSC enumeration	2016-07-11 21:30:12 +02:00
tsc_sync.c	x86/asm/tsc/sync: Use rdtsc_ordered() in check_tsc_warp() and drop extra barriers	2015-07-06 15:23:29 +02:00
uprobes.c	uprobes/x86: Fix RIP-relative handling of EVEX-encoded instructions	2016-08-12 08:29:24 +02:00
verify_cpu.S	x86/cpufeature: Carve out X86_FEATURE_*	2016-01-30 11:22:17 +01:00
vm86_32.c	x86, bitops: remove use of "sbb" to return CF	2016-06-08 12:41:20 -07:00
vmlinux.lds.S	x86/boot: Move compressed kernel to the end of the decompression buffer	2016-04-29 11:03:29 +02:00
vsmp_64.c	x86: replace __init_or_module with __init in non-modular vsmp_64.c	2015-06-16 14:12:41 -04:00
x86_init.c	Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2016-08-01 14:23:42 -04:00
x8664_ksyms_64.c	Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2016-08-01 14:23:42 -04:00