OpenCloudOS-Kernel/arch/powerpc/kernel
Paul Mackerras 371fefd6f2 KVM: PPC: Allow book3s_hv guests to use SMT processor modes
This lifts the restriction that book3s_hv guests can only run one
hardware thread per core, and allows them to use up to 4 threads
per core on POWER7.  The host still has to run single-threaded.

This capability is advertised to qemu through a new KVM_CAP_PPC_SMT
capability.  The return value of the ioctl querying this capability
is the number of vcpus per virtual CPU core (vcore), currently 4.

To use this, the host kernel should be booted with all threads
active, and then all the secondary threads should be offlined.
This will put the secondary threads into nap mode.  KVM will then
wake them from nap mode and use them for running guest code (while
they are still offline).  To wake the secondary threads, we send
them an IPI using a new xics_wake_cpu() function, implemented in
arch/powerpc/sysdev/xics/icp-native.c.  In other words, at this stage
we assume that the platform has a XICS interrupt controller and
we are using icp-native.c to drive it.  Since the woken thread will
need to acknowledge and clear the IPI, we also export the base
physical address of the XICS registers using kvmppc_set_xics_phys()
for use in the low-level KVM book3s code.

When a vcpu is created, it is assigned to a virtual CPU core.
The vcore number is obtained by dividing the vcpu number by the
number of threads per core in the host.  This number is exported
to userspace via the KVM_CAP_PPC_SMT capability.  If qemu wishes
to run the guest in single-threaded mode, it should make all vcpu
numbers be multiples of the number of threads per core.

We distinguish three states of a vcpu: runnable (i.e., ready to execute
the guest), blocked (that is, idle), and busy in host.  We currently
implement a policy that the vcore can run only when all its threads
are runnable or blocked.  This way, if a vcpu needs to execute elsewhere
in the kernel or in qemu, it can do so without being starved of CPU
by the other vcpus.

When a vcore starts to run, it executes in the context of one of the
vcpu threads.  The other vcpu threads all go to sleep and stay asleep
until something happens requiring the vcpu thread to return to qemu,
or to wake up to run the vcore (this can happen when another vcpu
thread goes from busy in host state to blocked).

It can happen that a vcpu goes from blocked to runnable state (e.g.
because of an interrupt), and the vcore it belongs to is already
running.  In that case it can start to run immediately as long as
the none of the vcpus in the vcore have started to exit the guest.
We send the next free thread in the vcore an IPI to get it to start
to execute the guest.  It synchronizes with the other threads via
the vcore->entry_exit_count field to make sure that it doesn't go
into the guest if the other vcpus are exiting by the time that it
is ready to actually enter the guest.

Note that there is no fixed relationship between the hardware thread
number and the vcpu number.  Hardware threads are assigned to vcpus
as they become runnable, so we will always use the lower-numbered
hardware threads in preference to higher-numbered threads if not all
the vcpus in the vcore are runnable, regardless of which vcpus are
runnable.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:57 +03:00
..
vdso32 Fix common misspellings 2011-03-31 11:26:23 -03:00
vdso64 Fix common misspellings 2011-03-31 11:26:23 -03:00
.gitignore
Makefile powerpc/ftrace: Implement raw syscall tracepoints on PowerPC 2011-05-26 13:38:57 +10:00
align.c powerpc: Remove fpscr use from [kvm_]cvt_{fd,df} 2010-09-02 14:07:32 +10:00
asm-offsets.c KVM: PPC: Allow book3s_hv guests to use SMT processor modes 2011-07-12 13:16:57 +03:00
audit.c
btext.c Fix common misspellings 2011-03-31 11:26:23 -03:00
cacheinfo.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
cacheinfo.h powerpc: Rewrite sysfs processor cache info code 2009-01-08 16:25:10 +11:00
clock.c
compat_audit.c
cpu_setup_6xx.S powerpc: Fix some 6xx/7xxx CPU setup functions 2011-02-07 12:57:11 +11:00
cpu_setup_44x.S powerpc/44x: Add support for the AMCC APM821xx SoC 2010-10-13 08:47:09 -04:00
cpu_setup_a2.S powerpc: Add A2 cpu support 2011-04-27 13:02:02 +10:00
cpu_setup_fsl_booke.S powerpc/e5500: set non-base IVORs 2011-05-19 00:36:43 -05:00
cpu_setup_pa6t.S
cpu_setup_power7.S powerpc: Set up LPCR for running guest partitions 2011-07-12 13:16:52 +03:00
cpu_setup_ppc970.S
cputable.c powerpc/book3e: Fix CPU feature handling on e5500 in 32-bit mode 2011-06-02 15:29:09 -05:00
crash.c powerpc/kdump64: Don't reference freed memory as pacas 2011-05-19 14:30:44 +10:00
crash_dump.c crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn 2011-03-23 19:47:19 -07:00
dbell.c powerpc: Consolidate ipi message mux and demux 2011-05-19 15:31:03 +10:00
dma-iommu.c powerpc/iommu: Use coherent_dma_mask for alloc_coherent 2010-12-09 15:17:50 +11:00
dma-swiotlb.c of: Merge of_platform_bus_type with platform_bus_type 2010-07-24 09:57:51 -06:00
dma.c powerpc: Implement dma_mmap_coherent() 2011-03-30 10:44:00 +11:00
e500-pmu.c perf, arch: Cleanup perf-pmu init vs lockup-detector 2010-11-26 15:14:56 +01:00
entry_32.S powerpc/ppc32/tracing: Add stack frame to calls of trace_hardirqs_on/off 2011-01-21 14:08:33 +11:00
entry_64.S powerpc: Free up some CPU feature bits by moving out MMU-related features 2011-04-27 14:18:52 +10:00
exceptions-64e.S powerpc/e5500: set non-base IVORs 2011-05-19 00:36:43 -05:00
exceptions-64s.S KVM: PPC: Allow book3s_hv guests to use SMT processor modes 2011-07-12 13:16:57 +03:00
firmware.c powerpc: Make powerpc_firmware_features __read_mostly 2010-02-09 13:56:07 +11:00
fpu.S powerpc: Remove second definition of STACK_FRAME_OVERHEAD 2010-11-29 15:48:23 +11:00
fsl_booke_entry_mapping.S powerpc/fsl-booke: Fix address issue when using relocatable kernels 2010-07-11 11:04:08 -05:00
ftrace.c powerpc/ftrace: Implement raw syscall tracepoints on PowerPC 2011-05-26 13:38:57 +10:00
head_8xx.S powerpc: Remove second definition of STACK_FRAME_OVERHEAD 2010-11-29 15:48:23 +11:00
head_32.S powerpc: Remove last piece of GEMINI 2011-05-19 17:32:29 +10:00
head_40x.S Fix common misspellings 2011-03-31 11:26:23 -03:00
head_44x.S Fix common misspellings 2011-03-31 11:26:23 -03:00
head_64.S powerpc: Don't search for paca in freed memory 2011-05-19 14:30:43 +10:00
head_booke.h powerpc/booke: Add Stack Marking support to Booke Exception Prolog 2010-05-05 08:01:52 -04:00
head_fsl_booke.S powerpc/e500: SPE register saving: take arbitrary struct offset 2011-07-12 13:16:31 +03:00
hw_breakpoint.c powerpc, hw_breakpoint: Tell generic code we have no instruction breakpoints 2010-06-30 13:54:58 +10:00
ibmebus.c PM / Hibernate: Introduce CONFIG_HIBERNATE_CALLBACKS 2011-04-11 22:54:42 +02:00
idle.c powerpc: Re-enable preemption before cpu_die() 2010-08-24 15:26:29 +10:00
idle_6xx.S
idle_book3e.S powerpc/book3e: Add generic 64-bit idle powersave support 2010-07-14 14:13:18 +10:00
idle_e500.S
idle_power4.S powerpc/pmac/smp: Properly NAP offlined CPU on G5 2011-04-01 15:37:25 +11:00
idle_power7.S KVM: PPC: Allow book3s_hv guests to use SMT processor modes 2011-07-12 13:16:57 +03:00
init_task.c Use new __init_task_data macro in arch init_task.c files. 2009-09-21 06:27:08 +02:00
io-workarounds.c powerpc/pci: Properly initialize IO workaround "private" 2011-04-27 14:18:33 +10:00
io.c powerpc: tiny memcpy_(to|from)io optimisation 2009-11-04 16:43:12 -07:00
iomap.c
iommu.c powerpc: iommu: Add device name to iommu error printks 2010-12-09 15:35:32 +11:00
irq.c powerpc: Fix irq_free_virt by adjusting bounds before loop 2011-05-26 13:38:59 +10:00
isa-bridge.c
kgdb.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2011-05-23 09:12:26 -07:00
kprobes.c powerpc/kprobes: Remove resume_execution() in kprobes 2010-06-02 17:50:37 +10:00
kvm.c PPC: KVM: Book E doesn't have __end_interrupts. 2010-11-05 14:42:27 -02:00
kvm_emul.S KVM: PPC: Make PV mtmsrd L=1 work with r30 and r31 2010-10-24 10:52:14 +02:00
l2cr_6xx.S Fix common misspellings 2011-03-31 11:26:23 -03:00
legacy_serial.c powerpc: Check device status before adding serial device 2011-04-12 06:29:21 -05:00
lparcfg.c powerpc/pseries: Add page coalescing support 2011-05-04 16:02:21 +10:00
machine_kexec.c powerpc: Convert to new irq_* function names 2011-03-29 14:48:12 +02:00
machine_kexec_32.c powerpc/kexec: make masking/disabling interrupts generic 2010-10-14 00:52:46 -05:00
machine_kexec_64.c powerpc/kexec: Fix orphaned offline CPUs across kexec 2010-07-31 15:05:22 +10:00
misc.S powerpc: Remove unneeded cpu_setup/restore from POWER7 cputable entry 2010-11-29 15:48:22 +11:00
misc_32.S powerpc: Fix 32-bit SMP build 2011-05-20 16:23:19 -07:00
misc_64.S powerpc/kexec: Fix memory corruption from unallocated slaves 2011-05-19 14:30:43 +10:00
module.c powerpc: remove unused variable 2010-10-05 17:27:54 -07:00
module_32.c powerpc/ppc32: ftrace, dynamic ftrace to handle modules 2008-11-20 10:52:53 -08:00
module_64.c powerpc: Unify opcode definitions and support 2009-02-23 10:48:56 +11:00
mpc7450-pmu.c perf, arch: Cleanup perf-pmu init vs lockup-detector 2010-11-26 15:14:56 +01:00
msi.c powerpc/PCI: include pci.h in powerpc MSI implementation 2009-03-25 08:54:29 -07:00
nvram_64.c powerpc/nvram: Generalize code for OS partitions in NVRAM 2011-03-04 18:19:04 +11:00
of_platform.c dt/powerpc: Eliminate users of of_platform_{,un}register_driver 2011-02-28 01:36:39 -07:00
paca.c powerpc: Use nr_cpu_ids in initial paca allocation 2011-05-19 14:30:44 +10:00
pci-common.c powerpc: Convert to new irq_* function names 2011-03-29 14:48:12 +02:00
pci_32.c powerpc/pci: Make both ppc32 and ppc64 use sysdata for pci_controller 2011-02-04 11:46:51 -07:00
pci_64.c powerpc/pci: Make both ppc32 and ppc64 use sysdata for pci_controller 2011-02-04 11:46:51 -07:00
pci_dn.c powerpc: Remove alloc_maybe_bootmem for zalloc version 2011-05-19 15:30:57 +10:00
pci_of_scan.c powerpc/pci: Make both ppc32 and ppc64 use sysdata for pci_controller 2011-02-04 11:46:51 -07:00
perf_callchain.c perf: Factorize callchain context handling 2010-08-19 01:32:11 +02:00
perf_event.c powerpc/perf_event: Skip updating kernel counters if register value shrinks 2011-04-18 13:08:23 +10:00
perf_event_fsl_emb.c powerpc, perf: Fix frequency calculation for overflowing counters (FSL version) 2011-01-19 20:05:42 +01:00
pmc.c powerpc: Convert pmc_owner_lock to raw_spinlock 2010-02-19 14:52:33 +11:00
power4-pmu.c perf, arch: Cleanup perf-pmu init vs lockup-detector 2010-11-26 15:14:56 +01:00
power5+-pmu.c perf, arch: Cleanup perf-pmu init vs lockup-detector 2010-11-26 15:14:56 +01:00
power5-pmu.c perf, arch: Cleanup perf-pmu init vs lockup-detector 2010-11-26 15:14:56 +01:00
power6-pmu.c perf, arch: Cleanup perf-pmu init vs lockup-detector 2010-11-26 15:14:56 +01:00
power7-pmu.c perf, arch: Cleanup perf-pmu init vs lockup-detector 2010-11-26 15:14:56 +01:00
ppc32.h
ppc970-pmu.c perf, arch: Cleanup perf-pmu init vs lockup-detector 2010-11-26 15:14:56 +01:00
ppc_ksyms.c powerpc: Simplify 4k/64k copy_page logic 2011-05-19 14:30:42 +10:00
ppc_save_regs.S Fix common misspellings 2011-03-31 11:26:23 -03:00
proc_powerpc.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
process.c KVM: PPC: Add support for Book3S processors in hypervisor mode 2011-07-12 13:16:54 +03:00
prom.c powerpc: Force page alignment for initrd reserved memory 2011-06-09 16:52:38 +10:00
prom_init.c powerpc/pseries: Add page coalescing support 2011-05-04 16:02:21 +10:00
prom_init_check.sh powerpc: Fix compile errors in prom_init_check for gcc 4.5 2010-07-08 18:11:39 +10:00
prom_parse.c of/pci: move of_irq_map_pci() into generic code 2011-02-04 11:46:50 -07:00
ptrace.c powerpc/ftrace: Implement raw syscall tracepoints on PowerPC 2011-05-26 13:38:57 +10:00
ptrace32.c powerpc: Update compat_arch_ptrace 2010-12-09 15:35:32 +11:00
reloc_64.S
rtas-proc.c powerpc: Move /proc/ppc64 to /proc/powerpc update 2010-01-15 13:26:17 +11:00
rtas-rtc.c powerpc/rtas-rtc: remove sideeffects of printk_ratelimit 2011-06-29 15:30:43 +10:00
rtas.c powerpc/pseries: Add page coalescing support 2011-05-04 16:02:21 +10:00
rtas_flash.c powerpc/rtas_flash: Use simple_read_from_buffer 2011-01-21 14:08:34 +11:00
rtas_pci.c powerpc/pci: Clean up direct access to sysdata by RTAS 2009-05-21 15:44:23 +10:00
rtasd.c Fix common misspellings 2011-03-31 11:26:23 -03:00
setup-common.c KVM: PPC: Add support for Book3S processors in hypervisor mode 2011-07-12 13:16:54 +03:00
setup.h
setup_32.c powerpc: Properly handshake CPUs going out of boot spin loop 2011-04-20 11:03:24 +10:00
setup_64.c powerpc/fsl-booke64: Add support for Debug Level exception handler 2011-05-19 00:36:42 -05:00
signal.c powerpc: fix double syscall restarts 2010-09-22 09:33:50 -07:00
signal.h powerpc: Sanitize stack pointer in signal handling code 2009-03-27 16:58:24 +11:00
signal_32.c arch/powerpc: use printk_ratelimited instead of printk_ratelimit 2011-06-29 15:31:01 +10:00
signal_64.c arch/powerpc: use printk_ratelimited instead of printk_ratelimit 2011-06-29 15:31:01 +10:00
smp-tbsync.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
smp.c KVM: PPC: Add support for Book3S processors in hypervisor mode 2011-07-12 13:16:54 +03:00
softemu8xx.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
stacktrace.c
suspend.c update email address 2010-07-19 10:56:54 +02:00
swsusp.c PM / Hibernate: Remove arch_prepare_suspend() 2011-05-24 23:35:55 +02:00
swsusp_32.S Fix common misspellings 2011-03-31 11:26:23 -03:00
swsusp_64.c
swsusp_asm64.S
swsusp_booke.S powerpc/fsl-booke: Add hibernation support for FSL BookE processors 2010-05-21 07:41:53 -05:00
sys_ppc32.c BKL: remove extraneous #include <smp_lock.h> 2010-11-17 08:59:32 -08:00
syscalls.c Add generic sys_olduname() 2010-03-12 15:52:32 -08:00
sysfs.c powerpc: Per process DSCR + some fixes (try#4) 2011-04-27 14:18:19 +10:00
systbl.S
systbl_chk.c
systbl_chk.sh
tau_6xx.c tree-wide: fix assorted typos all over the place 2009-12-04 15:39:55 +01:00
time.c powerpc: Fix oops if scan_dispatch_log is called too early 2011-04-18 13:08:19 +10:00
traps.c powerpc/e500: Save SPEFCSR in flush_spe_to_thread() 2011-07-12 13:16:30 +03:00
udbg.c powerpc: Add early debug for WSP platforms 2011-05-06 13:32:41 +10:00
udbg_16550.c powerpc: Add early debug for WSP platforms 2011-05-06 13:32:41 +10:00
vdso.c mm: arch: rename in_gate_area_no_task to in_gate_area_no_mm 2011-03-23 16:36:55 -04:00
vecemu.c
vector.S powerpc: Remove static branch hint in giveup_altivec 2011-05-19 14:30:42 +10:00
vio.c powerpc/iommu: Use coherent_dma_mask for alloc_coherent 2010-12-09 15:17:50 +11:00
vmlinux.lds.S percpu: Always align percpu output section to PAGE_SIZE 2011-03-24 18:50:09 +01:00