linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Linus Torvalds	0988c37c24	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: fix crash due to missing debugctlmsr on AMD K6-3 x86: add PTE_FLAGS_MASK x86: rename PTE_MASK to PTE_PFN_MASK x86: fix pte_flags() to only return flags, fix lguest (updated) x86: use setup_clear_cpu_cap with disable_apic, fix x86: move the last Dprintk instance to pr_debug()	2008-07-22 13:40:24 -07:00
Jan Kratochvil	d536b1f865	x86: fix crash due to missing debugctlmsr on AMD K6-3 currently if you use PTRACE_SINGLEBLOCK on AMD K6-3 (i586) it will crash. Kernel now wrongly assumes existing DEBUGCTLMSR MSR register there. Removed the assumption also for some other non-K6 CPUs but I am not sure there (but it can only bring small inefficiency there if my assumption is wrong). Based on info from Roland McGrath, Chuck Ebbert and Mikulas Patocka. More info at: https://bugzilla.redhat.com/show_bug.cgi?id=456175 Signed-off-by: Jan Kratochvil <jan.kratochvil@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-22 14:16:37 +02:00
Jeremy Fitzhardinge	77be1fabd0	x86: add PTE_FLAGS_MASK PTE_PFN_MASK was getting lonely, so I made it a friend. Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-22 10:43:45 +02:00
Jeremy Fitzhardinge	59438c9fc4	x86: rename PTE_MASK to PTE_PFN_MASK Rusty, in his peevish way, complained that macros defining constants should have a name which somewhat accurately reflects the actual purpose of the constant. Aside from the fact that PTE_MASK gives no clue as to what's actually being masked, and is misleadingly similar to the functionally entirely different PMD_MASK, PUD_MASK and PGD_MASK, I don't really see what the problem is. But if this patch silences the incessent noise, then it will have achieved its goal (TODO: write test-case). Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-22 10:43:44 +02:00
Rusty Russell	c2e3277f87	x86: fix pte_flags() to only return flags, fix lguest (updated) (Jeremy said: rusty: use PTE_MASK rusty: use PTE_MASK rusty: use PTE_MASK When I asked: jsgf: does that include the NX flag? He responded eloquently: rusty: use PTE_MASK rusty: use PTE_MASK yes, it's the official constant of masking flags out of ptes ) Change `a15af1c9ea` 'x86/paravirt: add pte_flags to just get pte flags' removed lguest's private pte_flags() in favor of a generic one. Unfortunately, the generic one doesn't filter out the non-flags bits: this results in lguest creating corrupt shadow page tables and blowing up host memory. Since noone is supposed to use the pfn part of pte_flags(), it seems safest to always do the filtering. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-and-morning-tea-spilled-by: Ingo Molnar <mingo@elte.hu>	2008-07-22 10:41:18 +02:00
Yinghai Lu	988781dc3e	x86: use setup_clear_cpu_cap with disable_apic, fix beauty fix: /proc/cpuinfo will still show apic feature even if we booted up with it disabled. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-22 09:05:45 +02:00
Andi Kleen	d95d62c018	sysdev: Convert the x86 mce tolerant sysdev attribute to generic attribute Use the new generic int attribute accessors for the x86 mce tolerant attribute. Simple example to illustrate the new macros. There are much more places all over the tree that could be converted like this. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:55:02 -07:00
Andi Kleen	4a0b2b4dbe	sysdev: Pass the attribute to the low level sysdev show/store function This allow to dynamically generate attributes and share show/store functions between attributes. Right now most attributes are generated by special macros and lots of duplicated code. With the attribute passed it's instead possible to attach some data to the attribute and then use that in shared low level functions to do different things. I need this for the dynamically generated bank attributes in the x86 machine check code, but it'll allow some further cleanups. I converted all users in tree to the new show/store prototype. It's a single huge patch to avoid unbisectable sections. Runtime tested: x86-32, x86-64 Compiled only: ia64, powerpc Not compile tested/only grep converted: sh, arm, avr32 Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:55:02 -07:00
Greg Kroah-Hartman	fc3a8828b1	driver core: fix a lot of printk usages of bus_id We have the dev_printk() variants for this kind of thing, use them instead of directly trying to access the bus_id field of struct device. This is done in order to remove bus_id entirely. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:53 -07:00
Greg Kroah-Hartman	3bfd49c8ab	device create: x86: convert device_create to device_create_drvdata device_create() is race-prone, so use the race-free device_create_drvdata() instead as device_create() is going away. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-07-21 21:54:46 -07:00
Linus Torvalds	6d52dcbe56	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] cpufreq: remove CVS keywords [CPUFREQ] change cpu freq arrays to per_cpu variables	2008-07-21 15:10:37 -07:00
Linus Torvalds	f2d0f1dea4	x86: Fix help message for STRICT_DEVMEM config option The message talked about "left on" when it meant to say disabled. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-21 13:04:08 -07:00
Thomas Gleixner	5171c3047d	x86: move the last Dprintk instance to pr_debug() Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-07-21 21:58:34 +02:00
Thomas Gleixner	cfc1b9a6a6	x86: convert Dprintk to pr_debug There are a couple of places where (P)Dprintk is used which is an old compile time enabled printk wrapper. Convert it to the generic pr_debug(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-07-21 21:35:38 +02:00
Ingo Molnar	eb6a12c242	Merge branch 'linus' into cpus4096-for-linus Conflicts: net/sunrpc/svc.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-21 17:19:50 +02:00
Ingo Molnar	2e2dcc7631	Merge branch 'x86/paravirt-spinlocks' into x86/for-linus	2008-07-21 16:45:56 +02:00
Ingo Molnar	acee709cab	Merge branches 'x86/urgent', 'x86/amd-iommu', 'x86/apic', 'x86/cleanups', 'x86/core', 'x86/cpu', 'x86/fixmap', 'x86/gart', 'x86/kprobes', 'x86/memtest', 'x86/modules', 'x86/nmi', 'x86/pat', 'x86/reboot', 'x86/setup', 'x86/step', 'x86/unify-pci', 'x86/uv', 'x86/xen' and 'xen-64bit' into x86/for-linus	2008-07-21 16:37:17 +02:00
Ingo Molnar	e66d90fb4a	Merge branch 'linus' into xen-64bit	2008-07-21 15:06:09 +02:00
Ingo Molnar	1c29dd9a9e	Merge branch 'linus' into x86/paravirt-spinlocks	2008-07-21 15:05:58 +02:00
Yinghai Lu	7edf8891ad	x86: remove extra calling to get ext cpuid level Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-21 13:03:13 +02:00
Yinghai Lu	9175fc06ae	x86: use setup_clear_cpu_cap() when disabling the lapic ... so don't need to call clear_cpu_cap again in early_identify_cpu, and could use cleared_cpu_caps like other places. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-21 13:03:12 +02:00
Ingo Molnar	e27772b48d	Merge branch 'linus' into x86/urgent	2008-07-21 11:02:45 +02:00
Avi Kivity	722c05f219	KVM: MMU: Fix potential race setting upper shadow ptes on nonpae hosts The direct mapped shadow code (used for real mode and two dimensional paging) sets upper-level ptes using direct assignment rather than calling set_shadow_pte(). A nonpae host will split this into two writes, which opens up a race if another vcpu accesses the same memory area. Fix by calling set_shadow_pte() instead of assigning directly. Noticed by Izik Eidus. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:40 +03:00
Glauber Costa	2a7c5b8b55	KVM: x86 emulator: emulate clflush If the guest issues a clflush in a mmio address, the instruction can trap into the hypervisor. Currently, we do not decode clflush properly, causing the guest to hang. This patch fixes this emulating clflush (opcode 0f ae). Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:40 +03:00
Marcelo Tosatti	376c53c2b3	KVM: MMU: improve invalid shadow root page handling Harden kvm_mmu_zap_page() against invalid root pages that had been shadowed from memslots that are gone. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:40 +03:00
Marcelo Tosatti	34d4cb8fca	KVM: MMU: nuke shadowed pgtable pages and ptes on memslot destruction Flush the shadow mmu before removing regions to avoid stale entries. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:40 +03:00
Avi Kivity	d6e88aec07	KVM: Prefix some x86 low level function with kvm_, to avoid namespace issues Fixes compilation with CONFIG_VMI enabled. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:39 +03:00
Ben-Ami Yassour	c65bbfa1d6	KVM: check injected pic irq within valid pic irqs Check that an injected pic irq is between 0 and 15. Signed-off-by: Ben-Ami Yassour <benami@il.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:39 +03:00
Mohammed Gamal	19fdfa0d13	KVM: x86 emulator: Fix HLT instruction This patch fixes issue encountered with HLT instruction under FreeDOS's HIMEM XMS Driver. The HLT instruction jumped directly to the done label and skips updating the EIP value, therefore causing the guest to spin endlessly on the same instruction. The patch changes the instruction so that it writes back the updated EIP value. Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:38 +03:00
Avi Kivity	ac9f6dc0db	KVM: Apply the kernel sigmask to vcpus blocked due to being uninitialized Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:38 +03:00
Sheng Yang	4e1096d27f	KVM: VMX: Add ept_sync_context in flush_tlb Fix a potention issue caused by kvm_mmu_slot_remove_write_access(). The old behavior don't sync EPT TLB with modified EPT entry, which result in inconsistent content of EPT TLB and EPT table. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:38 +03:00
Marcelo Tosatti	5a4c928804	KVM: mmu_shrink: kvm_mmu_zap_page requires slots_lock to be held kvm_mmu_zap_page() needs slots lock held (rmap_remove->gfn_to_memslot, for example). Since kvm_lock spinlock is held in mmu_shrink(), do a non-blocking down_read_trylock(). Untested. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:38 +03:00
Adrian Bunk	7e37c2998a	x86: KVM guest: make kvm_smp_prepare_boot_cpu() static This patch makes the needlessly global kvm_smp_prepare_boot_cpu() static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:37 +03:00
Joerg Roedel	0da1db75a2	KVM: SVM: fix suspend/resume support On suspend the svm_hardware_disable function is called which frees all svm_data variables. On resume they are not re-allocated. This patch removes the deallocation of svm_data from the hardware_disable function to the hardware_unsetup function which is not called on suspend. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:37 +03:00
Marcelo Tosatti	f8b78fa3d4	KVM: move slots_lock acquision down to vapic_exit There is no need to grab slots_lock if the vapic_page will not be touched. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:36 +03:00
Chris Lalancette	efa67e0d1f	KVM: VMX: Fake emulate Intel perfctr MSRs Older linux guests (in this case, 2.6.9) can attempt to access the performance counter MSRs without a fixup section, and injecting a GPF kills the guest. Work around by allowing the guest to write those MSRs. Tested by me on RHEL-4 i386 and x86_64 guests, as well as F-9 guests. Signed-off-by: Chris Lalancette <clalance@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:36 +03:00
Sheng Yang	65267ea1b3	KVM: VMX: Fix a wrong usage of vmcs_config The function ept_update_paging_mode_cr0() write to CPU_BASED_VM_EXEC_CONTROL based on vmcs_config.cpu_based_exec_ctrl. That's wrong because the variable may not consistent with the content in the CPU_BASE_VM_EXEC_CONTROL MSR. Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:36 +03:00
Avi Kivity	db475c39ec	KVM: MMU: Fix printk format Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:35 +03:00
Avi Kivity	6ada8cca79	KVM: MMU: When debug is enabled, make it a run-time parameter Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:35 +03:00
Avi Kivity	7a5b56dfd3	KVM: x86 emulator: lazily evaluate segment registers Instead of prefetching all segment bases before emulation, read them at the last moment. Since most of them are unneeded, we save some cycles on Intel machines where this is a bit expensive. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:35 +03:00
Avi Kivity	0adc8675d6	KVM: x86 emulator: avoid segment base adjust for lea Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:34 +03:00
Avi Kivity	f5b4edcd52	KVM: x86 emulator: simplify rip relative decoding rip relative decoding is relative to the instruction pointer of the next instruction; by moving address adjustment until after decoding is complete, we remove the need to determine the instruction size. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:34 +03:00
Avi Kivity	84411d85da	KVM: x86 emulator: simplify r/m decoding Consolidate the duplicated code when not in any special case. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:33 +03:00
Avi Kivity	dc71d0f162	KVM: x86 emulator: simplify sib decoding Instead of using sparse switches, use simpler if/else sequences. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:33 +03:00
Avi Kivity	8684c0af0b	KVM: x86 emulator: handle undecoded rex.b with r/m = 5 in certain cases x86_64 does not decode rex.b in certain cases, where the r/m field = 5. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:33 +03:00
Mohammed Gamal	b13354f8f0	KVM: x86 emulator: emulate nop and xchg reg, acc (opcodes 0x90 - 0x97) Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:33 +03:00
Avi Kivity	f76c710d75	KVM: Use printk_rlimit() instead of reporting emulation failures just once Emulation failure reports are useful, so allow more than one per the lifetime of the module. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:32 +03:00
Glauber Costa	25be46080f	KVM: Do not calculate linear rip in emulation failure report If we're not gonna do anything (case in which failure is already reported), we do not need to even bother with calculating the linear rip. Signed-off-by: Glauber Costa <gcosta@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:32 +03:00
Marcelo Tosatti	622395a9e6	KVM: only abort guest entry if timer count goes from 0->1 Only abort guest entry if the timer count went from 0->1, since for 1->2 or larger the bit will either be set already or a timer irq will have been injected. Using atomic_inc_and_test() for it also introduces an SMP barrier to the LAPIC version (thought it was unecessary because of timer migration, but guest can be scheduled to a different pCPU between exit and kvm_vcpu_block(), so there is the possibility for a race). Noticed by Avi. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:32 +03:00
Laurent Vivier	542472b53e	KVM: Add coalesced MMIO support (x86 part) This patch enables coalesced MMIO for x86 architecture. It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO. It enables the compilation of coalesced_mmio.c. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:31 +03:00
Laurent Vivier	92760499d0	KVM: kvm_io_device: extend in_range() to manage len and write attribute Modify member in_range() of structure kvm_io_device to pass length and the type of the I/O (write or read). This modification allows to use kvm_io_device with coalesced MMIO. Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:30 +03:00
Avi Kivity	131d82791b	KVM: MMU: Avoid page prefetch on SVM SVM cannot benefit from page prefetching since guest page fault bypass cannot by made to work there. Avoid accessing the guest page table in this case. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:30 +03:00
Avi Kivity	d761a501cf	KVM: MMU: Move nonpaging_prefetch_page() In preparation for next patch. No code change. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:30 +03:00
Avi Kivity	91ed7a0e15	KVM: x86 emulator: implement 'push imm' (opcode 0x68) Encountered in FC6 boot sequence, now that we don't force ss.rpl = 0 during the protected mode transition. Not really necessary, but nice to have. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:29 +03:00
Avi Kivity	19e43636b5	KVM: x86 emulator: simplify push imm8 emulation Instead of fetching the data explicitly, use SrcImmByte. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:29 +03:00
Avi Kivity	eab9f71feb	KVM: MMU: Optimize prefetch_page() Instead of reading each pte individually, read 256 bytes worth of ptes and batch process them. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:28 +03:00
Guillaume Thouvenin	38d5bc6d50	KVM: x86 emulator: Add support for mov r, sreg (0x8c) instruction Add support for mov r, sreg (0x8c) instruction Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net> Signed-off-by: Laurent Vivier <laurent.vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:28 +03:00
Guillaume Thouvenin	4257198ae2	KVM: x86 emulator: Add support for mov seg, r (0x8e) instruction Add support for mov r, sreg (0x8c) instruction. [avi: drop the sreg decoding table in favor of 1:1 encoding] Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net> Signed-off-by: Laurent Vivier <laurent.vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:28 +03:00
Guillaume Thouvenin	615ac12561	KVM: x86 emulator: adds support to mov r,imm (opcode 0xb8) instruction Add support to mov r, imm (0xb8) instruction. Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net> Signed-off-by: Laurent Vivier <laurent.vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:27 +03:00
Guillaume Thouvenin	954cd36f76	KVM: x86 emulator: add support for jmp far 0xea Add support for jmp far (opcode 0xea) instruction. Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net> Signed-off-by: Laurent Vivier <laurent.vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:27 +03:00
Guillaume Thouvenin	89c696383d	KVM: x86 emulator: Update c->dst.bytes in decode instruction Update c->dst.bytes in decode instruction instead of instruction itself. It's needed because if c->dst.bytes is equal to 0, the instruction is not emulated. Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net> Signed-off-by: Laurent Vivier <laurent.vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:27 +03:00
Guillaume Thouvenin	3e6e0aab1b	KVM: Prefixes segment functions that will be exported with "kvm_" Prefixes functions that will be exported with kvm_. We also prefixed set_segment() even if it still static to be coherent. signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net> Signed-off-by: Laurent Vivier <laurent.vivier@bull.net> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:27 +03:00
Avi Kivity	9ba075a664	KVM: MTRR support Add emulation for the memory type range registers, needed by VMware esx 3.5, and by pci device assignment. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:26 +03:00
Sheng Yang	f08864b42a	KVM: VMX: Enable NMI with in-kernel irqchip Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:26 +03:00
Sheng Yang	3419ffc8e4	KVM: IOAPIC/LAPIC: Enable NMI support [avi: fix ia64 build breakage] Signed-off-by: Sheng Yang <sheng.yang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:25 +03:00
Avi Kivity	50d40d7fb9	KVM: Remove unnecessary ->decache_regs() call Since we aren't modifying any register, there's no need to decache the register state. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:25 +03:00
Avi Kivity	7cc8883074	KVM: Remove decache_vcpus_on_cpu() and related callbacks Obsoleted by the vmx-specific per-cpu list. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:25 +03:00
Avi Kivity	543e424366	KVM: VMX: Add list of potentially locally cached vcpus VMX hardware can cache the contents of a vcpu's vmcs. This cache needs to be flushed when migrating a vcpu to another cpu, or (which is the case that interests us here) when disabling hardware virtualization on a cpu. The current implementation of decaching iterates over the list of all vcpus, picks the ones that are potentially cached on the cpu that is being offlined, and flushes the cache. The problem is that it uses mutex_trylock() to gain exclusive access to the vcpu, which fires off a (benign) warning about using the mutex in an interrupt context. To avoid this, and to make things generally nicer, add a new per-cpu list of potentially cached vcus. This makes the decaching code much simpler. The list is vmx-specific since other hardware doesn't have this issue. [andrea: fix crash on suspend/resume] Signed-off-by: Andrea Arcangeli <andrea@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:42:24 +03:00
Avi Kivity	4ecac3fd6d	KVM: Handle virtualization instruction #UD faults during reboot KVM turns off hardware virtualization extensions during reboot, in order to disassociate the memory used by the virtualization extensions from the processor, and in order to have the system in a consistent state. Unfortunately virtual machines may still be running while this goes on, and once virtualization extensions are turned off, any virtulization instruction will #UD on execution. Fix by adding an exception handler to virtualization instructions; if we get an exception during reboot, we simply spin waiting for the reset to complete. If it's a true exception, BUG() so we can have our stack trace. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:41:43 +03:00
Avi Kivity	1b7fcd3263	KVM: MMU: Fix false flooding when a pte points to page table The KVM MMU tries to detect when a speculative pte update is not actually used by demand fault, by checking the accessed bit of the shadow pte. If the shadow pte has not been accessed, we deem that page table flooded and remove the shadow page table, allowing further pte updates to proceed without emulation. However, if the pte itself points at a page table and only used for write operations, the accessed bit will never be set since all access will happen through the emulator. This is exactly what happens with kscand on old (2.4.x) HIGHMEM kernels. The kernel points a kmap_atomic() pte at a page table, and then proceeds with read-modify-write operations to look at the dirty and accessed bits. We get a false flood trigger on the kmap ptes, which results in the mmu spending all its time setting up and tearing down shadows. Fix by setting the shadow accessed bit on emulated accesses. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:40:50 +03:00
Avi Kivity	7682f2d0dd	KVM: VMX: Trivial vmcs_write64() code simplification Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:40:50 +03:00
Chris Lalancette	14ae51b6c0	KVM: SVM: Fake MSR_K7 performance counters Attached is a patch that fixes a guest crash when booting older Linux kernels. The problem stems from the fact that we are currently emulating MSR_K7_EVNTSEL[0-3], but not emulating MSR_K7_PERFCTR[0-3]. Because of this, setup_k7_watchdog() in the Linux kernel receives a GPF when it attempts to write into MSR_K7_PERFCTR, which causes an OOPs. The patch fixes it by just "fake" emulating the appropriate MSRs, throwing away the data in the process. This causes the NMI watchdog to not actually work, but it's not such a big deal in a virtualized environment. When we get a write to one of these counters, we printk_ratelimit() a warning. I decided to print it out for all writes, even if the data is 0; it doesn't seem to make sense to me to special case when data == 0. Tested by myself on a RHEL-4 guest, and Joerg Roedel on a Windows XP 64-bit guest. Signed-off-by: Chris Lalancette <clalance@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:40:49 +03:00
Aurelien Jarno	f697554515	KVM: PIT: support mode 3 The in-kernel PIT emulation ignores pending timers if operating under mode 3, which for example Hurd uses. This mode should output a square wave, high for (N+1)/2 counts and low for (N-1)/2 counts. As we only care about the resulting interrupts, the period is N, and mode 3 is the same as mode 2 with regard to interrupts. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:40:49 +03:00
Joerg Roedel	d2ebb4103f	KVM: SVM: add tracing support for TDP page faults To distinguish between real page faults and nested page faults they should be traced as different events. This is implemented by this patch. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:40:48 +03:00
Joerg Roedel	af9ca2d703	KVM: SVM: add missing kvmtrace markers This patch adds the missing kvmtrace markers to the svm module of kvm. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:40:48 +03:00
Joerg Roedel	54e445ca84	KVM: add missing kvmtrace bits This patch adds some kvmtrace bits to the generic x86 code where it is instrumented from SVM. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:40:48 +03:00
Joerg Roedel	a069805579	KVM: SVM: implement dedicated INTR exit handler With an exit handler for INTR intercepts its possible to account them using kvmtrace. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:40:47 +03:00
Joerg Roedel	c47f098d69	KVM: SVM: implement dedicated NMI exit handler With an exit handler for NMI intercepts its possible to account them using kvmtrace. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:40:47 +03:00
Joerg Roedel	c7bf23babc	KVM: VMX: move APIC_ACCESS trace entry to generic code This patch moves the trace entry for APIC accesses from the VMX code to the generic lapic code. This way APIC accesses from SVM will also be traced. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:40:47 +03:00
Harvey Harrison	8b2cf73cc1	KVM: add statics were possible, function definition in lapic.h Noticed by sparse: arch/x86/kvm/vmx.c:1583:6: warning: symbol 'vmx_disable_intercept_for_msr' was not declared. Should it be static? arch/x86/kvm/x86.c:3406:5: warning: symbol 'kvm_task_switch_16' was not declared. Should it be static? arch/x86/kvm/x86.c:3429:5: warning: symbol 'kvm_task_switch_32' was not declared. Should it be static? arch/x86/kvm/mmu.c:1968:6: warning: symbol 'kvm_mmu_remove_one_alloc_mmu_page' was not declared. Should it be static? arch/x86/kvm/mmu.c:2014:6: warning: symbol 'mmu_destroy_caches' was not declared. Should it be static? arch/x86/kvm/lapic.c:862:5: warning: symbol 'kvm_lapic_get_base' was not declared. Should it be static? arch/x86/kvm/i8254.c:94:5: warning: symbol 'pit_get_gate' was not declared. Should it be static? arch/x86/kvm/i8254.c:196:5: warning: symbol '__pit_timer_fn' was not declared. Should it be static? arch/x86/kvm/i8254.c:561:6: warning: symbol '__inject_pit_timer_intr' was not declared. Should it be static? Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-07-20 12:40:46 +03:00
Peter Zijlstra	31656519e1	sched, x86: clean up hrtick implementation random uvesafb failures were reported against Gentoo: http://bugs.gentoo.org/show_bug.cgi?id=222799 and Mihai Moldovan bisected it back to: > `8f4d37ec07` is first bad commit > commit `8f4d37ec07` > Author: Peter Zijlstra <a.p.zijlstra@chello.nl> > Date: Fri Jan 25 21:08:29 2008 +0100 > > sched: high-res preemption tick Linus suspected it to be hrtick + vm86 interaction and observed: > Btw, Peter, Ingo: I think that commit is doing bad things. They aren't > _incorrect_ per se, but they are definitely bad. > > Why? > > Using random _TIF_WORK_MASK flags is really impolite for doing > "scheduling" work. There's a reason that arch/x86/kernel/entry_32.S > special-cases the _TIF_NEED_RESCHED flag: we don't want to exit out of > vm86 mode unnecessarily. > > See the "work_notifysig_v86" label, and how it does that > "save_v86_state()" thing etc etc. Right, I never liked having to fiddle with those TIF flags. Initially I needed it because the hrtimer base lock could not nest in the rq lock. That however is fixed these days. Currently the only reason left to fiddle with the TIF flags is remote wakeups. We cannot program a remote cpu's hrtimer. I've been thinking about using the new and improved IPI function call stuff to implement hrtimer_start_on(). However that does require that smp_call_function_single(.wait=0) works from interrupt context - /me looks at the latest series from Jens - Yes that does seem to be supported, good. Here's a stab at cleaning this stuff up ... Mihai reported test success as well. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Mihai Moldovan <ionic@ionic.de> Cc: Michal Januszewski <spock@gentoo.org> Cc: Antonino Daplas <adaplas@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-20 10:37:28 +02:00
Mike Travis	c4762aba0b	NR_CPUS: Replace NR_CPUS in speedstep-centrino.c Some cleanups in speedstep-centrino.c for NR_CPUS=4096. * Use new CPUMASK_PTR (instead of old CPUMASK_VAR). * Replace arrays sized by NR_CPUS with percpu variables. * Cleanup some formatting problems (>80 chars per line) and other checkpatch complaints. Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-20 10:21:12 +02:00
Mike Travis	1bd9d6b64e	NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genapic_flat_64.c * nr_cpu_ids should be used to determine if a percpu area is available for a given cpu. Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-20 10:21:10 +02:00
Mike Travis	247bc6ca0f	NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genx2apic_uv_x.c * Replace NR_CPUS loop with for_each_possible_cpu(). * nr_cpu_ids should be used to determine if a percpu area is available for a given cpu. Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-20 10:21:09 +02:00
Mike Travis	f2ad47ffeb	NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/proc.c * Use nr_cpu_ids instead of NR_CPUS to limit traversal of cpu online map. Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-20 10:21:09 +02:00
Mike Travis	6bca67f951	NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/mcheck/mce_64.c * nr_cpu_ids should be used to allocate arrays based on the number of cpu's present. Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-20 10:21:08 +02:00
Simon Arlott	e3a61b0a8c	x86: add unknown_nmi_panic kernel parameter It's not possible to enable the unknown_nmi_panic sysctl option until init is run. It's useful to be able to panic the kernel during boot too, this adds a parameter to enable this option. Signed-off-by: Simon Arlott <simon@fire.lp0.eu> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-20 10:10:31 +02:00
Yinghai Lu	63b5d7af25	x86: add ->pre_time_init to x86_quirks so NUMAQ can use that to call numaq_pre_time_init() This allows us to remove a NUMAQ special from arch/x86/kernel/setup.c. (and paves the way to remove the NUMAQ subarch) Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-20 09:25:52 +02:00
Yinghai Lu	64898a8bad	x86: extend and use x86_quirks to clean up NUMAQ code add these new x86_quirks methods: int mpc_record; int (mpc_apic_id)(struct mpc_config_processor m); void (mpc_oem_bus_info)(struct mpc_config_bus m, char name); void (mpc_oem_pci_bus)(struct mpc_config_bus m); void (smp_read_mpc_oem)(struct mp_config_oemtable oemtable, unsigned short oemsize); ... and move NUMAQ related mps table handling to numaq_32.c. also move the call to smp_read_mpc_oem() to smp_read_mpc() directly. Should not change functionality, albeit it would be nice to get it tested on real NUMAQ as well ... Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-20 09:25:52 +02:00
Yinghai Lu	3c9cb6de1e	x86: introduce x86_quirks introduce x86_quirks array of boot-time quirk methods. No change in functionality intended. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-20 09:18:17 +02:00
Yinghai Lu	5f1f2b3d9d	x86: improve debug printout: add target bootmem range in early_res_to_bootmem() Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-20 09:11:07 +02:00
Ingo Molnar	d092633bff	Subject: devmem, x86: fix rename of CONFIG_NONPROMISC_DEVMEM From: Arjan van de Ven <arjan@infradead.org> Date: Sat, 19 Jul 2008 15:47:17 -0700 CONFIG_NONPROMISC_DEVMEM was a rather confusing name - but renaming it to CONFIG_PROMISC_DEVMEM causes problems on architectures that do not support this feature; this patch renames it to CONFIG_STRICT_DEVMEM, so that architectures can opt-in into it. ( the polarity of the option is still the same as it was originally; it needs to be for now to not break architectures that don't have the infastructure yet to support this feature) Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: "V.Radhakrishnan" <rk@atr-labs.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> ---	2008-07-20 08:35:55 +02:00
Yinghai Lu	e5849e71ad	x86: remove arch_get_ram_range no user now Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-07-18 17:43:40 -07:00
venkatesh.pallipadi@intel.com	fec0962e0b	x86: Add a debugfs interface to dump PAT memtype Add a debugfs interface to list out all the PAT memtype reservations. Appears at debugfs x86/pat_memtype_list and output format is type @ <start addr>-<end addr> We do not hold the lock while printing the entire list. So, the list may not be a consistent copy in case where regions are getting added or deleted at the same time. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-07-18 17:22:05 -07:00
venkatesh.pallipadi@intel.com	ae79cdaacb	x86: Add a arch directory for x86 under debugfs Add a directory for x86 arch under debugfs. Can be used to accumulate all x86 specific debugfs files. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-07-18 17:22:04 -07:00
Jan Beulich	2ddf9b7b3e	i386/xen: add proper unwind annotations to xen_sysenter_target Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-07-18 16:05:55 -07:00
Jan Beulich	08ad8afaa0	x86: reduce force_mwait visibility It's not used anywhere outside its single referencing file. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-07-18 15:55:09 -07:00
Jan Beulich	08e1a13e7d	x86: reduce forbid_dac's visibility It's not used anywhere outside its declaring file. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-07-18 14:39:37 -07:00
Jan Beulich	369c99205f	x86: fix two modpost warnings Even though it's only the difference of the two __initdata symbols that's being calculated, modpost still doesn't like this. So rather calculate the size once in an __init function and store it for later use. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-07-18 14:34:08 -07:00
Jan Beulich	f2ba93929f	x86: check function status in EDD boot code Without checking the return value of get_edd_info() and adding the entry only in the success case, 6 devices show up under /sys/firmware/edd/, no matter how many devices are actually present. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2008-07-18 14:33:17 -07:00

1 2 3 4 5 ...

3547 Commits