Commit Graph

1941 Commits

Author SHA1 Message Date
Petr Tesarik 4ab4ba32aa x86, tracehook: clean up implementation of syscall_get_error()
The x86-tracehook code now contains this line in syscall_get_error():

	return error >= -4095L ? error : 0;

Hard-wiring a constant is not nice. Let's use the IS_ERR_VALUE macro
from linux/err.h instead.

Signed-off-by: Petr Tesarik <ptesarik@suse.cz>
Cc: utrace-devel@redhat.com
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 17:53:16 +02:00
Ingo Molnar 28c3cfd5fb Merge branch 'linus' into x86/tracehook 2008-09-05 17:53:05 +02:00
Alex Nixon 913da64b54 x86: build fix for !CONFIG_SMP
Move reset_lazy_tlbstate into tlb_32.c, and define noop versions of
play_dead() in process_{32,64}.c when !CONFIG_SMP.

Signed-off-by: Alex Nixon <alex.nixon@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 17:44:08 +02:00
Jeremy Fitzhardinge 110e0358e7 x86: make sure the CPA test code's use of _PAGE_UNUSED1 is obvious
The CPA test code uses _PAGE_UNUSED1, so make sure its obvious.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 17:09:57 +02:00
Jan Beulich 74e91604b2 x86: ticket spin locks: reduce instruction dependencies
Reduce the amount of partial register accesses in the NR_CPUS < 256
case, and slightly weaken resource dependencies in the other case.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 17:06:20 +02:00
Jan Beulich 08f5fcbe6e x86: ticket spin locks: factor out more common code
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 17:05:56 +02:00
Jan Beulich ef1f341328 x86: ticket spin locks: fix asm constraints
In addition to these changes I doubt the 'volatile' on all the ticket
lock asm()-s are really necessary.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 17:04:08 +02:00
Ingo Molnar 0a328ea43d Merge branch 'x86/alternatives' into x86/core 2008-09-05 17:03:17 +02:00
Yinghai Lu 97e4db7c87 x86: make detect_ht depend on CONFIG_X86_HT
64-bit has X86_HT set too, so use that instead of SMP.

This also removes a include/asm-x86/processor.h ifdef.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 09:40:45 +02:00
Ingo Molnar 9042763808 Merge branch 'x86/x2apic' into x86/core
Conflicts:
	arch/x86/kernel/cpu/common_64.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-05 09:21:21 +02:00
Ingo Molnar 446d27338d Merge branch 'x86/cpu' into x86/core 2008-09-05 09:19:50 +02:00
Ingo Molnar accf0fa697 Merge branch 'x86/xsave' into x86/core 2008-09-05 09:18:39 +02:00
Yinghai Lu 3da99c9776 x86: make (early)_identify_cpu more the same between 32bit and 64 bit
1. add extended_cpuid_level for 32bit
 2. add generic_identify for 64bit
 3. add early_identify_cpu for 32bit
 4. early_identify_cpu not be called by identify_cpu
 5. remove early in get_cpu_vendor for 32bit
 6. add get_cpu_cap
 7. add cpu_detect for 64bit

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-04 21:09:44 +02:00
Ingo Molnar 62b3f98188 Merge branch 'x86/debug' into x86/cpu 2008-09-04 21:08:09 +02:00
H. Peter Anvin aa3341a168 Merge branch 'x86/cpu' into x86/x2apic
Conflicts:

	arch/x86/kernel/cpu/feature_names.c
	include/asm-x86/cpufeature.h
2008-09-04 09:21:21 -07:00
H. Peter Anvin fe47784ba5 Merge branch 'x86/cpu' into x86/xsave
Conflicts:

	arch/x86/kernel/cpu/feature_names.c
	include/asm-x86/cpufeature.h
2008-09-04 09:04:45 -07:00
Yinghai Lu 58f7c98850 x86: split e820 reserved entries record to late v2
so could let BAR res register at first, or even pnp.

v2: insert e820 reserve resources before pnp_system_init

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-04 08:37:57 -07:00
H. Peter Anvin 0ccd8c39bc Merge branch 'linus' into x86/core 2008-09-04 08:09:09 -07:00
H. Peter Anvin 7203781c98 Merge branch 'x86/cpu' into x86/core
Conflicts:

	arch/x86/kernel/cpu/feature_names.c
	include/asm-x86/cpufeature.h
2008-09-04 08:08:42 -07:00
Ingo Molnar 42390cdec5 Merge branch 'linus' into x86/x2apic
Conflicts:
	arch/x86/kernel/cpu/cyrix.c
	include/asm-x86/cpufeature.h

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-04 13:02:35 +02:00
Linus Torvalds e52c8857e0 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: update defconfigs
  x86: msr: fix bogus return values from rdmsr_safe/wrmsr_safe
  x86: cpuid: correct return value on partial operations
  x86: msr: correct return value on partial operations
  x86: cpuid: propagate error from smp_call_function_single()
  x86: msr: propagate errors from smp_call_function_single()
  smp: have smp_call_function_single() detect invalid CPUs
2008-08-28 12:30:59 -07:00
H. Peter Anvin af2e1f276f x86: cpufeature: fix SMX flag
Impact: "smx" flags showed as "safer" in /proc/cpuinfo

The SMX feature flag is the so-called "safer mode"... I had put that
in quotes, but that flagged it as a cpuinfo flag name.  Remove the
quotes to correct the flag name in /proc/cpuinfo.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-27 22:05:45 -07:00
H. Peter Anvin 2798c63e65 x86: <asm/cpufeature.h>: clean up overlong lines, whitespace
Clean up overlong lines and stealth whitespace in
<asm-x86/cpufeature.h>.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-27 21:20:07 -07:00
H. Peter Anvin f1240c0026 x86: cpufeature: add Intel features from CPUID and AVX specs
Add all Intel CPUID features currently documented in the CPUID spec
(AP-485, 241618-032, Dec 2007) and the AVX Programming Reference
(319433-003, Aug 2008).

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-27 19:25:44 -07:00
H. Peter Anvin 7414aa41a6 x86: generate names for /proc/cpuinfo from <asm/cpufeature.h>
We have had a number of cases where <asm/cpufeature.h> (and its
predecessors) have diverged substantially from the names list in
/proc/cpuinfo.  This patch generates the latter from the former.

It retains the option for explicitly overriding the strings, but by
making that require a separate action it should at least be less
likely to happen.

It would be good to do a future pass and rename strings that are
gratuituously different in the kernel (/proc/cpuinfo is a userspace
interface and must remain constant.)

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-27 19:23:22 -07:00
H. Peter Anvin b30a72a7ed Merge branch 'x86/urgent' into x86/cpu
Conflicts:

	arch/x86/kernel/cpu/cyrix.c
2008-08-27 19:17:07 -07:00
H. Peter Anvin 94d4ac2f4a Merge branch 'x86/urgent' into x86/cleanups 2008-08-25 22:45:37 -07:00
H. Peter Anvin 08970fc4e0 x86: msr: fix bogus return values from rdmsr_safe/wrmsr_safe
Impact: bogus error codes (+other?) on x86-64

The rdmsr_safe/wrmsr_safe routines have macros for the handling of the
edx:eax arguments.  Those macros take a variable number of assembly
arguments.  This is rather inherently incompatible with using
%digit-style escapes in the inline assembly; replace those with
%[name]-style escapes.

This fixes miscompilation on x86-64, which at the very least caused
bogus return values.  It is possible that this could also corrupt the
return value; I am not sure.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-25 22:39:15 -07:00
H. Peter Anvin c6f31932d0 x86: msr: propagate errors from smp_call_function_single()
Propagate error (-ENXIO) from smp_call_function_single().  These
errors can happen when a CPU is unplugged while the MSR driver is
open.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-25 17:45:48 -07:00
Linus Torvalds ec73adba51 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: add X86_FEATURE_XMM4_2 definitions
  x86: fix cpufreq + sched_clock() regression
  x86: fix HPET regression in 2.6.26 versus 2.6.25, check hpet against BAR, v3
  x86: do not enable TSC notifier if we don't need it
  x86 MCE: Fix CPU hotplug problem with multiple multicore AMD CPUs
  x86: fix: make PCI ECS for AMD CPUs hotplug capable
  x86: fix: do not run code in amd_bus.c on non-AMD CPUs
2008-08-25 11:26:33 -07:00
Austin Zhang 2a61812af2 x86: add X86_FEATURE_XMM4_2 definitions
Added Intel processor SSE4.2 feature flag.

No in-tree user at the moment, but makes the tree-merging life easier
for the crypto tree.

Signed-off-by: Austin Zhang <austin.zhang@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-25 17:28:16 +02:00
Eduardo Habkost 18b13e5457 KVM: Use .fixup instead of .text.fixup on __kvm_handle_fault_on_reboot
vmlinux.lds expects the fixup code to be on a section named .fixup. The
.text.fixup section is not mentioned on vmlinux.lds, and is included on
the resulting vmlinux (just after .text) only because of ld heuristics on
placing orphan sections.

However, placing .text.fixup outside .text breaks the definition of
_etext, making it exclude the .text.fixup contents. That makes .text.fixup
be ignored by the kernel initialization code that needs to know about
section locations, such as the code setting page protection bits.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-08-25 17:22:57 +03:00
Ingo Molnar f58899bb02 Merge branch 'linus' into x86/urgent 2008-08-25 14:39:12 +02:00
Ingo Molnar ea1c9de45e Merge branch 'x86/urgent' into x86/cleanups 2008-08-25 11:10:42 +02:00
Alex Nixon 8227dce7dc x86: separate generic cpu disabling code from APIC writes in cpu_disable
It allows paravirt implementations of cpu_disable to share the
cpu_disable_common code, without having to take on board APIC
writes, which may not be appropriate.

Signed-off-by: Alex Nixon <alex.nixon@citrix.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-25 10:59:20 +02:00
Alex Nixon a21f5d88c1 x86: unify x86_32 and x86_64 play_dead into one function
Add the new play_dead into smpboot.c, as it fits more cleanly in there
alongside other CONFIG_HOTPLUG functions.

Separate out the common code into its own function.

Signed-off-by: Alex Nixon <alex.nixon@citrix.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-25 10:59:19 +02:00
Alex Nixon 3790025863 x86_32: clean up play_dead
The removal of the CPU from the various maps was redundant as it already
happened in cpu_disable.

After cleaning this up, cpu_uninit only resets the tlb state, so rename
it and create a noop version for the X86_64 case (so the two play_deads
can be unified later).

Signed-off-by: Alex Nixon <alex.nixon@citrix.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-25 10:59:18 +02:00
Alex Nixon 93be71b672 x86: add cpu hotplug hooks into smp_ops
Signed-off-by: Alex Nixon <alex.nixon@citrix.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-25 10:59:18 +02:00
Ingo Molnar e4f807c2b4 Merge branch 'linus' into x86/xen
Conflicts:
	arch/x86/kernel/paravirt.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-25 10:54:07 +02:00
Adrian Bunk 7a8fc9b248 removed unused #include <linux/version.h>'s
This patch lets the files using linux/version.h match the files that
#include it.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-23 12:14:12 -07:00
Rafael J. Wysocki 8735728ef8 x86 MCE: Fix CPU hotplug problem with multiple multicore AMD CPUs
During CPU hot-remove the sysfs directory created by
threshold_create_bank(), defined in
arch/x86/kernel/cpu/mcheck/mce_amd_64.c, has to be removed before
its parent directory, created by mce_create_device(), defined in
arch/x86/kernel/cpu/mcheck/mce_64.c .  Moreover, when the CPU in
question is hotplugged again, obviously the latter has to be created
before the former.  At present, the right ordering is not enforced,
because all of these operations are carried out by CPU hotplug
notifiers which are not appropriately ordered with respect to each
other.  This leads to serious problems on systems with two or more
multicore AMD CPUs, among other things during suspend and hibernation.

Fix the problem by placing threshold bank CPU hotplug callbacks in
mce_cpu_callback(), so that they are invoked at the right places,
if defined.  Additionally, use kobject_del() to remove the sysfs
directory associated with the kobject created by
kobject_create_and_add() in threshold_create_bank(), to prevent the
kernel from crashing during CPU hotplug operations on systems with
two or more multicore AMD CPUs.

This patch fixes bug #11337.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Andi Kleen <andi@firstfloor.org>
Tested-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-23 17:49:19 +02:00
Suresh Siddha bbb65d2d36 x86: use cpuid vector 0xb when available for detecting cpu topology
cpuid leaf 0xb provides extended topology enumeration. This interface provides
the 32-bit x2APIC id of the logical processor and it also provides a new
mechanism to detect SMT and core siblings (which provides increased
addressability).

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-23 17:47:10 +02:00
Ingo Molnar 87ce786ae5 Merge branch 'x86/cpu' into x86/x2apic 2008-08-23 17:46:59 +02:00
Marcin Slusarz c4bd1fdab0 x86: fix section mismatch warning - uv_cpu_init
WARNING: vmlinux.o(.cpuinit.text+0x3cc4): Section mismatch in reference from the function uv_cpu_init() to the function .init.text:uv_system_init()
The function __cpuinit uv_cpu_init() references
a function __init uv_system_init().
If uv_system_init is only used by uv_cpu_init then
annotate uv_system_init with a matching annotation.

uv_system_init was ment to be called only once, so do it from codepath
(native_smp_prepare_cpus) which is called once, right before activation
of other cpus (smp_init).

Note: old code relied on uv_node_to_blade being initialized to 0,
but it'a not initialized from anywhere.

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-22 14:12:20 +02:00
Yinghai Lu b05f78f5c7 x86_64: printout msr -v2
commandline show_msr=1 for bsp, show_msr=32 for all 32 cpus.

[ mingo@elte.hu: added documentation ]

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-22 10:43:21 +02:00
FUJITA Tomonori 766af9fa81 dma-mapping.h, x86: remove last user of dma_mapping_ops->map_simple
pci-dma.c doesn't use map_simple hook any more so we can remove it
from struct dma_mapping_ops now.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-22 08:43:25 +02:00
Joerg Roedel 6c505ce393 x86: move dma_*_coherent functions to include file
All the x86 DMA-API functions are defined in asm/dma-mapping.h. This patch
moves the dma_*_coherent functions also to this header file because they are
now small enough to do so.
This is done as a separate patch because it also includes some renaming and
restructuring of the dma-mapping.h file.

Signed-off-by: Joerg Roedel <joerg.roede@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-22 08:34:51 +02:00
Ingo Molnar 8b53b57576 Merge branch 'x86/urgent' into x86/pat
Conflicts:
	arch/x86/mm/pageattr.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-22 06:06:51 +02:00
Eduardo Habkost f86399396c x86, paravirt_ops: use unsigned long instead of u32 for alloc_p*() pfn args
This patch changes the pfn args from 'u32' to 'unsigned long'
on alloc_p*() functions on paravirt_ops, and the corresponding
implementations for Xen and VMI. The prototypes for CONFIG_PARAVIRT=n
are already using unsigned long, so paravirt.h now matches the prototypes
on asm-x86/pgalloc.h.

It shouldn't result in any changes on generated code on 32-bit, with
or without CONFIG_PARAVIRT. On both cases, 'codiff -f' didn't show any
change after applying this patch.

On 64-bit, there are (expected) binary changes only when CONFIG_PARAVIRT
is enabled, as the patch is really supposed to change the size of the
pfn args.

[ v2: KVM_GUEST: use the right parameter type on kvm_release_pt() ]

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Acked-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-22 05:34:44 +02:00
Shaohua Li d75586ad01 x86, pageattr: introduce APIs to change pageattr of a page array
Add array interface APIs of pageattr. page based cache flush is quite
slow for a lot of pages. If pages are more than 1024 (4M), the patch
will use a wbinvd(). We have a simple test here (run a 3d game - open
arena), nearly all agp memory allocation are small (< 1M), so suppose
this will not impact runtime performance.

Signed-off-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-21 13:47:45 +02:00
Ingo Molnar cacf890694 Revert "introduce two APIs for page attribute"
This reverts commit 1ac2f7d55b.
2008-08-21 13:46:33 +02:00
Ingo Molnar 9326d61bf6 Revert "reduce tlb/cache flush times of agpgart memory allocation"
This reverts commit 466ae83742.
2008-08-21 13:46:25 +02:00
Cihula, Joseph 671eef85a3 x86, e820: add support for AddressRangeUnusuable ACPI memory type
Add support for the E820_UNUSABLE memory type, which is defined in
Revision 3.0b (Oct.  10, 2006) of the ACPI Specification on p.  394 Table
14-1:

  AddressRangeUnusuable This range of address contains memory in which
  errors have been detected.  This range must not be used by the OSPM.

Signed-off-by: Joseph Cihula <joseph.cihula@intel.com>
Signed-off-by: Shane Wang <shane.wang@intel.com>
Signed-off-by: Gang Wei <gang.wei@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-21 13:22:13 +02:00
Yinghai Lu 11494547b1 x86: fix apic version warning
after following patch,

commit 1b313f4a6d
Author: Cyrill Gorcunov <gorcunov@gmail.com>
Date:   Mon Aug 18 20:45:57 2008 +0400

    x86: apic - generic_processor_info

    - use physid_set instead of phys_cpu and physids_or
    - set phys_cpu_present_map bit AFTER check for allowed
      number of processors
    - add checking for APIC valid version in 64bit mode
      (mostly not needed but added for merging purpose)
    - add apic_version definition for 64bit mode which
      is used now

we are getting warning for acpi path on 64 bit system.

make the 64-bit side fill in apic_version[] as well.

[ mingo@elte.hu: build fix ]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-21 11:07:06 +02:00
Dave Young e621bd1895 i386: vmalloc size fix
Booting kernel with vmalloc=[any size<=16m] will oops on my pc (i386/1G memory).

BUG_ON in arch/x86/mm/init_32.c triggered:
BUG_ON((unsigned long)high_memory > VMALLOC_START);

It's due to the vm area hole.

In include/asm-x86/pgtable_32.h:
#define VMALLOC_OFFSET	(8 * 1024 * 1024)
#define VMALLOC_START	(((unsigned long)high_memory + 2 * VMALLOC_OFFSET - 1) \
			 & ~(VMALLOC_OFFSET - 1))

There's several related point:
1. MAXMEM :
 (-__PAGE_OFFSET - __VMALLOC_RESERVE).
The space after VMALLOC_END is included as well, I set it to
(VMALLOC_END - PAGE_OFFSET - __VMALLOC_RESERVE)

2. VMALLOC_OFFSET is not considered in __VMALLOC_RESERVE
fixed by adding VMALLOC_OFFSET to it.

3. VMALLOC_START :
 (((unsigned long)high_memory + 2 * VMALLOC_OFFSET - 1) & ~(VMALLOC_OFFSET - 1))
So it's not always 8M, bigger than 8M possible.
I set it to ((unsigned long)high_memory + VMALLOC_OFFSET)

4. the VMALLOC_RESERVE is an unused macro, so remove it here.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Cc: akpm@linux-foundation.org
Cc: hidave.darkstar@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2008-08-21 10:13:21 +02:00
Jeremy Fitzhardinge 63d3a75d6f x86/paravirt: add spin_lock_flags lock op
It is useful for a pv_lock_ops backend to know whether interrupts are
enabled or not in the context a spin_lock is being called.  This
allows it to enable interrupts while spinning, which could be
particularly helpful when spinning becomes blocking.

The default implementation just calls the normal spin_lock op,
ignoring the flags.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-20 12:40:07 +02:00
Jeremy Fitzhardinge 6e833587e1 xen: clean up domain mode predicates
There are four operating modes Xen code may find itself running in:
 - native
 - hvm domain
 - pv dom0
 - pv domU

Clean up predicates for testing for these states to make them more consistent.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Xen-devel <xen-devel@lists.xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-20 12:40:07 +02:00
Ingo Molnar 170465ee7f Merge branch 'linus' into x86/xen 2008-08-20 12:39:18 +02:00
Cliff Wickman 99dd871330 x86, SGI UV: hardcode the TLB flush interrupt system vector
The UV TLB shootdown mechanism needs a system interrupt vector.

Its vector had been hardcoded as 200, but needs to moved to the reserved
system vector range so that it does not collide with some device vector.

This is still temporary until dynamic system IRQ allocation is provided.
But it will be needed when real UV hardware becomes available and runs 2.6.27.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-20 12:36:03 +02:00
Jeremy Fitzhardinge b6edbb1e04 x86_64: use save/loadsegment in ia32 compat
Use savesegment and loadsegment consistently in ia32 compat code.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-20 12:33:03 +02:00
Dmitry Adamushko d45de40934 x86-microcode: generic interface refactoring
This is the 1st patch in the series. Here the aim was to avoid any
significant changes, logically-wise.

So it's mainly about generic interface refactoring: e.g. make
microcode_{intel,amd}.c more about arch-specific details and less
about policies like make-sure-we-run-on-a-target-cpu
(no more set_cpus_allowed_ptr() here) and generic synchronization (no
more microcode_mutex here).

All in all, more line have been deleted than added.

4 files changed, 145 insertions(+), 198 deletions(-)

Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-20 12:18:57 +02:00
Ingo Molnar 7393423dd9 Merge branch 'linus' into x86/cleanups 2008-08-20 11:52:15 +02:00
H. Peter Anvin 8df9676d64 x86: <asm/asm.h> consistency cleanups
Rename _ASM_MOV_UL to _ASM_MOV for consistency with other _ASM_
instructions (_ASM_ADD, _ASM_SUB and so on.)

Add ASM_SP, _ASM_BP, _ASM_SI, and _ASM_DI for consistency with
_ASM_[ABCD]X.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-18 18:22:56 -07:00
H. Peter Anvin 7e00df5818 x86: add NOPL as a synthetic CPU feature bit
The long noops ("NOPL") are supposed to be detected by family >= 6.
Unfortunately, several non-Intel x86 implementations, both hardware
and software, don't obey this dictum.  Instead, probe for NOPL
directly by executing a NOPL instruction and see if we get #UD.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-18 18:22:17 -07:00
Linus Torvalds a7f5aaf36d Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix build warnings in real mode code
  x86, calgary: fix section mismatch warning - get_tce_space_from_tar
  x86: silence section mismatch warning - get_local_pda
  x86, percpu: silence section mismatch warnings related to EARLY_PER_CPU variables
  x86: fix i486 suspend to disk CR4 oops
  x86: mpparse.c: fix section mismatch warning
  x86: mmconf: fix section mismatch warning
  x86: fix MP_processor_info section mismatch warning
  x86, tsc: fix section mismatch warning
  x86: correct register constraints for 64-bit atomic operations
2008-08-18 12:10:14 -07:00
Thomas Petazzoni 8d02c2110b x86: configuration options to compile out x86 CPU support code
This patch adds some configuration options that allow to compile out
CPU vendor-specific code in x86 kernels (in arch/x86/kernel/cpu). The
new configuration options are only visible when CONFIG_EMBEDDED is
selected, as they are mostly interesting for space savings reasons.

An example of size saving, on x86 with only Intel CPU support:

   text	   data	    bss	    dec	    hex	filename
1125479	 118760	 212992	1457231	 163c4f	vmlinux.old
1121355	 116536	 212992	1450883	 162383	vmlinux
  -4124   -2224       0   -6348   -18CC +/-

However, I'm not exactly sure that the Kconfig wording is correct with
regard to !64BIT / 64BIT.

[ mingo@elte.hu: convert macro to inline ]

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 16:05:48 +02:00
Marcin Slusarz c6a92a2501 x86, percpu: silence section mismatch warnings related to EARLY_PER_CPU variables
Quoting Mike Travis in "x86: cleanup early per cpu variables/accesses v4"
(23ca4bba3e):

    The DEFINE macro defines the per_cpu variable as well as the early
    map and pointer.  It also initializes the per_cpu variable and map
    elements to "_initvalue".  The early_* macros provide access to
    the initial map (usually setup during system init) and the early
    pointer.  This pointer is initialized to point to the early map
    but is then NULL'ed when the actual per_cpu areas are setup.  After
    that the per_cpu variable is the correct access to the variable.

As these variables are NULL'ed before __init sections are dropped
(in setup_per_cpu_maps), they can be safely annotated as __ref.

This change silences following section mismatch warnings:

WARNING: vmlinux.o(.data+0x46c0): Section mismatch in reference from the variable x86_cpu_to_apicid_early_ptr to the variable .init.data:x86_cpu_to_apicid_early_map
The variable x86_cpu_to_apicid_early_ptr references
the variable __initdata x86_cpu_to_apicid_early_map
If the reference is valid then annotate the
variable with __init* (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

WARNING: vmlinux.o(.data+0x46c8): Section mismatch in reference from the variable x86_bios_cpu_apicid_early_ptr to the variable .init.data:x86_bios_cpu_apicid_early_map
The variable x86_bios_cpu_apicid_early_ptr references
the variable __initdata x86_bios_cpu_apicid_early_map
If the reference is valid then annotate the
variable with __init* (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

WARNING: vmlinux.o(.data+0x46d0): Section mismatch in reference from the variable x86_cpu_to_node_map_early_ptr to the variable .init.data:x86_cpu_to_node_map_early_map
The variable x86_cpu_to_node_map_early_ptr references
the variable __initdata x86_cpu_to_node_map_early_map
If the reference is valid then annotate the
variable with __init* (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 09:10:55 +02:00
Marcin Slusarz c72a5efec1 x86: mmconf: fix section mismatch warning
WARNING: arch/x86/kernel/built-in.o(.cpuinit.text+0x1591): Section mismatch in reference from the function init_amd() to the function .init.text:check_enable_amd_mmconf_dmi()
The function __cpuinit init_amd() references
a function __init check_enable_amd_mmconf_dmi().
If check_enable_amd_mmconf_dmi is only used by init_amd then
annotate check_enable_amd_mmconf_dmi with a matching annotation.

check_enable_amd_mmconf_dmi is only called from init_amd which is __cpuinit

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 07:49:06 +02:00
Mathieu Desnoyers 3c3b5c3b0b x86: correct register constraints for 64-bit atomic operations
x86_64 add/sub atomic ops does not seems to accept integer values bigger
than 32 bits as immediates. Intel's add/sub documentation specifies they
have to be passed as registers.

The only operations in the x86-64 architecture which accept arbitrary
64-bit immediates is "movq" to any register; similarly, the only
operation which accept arbitrary 64-bit displacement is "movabs" to or
from al/ax/eax/rax.

http://gcc.gnu.org/onlinedocs/gcc-4.3.0/gcc/Machine-Constraints.html

states :

e
    32-bit signed integer constant, or a symbolic reference known to fit
    that range (for immediate operands in sign-extending x86-64
    instructions).
Z
    32-bit unsigned integer constant, or a symbolic reference known to
    fit that range (for immediate operands in zero-extending x86-64
    instructions).

Since add/sub does sign extension, using the "e" constraint seems appropriate.

It applies to 2.6.27-rc, 2.6.26, 2.6.25...

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 07:47:30 +02:00
Linus Torvalds 0473b79929 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (32 commits)
  x86: add MAP_STACK mmap flag
  x86: fix section mismatch warning - spp_getpage()
  x86: change init_gdt to update the gdt via write_gdt, rather than a direct write.
  x86-64: fix overlap of modules and fixmap areas
  x86, geode-mfgpt: check IRQ before using MFGPT as clocksource
  x86, acpi: cleanup, temp_stack is used only when CONFIG_SMP is set
  x86: fix spin_is_contended()
  x86, nmi: clean UP NMI watchdog failure message
  x86, NMI: fix watchdog failure message
  x86: fix /proc/meminfo DirectMap
  x86: fix readb() et al compile error with gcc-3.2.3
  arch/x86/Kconfig: clean up, experimental adjustement
  x86: invalidate caches before going into suspend
  x86, perfctr: don't use CCCR_OVF_PMI1 on Pentium 4Ds
  x86, AMD IOMMU: initialize dma_ops after sysfs registration
  x86m AMD IOMMU: cleanup: replace LOW_U32 macro with generic lower_32_bits
  x86, AMD IOMMU: initialize device table properly
  x86, AMD IOMMU: use status bit instead of memory write-back for completion wait
  x86: silence mmconfig printk
  x86, msr: fix NULL pointer deref due to msr_open on nonexistent CPUs
  ...
2008-08-16 17:14:07 -07:00
Mathieu Desnoyers 5bbd4c3724 x86: spinlock use LOCK_PREFIX
Since we are now using DS prefixes instead of NOP to remove LOCK
prefixes, there is no longer any problems with instruction boundaries
moving around.

* Linus Torvalds (torvalds@linux-foundation.org) wrote:
>
>
> On Thu, 14 Aug 2008, Mathieu Desnoyers wrote:
> >
> > Changing the 0x90 (single-byte nop) currently used into a 0x3E DS segment
> > override prefix should fix this issue. Since the default of the atomic
> > instructions is to use the DS segment anyway, it should not affect the
> > behavior.
>
> Ok, so I think this is an _excellent_ patch, but I'd like to also then use
> LOCK_PREFIX in include/asm-x86/futex.h.
>
> See commit 9d55b9923a.
>
>     Linus

Unless there a rationale for this, I think these be changed to LOCK_PREFIX
too.

grep "lock ;" include/asm-x86/spinlock.h
         "lock ; cmpxchgw %w1,%2\n\t"
  asm volatile("lock ; xaddl %0, %1\n"
         "lock ; cmpxchgl %1,%2\n\t"

Applies to 2.6.27-rc2.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Jeremy Fitzhardinge <jeremy@goop.org>
CC: Roland McGrath <roland@redhat.com>
CC: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
CC: Steven Rostedt <srostedt@redhat.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: David Miller <davem@davemloft.net>
CC: Ulrich Drepper <drepper@redhat.com>
CC: Rusty Russell <rusty@rustcorp.com.au>
CC: Gregory Haskins <ghaskins@novell.com>
CC: Arnaldo Carvalho de Melo <acme@redhat.com>
CC: "Luis Claudio R. Goncalves" <lclaudio@uudg.org>
CC: Clark Williams <williams@redhat.com>
CC: Christoph Lameter <cl@linux-foundation.org>
CC: Andi Kleen <andi@firstfloor.org>
CC: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-15 12:51:11 -07:00
Mathieu Desnoyers 1f49a2c2ae x86: revert replace LOCK_PREFIX in futex.h
Since we now use DS prefixes instead of NOP to remove LOCK prefixes,
there are no longer any issues with instruction boundaries moving around.

Depends on :

x86 alternatives : fix LOCK_PREFIX race with preemptible kernel and CPU hotplug

On Thu, 14 Aug 2008, Mathieu Desnoyers wrote:
>
> Changing the 0x90 (single-byte nop) currently used into a 0x3E DS segment
> override prefix should fix this issue. Since the default of the atomic
> instructions is to use the DS segment anyway, it should not affect the
> behavior.

Ok, so I think this is an _excellent_ patch, but I'd like to also then use
LOCK_PREFIX in include/asm-x86/futex.h.

See commit 9d55b9923a.

                Linus

Applies to 2.6.27-rc2 (and -rc3 unless hell broke loose in futex.h between rc2
and rc3).

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Jeremy Fitzhardinge <jeremy@goop.org>
CC: Roland McGrath <roland@redhat.com>
CC: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
CC: Steven Rostedt <srostedt@redhat.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: David Miller <davem@davemloft.net>
CC: Ulrich Drepper <drepper@redhat.com>
CC: Rusty Russell <rusty@rustcorp.com.au>
CC: Gregory Haskins <ghaskins@novell.com>
CC: Arnaldo Carvalho de Melo <acme@redhat.com>
CC: "Luis Claudio R. Goncalves" <lclaudio@uudg.org>
CC: Clark Williams <williams@redhat.com>
CC: Christoph Lameter <cl@linux-foundation.org>
CC: Andi Kleen <andi@firstfloor.org>
CC: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-15 12:49:45 -07:00
Ingo Molnar cd98a04a59 x86: add MAP_STACK mmap flag
as per this discussion:

   http://lkml.org/lkml/2008/8/12/423

Pardo reported that 64-bit threaded apps, if their stacks exceed the
combined size of ~4GB, slow down drastically in pthread_create() - because
glibc uses MAP_32BIT to allocate the stacks. The use of MAP_32BIT is
a legacy hack - to speed up context switching on certain early model
64-bit P4 CPUs.

So introduce a new flag to be used by glibc instead, to not constrain
64-bit apps like this.

glibc can switch to this new flag straight away - it will be ignored
by the kernel. If those old CPUs ever matter to anyone, support for
it can be implemented.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Ulrich Drepper <drepper@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-15 11:45:19 -07:00
Ingo Molnar 2fdc86901d x86: add MAP_STACK mmap flag
as per this discussion:

   http://lkml.org/lkml/2008/8/12/423

Pardo reported that 64-bit threaded apps, if their stacks exceed the
combined size of ~4GB, slow down drastically in pthread_create() - because
glibc uses MAP_32BIT to allocate the stacks. The use of MAP_32BIT is
a legacy hack - to speed up context switching on certain early model
64-bit P4 CPUs.

So introduce a new flag to be used by glibc instead, to not constrain
64-bit apps like this.

glibc can switch to this new flag straight away - it will be ignored
by the kernel. If those old CPUs ever matter to anyone, support for
it can be implemented.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Ulrich Drepper <drepper@gmail.com>
2008-08-15 19:17:33 +02:00
Ingo Molnar 529d0e402e Merge branch 'x86/geode' into x86/urgent 2008-08-15 17:53:07 +02:00
Huang Ying fb45daa69d kexec jump: check code size in control page
Kexec/Kexec-jump require code size in control page is less than
PAGE_SIZE/2.  This patch add link-time checking for this.

ASSERT() of ld link script is used as the link-time checking mechanism.

[akpm@linux-foundation.org: build fix]
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-15 08:35:42 -07:00
Huang Ying 163f6876f5 kexec jump: rename KEXEC_CONTROL_CODE_SIZE to KEXEC_CONTROL_PAGE_SIZE
Rename KEXEC_CONTROL_CODE_SIZE to KEXEC_CONTROL_PAGE_SIZE, because control
page is used for not only code on some platform.  For example in kexec
jump, it is used for data and stack too.

[akpm@linux-foundation.org: unbreak powerpc and arm, finish conversion]
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-15 08:35:42 -07:00
Jan Beulich 66d4bdf22b x86-64: fix overlap of modules and fixmap areas
Plus add a build time check so this doesn't go unnoticed again.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 17:31:50 +02:00
Jens Rottmann 0d5cdc97e2 x86, geode-mfgpt: check IRQ before using MFGPT as clocksource
Adds a simple IRQ autodetection to the AMD Geode MFGPT driver, and more
importantly, adds some checks, if IRQs can actually be received on the
chosen line.  This fixes cases where MFGPT is selected as clocksource
though not producing any ticks, so the kernel simply starves during
boot.

Signed-off-by: Jens Rottmann <JRottmann@LiPPERTEmbedded.de>
Cc: Andres Salomon <dilinger@debian.org>
Cc: linux-geode@bombadil.infradead.org
Cc: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 17:12:32 +02:00
Ingo Molnar 04197c83b3 Merge branch 'linus' into x86/tracehook
Conflicts:
	arch/x86/Kconfig

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 17:07:34 +02:00
Shaohua Li 466ae83742 reduce tlb/cache flush times of agpgart memory allocation
To reduce tlb/cache flush, makes agp memory allocation do one flush
after all pages in a region are changed to uc.

All agp drivers except agp-sgi uses agp_generic_alloc_page()
for .agp_alloc_page, so the patch should work for them. agp-sgi is only
for ia64, so not a problem too.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: airlied@linux.ie
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 16:30:48 +02:00
Shaohua Li 1ac2f7d55b introduce two APIs for page attribute
Introduce two APIs for page attribute. flushing tlb/cache in every page
attribute is expensive. AGP gart usually will do a lot of operations to
change a page to uc, new APIs can reduce flush.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: airlied@linux.ie
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 16:30:45 +02:00
Jan Beulich 7bc069c6bc x86: fix spin_is_contended()
The masked difference is what needs to be compared against 1, rather
than the difference of masked values (which can be negative).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 16:26:51 +02:00
Ingo Molnar 1a10390708 Merge branch 'linus' into x86/cpu 2008-08-15 16:16:15 +02:00
Mikael Pettersson 1c5b0eb66d x86: fix readb() et al compile error with gcc-3.2.3
Building 2.6.27-rc1 on x86 with gcc-3.2.3 fails with:

In file included from include/asm/dma.h:12,
                 from include/linux/bootmem.h:8,
                 from init/main.c:26:
include/asm/io.h: In function `readb':
include/asm/io.h:32: syntax error before string constant
include/asm/io.h: In function `readw':
include/asm/io.h:33: syntax error before string constant
include/asm/io.h: In function `readl':
include/asm/io.h:34: syntax error before string constant
include/asm/io.h: In function `__readb':
include/asm/io.h:36: syntax error before string constant
include/asm/io.h: In function `__readw':
include/asm/io.h:37: syntax error before string constant
include/asm/io.h: In function `__readl':
include/asm/io.h:38: syntax error before string constant
make[1]: *** [init/main.o] Error 1
make: *** [init] Error 2

Starting with 2.6.27-rc1 readb() et al are generated by a
build_mmio_read() macro, which generates asm() statements with
output register constraints like "=" "q", i.e. as two adjacent
string literals. This doesn't work with gcc-3.2.3.

Fixed by moving the "=" part into the callers' reg parameter
(as suggested by Ingo).

Build and boot-tested with gcc-3.2.3 on 32 and 64-bit x86.

Fixes <http://bugzilla.kernel.org/show_bug.cgi?id=11205>.

Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 14:30:32 +02:00
Mark Langsdorf 394a15051c x86: invalidate caches before going into suspend
When a CPU core is shut down, all of its caches need to be flushed
to prevent stale data from causing errors if the core is resumed.
Current Linux suspend code performs an assignment after the flush,
which can add dirty data back to the cache.  On some AMD platforms,
additional speculative reads have caused crashes on resume because
of this dirty data.

Relocate the cache flush to be the very last thing done before
halting.  Tie into an assembly line so the compile will not
reorder it.  Add some documentation explaining what is going
on and why we're doing this.

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Acked-by: Mark Borden <mark.borden@amd.com>
Acked-by: Michael Hohmuth <michael.hohmuth@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 14:04:30 +02:00
Ingo Molnar 975439fe73 Merge branch 'x86/amd-iommu' into x86/urgent 2008-08-15 13:57:32 +02:00
Joerg Roedel 8a456695c5 x86m AMD IOMMU: cleanup: replace LOW_U32 macro with generic lower_32_bits
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 13:56:56 +02:00
Joerg Roedel 9f5f5fb35d x86, AMD IOMMU: initialize device table properly
This patch adds device table initializations which forbids memory accesses
for devices per default and disables all page faults.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 13:56:54 +02:00
Joerg Roedel 519c31bacf x86, AMD IOMMU: use status bit instead of memory write-back for completion wait
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 13:56:46 +02:00
Ingo Molnar 881b374705 Merge branch 'x86/apic' into x86/core 2008-08-14 15:13:47 +02:00
Ingo Molnar c83d12806b Merge branches 'x86/prototypes', 'x86/x2apic' and 'x86/debug' into x86/core 2008-08-14 14:58:22 +02:00
Ingo Molnar 51ca3c6791 Merge branch 'linus' into x86/core
Conflicts:
	arch/x86/kernel/genapic_64.c
	include/asm-x86/kvm_host.h

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-14 14:58:01 +02:00
Ingo Molnar 8d7ccaa545 Merge commit 'v2.6.27-rc3' into x86/prototypes
Conflicts:

	include/asm-x86/dma-mapping.h

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-14 12:19:59 +02:00
Ingo Molnar 3167761965 Merge branch 'x86/fpu' into x86/urgent 2008-08-14 11:18:08 +02:00
Ingo Molnar d4439087d3 Merge commit 'v2.6.27-rc3' into x86/xsave
Conflicts:

	arch/x86/kernel/genapic_64.c
	include/asm-x86/kvm_host.h

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-14 10:55:26 +02:00
Suresh Siddha e49140120c crypto: padlock - fix VIA PadLock instruction usage with irq_ts_save/restore()
Wolfgang Walter reported this oops on his via C3 using padlock for
AES-encryption:

##################################################################

BUG: unable to handle kernel NULL pointer dereference at 000001f0
IP: [<c01028c5>] __switch_to+0x30/0x117
*pde = 00000000
Oops: 0002 [#1] PREEMPT
Modules linked in:

Pid: 2071, comm: sleep Not tainted (2.6.26 #11)
EIP: 0060:[<c01028c5>] EFLAGS: 00010002 CPU: 0
EIP is at __switch_to+0x30/0x117
EAX: 00000000 EBX: c0493300 ECX: dc48dd00 EDX: c0493300
ESI: dc48dd00 EDI: c0493530 EBP: c04cff8c ESP: c04cff7c
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process sleep (pid: 2071, ti=c04ce000 task=dc48dd00 task.ti=d2fe6000)
Stack: dc48df30 c0493300 00000000 00000000 d2fe7f44 c03b5b43 c04cffc8 00000046
       c0131856 0000005a dc472d3c c0493300 c0493470 d983ae00 00002696 00000000
       c0239f54 00000000 c04c4000 c04cffd8 c01025fe c04f3740 00049800 c04cffe0
Call Trace:
 [<c03b5b43>] ? schedule+0x285/0x2ff
 [<c0131856>] ? pm_qos_requirement+0x3c/0x53
 [<c0239f54>] ? acpi_processor_idle+0x0/0x434
 [<c01025fe>] ? cpu_idle+0x73/0x7f
 [<c03a4dcd>] ? rest_init+0x61/0x63
 =======================

Wolfgang also found out that adding kernel_fpu_begin() and kernel_fpu_end()
around the padlock instructions fix the oops.

Suresh wrote:

These padlock instructions though don't use/touch SSE registers, but it behaves
similar to other SSE instructions. For example, it might cause DNA faults
when cr0.ts is set. While this is a spurious DNA trap, it might cause
oops with the recent fpu code changes.

This is the code sequence  that is probably causing this problem:

a) new app is getting exec'd and it is somewhere in between
   start_thread() and flush_old_exec() in the load_xyz_binary()

b) At pont "a", task's fpu state (like TS_USEDFPU, used_math() etc) is
   cleared.

c) Now we get an interrupt/softirq which starts using these encrypt/decrypt
   routines in the network stack. This generates a math fault (as
   cr0.ts is '1') which sets TS_USEDFPU and restores the math that is
   in the task's xstate.

d) Return to exec code path, which does start_thread() which does
   free_thread_xstate() and sets xstate pointer to NULL while
   the TS_USEDFPU is still set.

e) At the next context switch from the new exec'd task to another task,
   we have a scenarios where TS_USEDFPU is set but xstate pointer is null.
   This can cause an oops during unlazy_fpu() in __switch_to()

Now:

1) This should happen with or with out pre-emption. Viro also encountered
   similar problem with out CONFIG_PREEMPT.

2) kernel_fpu_begin() and kernel_fpu_end() will fix this problem, because
   kernel_fpu_begin() will manually do a clts() and won't run in to the
   situation of setting TS_USEDFPU in step "c" above.

3) This was working before the fpu changes, because its a spurious
   math fault  which doesn't corrupt any fpu/sse registers and the task's
   math state was always in an allocated state.

With out the recent lazy fpu allocation changes, while we don't see oops,
there is a possible race still present in older kernels(for example,
while kernel is using kernel_fpu_begin() in some optimized clear/copy
page and an interrupt/softirq happens which uses these padlock
instructions generating DNA fault).

This is the failing scenario that existed even before the lazy fpu allocation
changes:

0. CPU's TS flag is set

1. kernel using FPU in some optimized copy  routine and while doing
kernel_fpu_begin() takes an interrupt just before doing clts()

2. Takes an interrupt and ipsec uses padlock instruction. And we
take a DNA fault as TS flag is still set.

3. We handle the DNA fault and set TS_USEDFPU and clear cr0.ts

4. We complete the padlock routine

5. Go back to step-1, which resumes clts() in kernel_fpu_begin(), finishes
the optimized copy routine and does kernel_fpu_end(). At this point,
we have cr0.ts again set to '1' but the task's TS_USEFPU is stilll
set and not cleared.

6. Now kernel resumes its user operation. And at the next context
switch, kernel sees it has do a FP save as TS_USEDFPU is still set
and then will do a unlazy_fpu() in __switch_to(). unlazy_fpu()
will take a DNA fault, as cr0.ts is '1' and now, because we are
in __switch_to(), math_state_restore() will get confused and will
restore the next task's FP state and will save it in prev tasks's FP state.
Remember, in __switch_to() we are already on the stack of the next task
but take a DNA fault for the prev task.

This causes the fpu leakage.

Fix the padlock instruction usage by calling them inside the
context of new routines irq_ts_save/restore(), which clear/restore cr0.ts
manually in the interrupt context. This will not generate spurious DNA
in the  context of the interrupt which will fix the oops encountered and
the possible FPU leakage issue.

Reported-and-bisected-by: Wolfgang Walter <wolfgang.walter@stwm.de>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-08-13 22:02:26 +10:00
Ingo Molnar a12e61df4f Merge commit 'v2.6.27-rc3' into x86/urgent 2008-08-13 13:08:47 +02:00
Ingo Molnar 26d809af63 x86: fix xsave build error
fix this build failure with certain glibc versions:

In file included from /usr/include/bits/sigcontext.h:28,
                 from /usr/include/signal.h:333,
                 from Documentation/accounting/getdelays.c:24:
/home/mingo/tip/usr/include/asm/sigcontext.h:191: error: expected specifier-qualifier-list before ‘u64’

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-13 12:59:29 +02:00
Johannes Weiner 0ed89b06e4 x86: propagate new nonpanic bootmem macros to CONFIG_HAVE_ARCH_BOOTMEM_NODE
Commit 74768ed833 "page allocator: use no-panic variant of
alloc_bootmem() in alloc_large_system_hash()" introduced two new
_nopanic macros which are undefined for CONFIG_HAVE_ARCH_BOOTMEM_NODE.

Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: "Jan Beulich" <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-13 11:57:18 +02:00
Linus Torvalds 7019b1b500 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix 2.6.27rc1 cannot boot more than 8CPUs
  x86: make "apic" an early_param() on 32-bit, NULL check
  EFI, x86: fix function prototype
  x86, pci-calgary: fix function declaration
  x86: work around gcc 3.4.x bug
  x86: make "apic" an early_param() on 32-bit
  x86, debug: tone down arch/x86/kernel/mpparse.c debugging printk
  x86_64: restore the proper NR_IRQS define so larger systems work.
  x86: Restore proper vector locking during cpu hotplug
  x86: Fix broken VMI in 2.6.27-rc..
  x86: fdiv bug detection fix
2008-08-11 16:44:35 -07:00
Randy Dunlap b0fbaa6b59 EFI, x86: fix function prototype
Fix function prototype in header file to match source code:

linux-next-20080807/arch/x86/kernel/efi_64.c💯14: error: symbol 'efi_ioremap' redeclared with different type (originally declared at include2/asm/efi.h:89) - different address spaces

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 18:50:19 +02:00
Cyrill Gorcunov 2ae111cdd8 x86: apic interrupts - move assignments to irqinit_32.c, v2
64bit mode APIC interrupt handlers are set within irqinit_64.c.
Lets do tha same for 32bit mode which would help in furter code merging.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 16:43:09 +02:00
Ingo Molnar 6de9c70882 Merge branch 'linus' into x86/cleanups 2008-08-11 12:57:01 +02:00
Ingo Molnar 8067794bec Merge branch 'linus' into x86/x2apic
Conflicts:

	arch/x86/kernel/genapic_64.c

Manual merge:

	arch/x86/kernel/genx2apic_uv_x.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 11:19:20 +02:00
Eric W. Biederman 3c7569b284 x86_64: restore the proper NR_IRQS define so larger systems work.
As pointed out and tracked by Yinghai Lu <yhlu.kernel@gmail.com>:

 Dhaval Giani got:
 kernel BUG at arch/x86/kernel/io_apic_64.c:357!
 invalid opcode: 0000 [1] SMP
 CPU 24
 ...

his system (x3950) has 8 ioapic, irq > 256

This was caused by:

       commit 9b7dc567d0
       Author: Thomas Gleixner <tglx@linutronix.de>
       Date:   Fri May 2 20:10:09 2008 +0200

          x86: unify interrupt vector defines

          The interrupt vector defines are copied 4 times around with minimal
          differences. Move them all into asm-x86/irq_vectors.h

It appears that Thomas did not notice that x86_64 does something
completely different when he merge irq_vectors.h

We can solve this for 2.6.27 by simply reintroducing the old heuristic
for setting NR_IRQS on x86_64 to a usable value, which trivially removes
the regression.

Long term it would be nice to harmonize the handling of ioapic interrupts
of x86_32 and x86_64 so we don't have this kind of confusion.

Dhaval Giani <dhaval@linux.vnet.ibm.com> tested an earlier version of
this patch by YH which confirms simply increasing NR_IRQS fixes the
problem.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: Mike Travis <travis@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 10:39:04 +02:00
Eric W. Biederman d388e5fdc4 x86: Restore proper vector locking during cpu hotplug
Having cpu_online_map change during assign_irq_vector can result
in some really nasty and weird things happening.  The one that
bit me last time was accessing non existent per cpu memory for non
existent cpus.

This locking was removed in a sloppy x86_64 and x86_32 merge patch.

Guys can we please try and avoid subtly breaking x86 when we are
merging files together?

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-11 10:37:34 +02:00
Marcin Slusarz bafc1dae82 x86: mmconf: fix section mismatch warning
WARNING: arch/x86/kernel/built-in.o(.cpuinit.text+0x1591): Section mismatch in reference from the function init_amd() to the function .init.text:check_enable_amd_mmconf_dmi()
The function __cpuinit init_amd() references
a function __init check_enable_amd_mmconf_dmi().
If check_enable_amd_mmconf_dmi is only used by init_amd then
annotate check_enable_amd_mmconf_dmi with a matching annotation.

check_enable_amd_mmconf_dmi is only called from init_amd which is __cpuinit

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-10 21:13:08 -07:00
Linus Torvalds 84ff7a0012 Merge branch 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
  KVM: s390: Fix kvm on IBM System z10
  KVM: Advertise synchronized mmu support to userspace
  KVM: Synchronize guest physical memory map to host virtual memory map
  KVM: Allow browsing memslots with mmu_lock
  KVM: Allow reading aliases with mmu_lock
2008-08-01 12:48:16 -07:00
Ingo Molnar 4b336b0625 Merge branch 'x86/urgent' into x86/xen 2008-07-31 12:41:34 +02:00
Ingo Molnar 5fbf24659b Merge branch 'linus' into x86/xen 2008-07-31 12:38:04 +02:00
H. Peter Anvin 6152e4b1c9 x86, xsave: keep the XSAVE feature mask as an u64
The XSAVE feature mask is a 64-bit number; keep it that way, in order
to avoid the mistake done with rdmsr/wrmsr.  Use the xsetbv() function
provided in the previous patch.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30 19:50:35 +02:00
H. Peter Anvin b4a091a62c x86, xsave: add <asm/xcr.h> header file for XCR registers
Add <asm-x86/xcr.h> header file for the XCR registers and their access
functions.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30 19:49:36 +02:00
Suresh Siddha c37b5efea4 x86, xsave: save/restore the extended state context in sigframe
On cpu's supporting xsave/xrstor, fpstate pointer in the sigcontext, will
include the extended state information along with fpstate information. Presence
of extended state information is indicated by the presence
of FP_XSTATE_MAGIC1 at fpstate.sw_reserved.magic1 and FP_XSTATE_MAGIC2
at fpstate + (fpstate.sw_reserved.extended_size - FP_XSTATE_MAGIC2_SIZE).

Extended feature bit mask that is saved in the memory layout is represented
by the fpstate.sw_reserved.xstate_bv

For RT signal frames, UC_FP_XSTATE in the uc_flags also indicate the
presence of extended state information in the sigcontext's fpstate
pointer.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30 19:49:27 +02:00
Suresh Siddha bdd8caba5e x86, xsave: struct _fpstate extensions to include extended state information
Bytes 464..511 in the current 512byte layout of fxsave/fxrstor
frame, are reserved for SW usage. On cpu's supporting xsave/xrstor, these bytes
are used to extended the fpstate pointer in the sigcontext, which now includes
the extended state information along with fpstate information.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30 19:49:27 +02:00
Suresh Siddha 9dc89c0f96 x86, xsave: xsave/xrstor specific routines
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30 19:49:26 +02:00
Suresh Siddha ab5137015f x86, xsave: reorganization of signal save/restore fpstate code layout
move 64bit routines that saves/restores fpstate in/from user stack from
signal_64.c to xsave.c

restore_i387_xstate() now handles the condition when user passes
NULL fpstate.

Other misc changes for prepartion of xsave/xrstor sigcontext support.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30 19:49:26 +02:00
Suresh Siddha 3c1c7f1014 x86, xsave: dynamically allocate sigframes fpstate instead of static allocation
dynamically allocate fpstate on the stack, instead of static allocation
in the current sigframe layout on the user stack. This will allow the
fpstate structure to grow in the future, which includes extended state
information supporting xsave/xrstor.

signal handlers will be able to access the fpstate pointer from the
sigcontext structure asusual, with no change. For the non RT sigframe's
(which are supported only for 32bit apps), current static fpstate layout
in the sigframe will be unused(so that we don't change the extramask[]
offset in the sigframe and thus prevent breaking app's which modify
extramask[]).

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30 19:49:25 +02:00
Suresh Siddha b359e8a434 x86, xsave: context switch support using xsave/xrstor
Uses xsave/xrstor (instead of traditional fxsave/fxrstor) in context switch
when available.

Introduces TS_XSAVE flag, which determine the need to use xsave/xrstor
instructions during context switch instead of the legacy fxsave/fxrstor
instructions. Thread-synchronous status word is already in L1 cache during
this code patch and thus minimizes the performance penality compared to
(cpu_has_xsave) checks.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30 19:49:24 +02:00
Suresh Siddha dc1e35c6e9 x86, xsave: enable xsave/xrstor on cpus with xsave support
Enables xsave/xrstor by turning on cr4.osxsave on cpu's which have
the xsave support. For now, features that OS supports/enabled are
FP and SSE.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30 19:49:24 +02:00
Suresh Siddha a648bf4632 x86, xsave: xsave cpuid feature bits
Add xsave CPU feature bits.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30 19:49:23 +02:00
Ingo Molnar 15dd859cac Merge commit 'v2.6.27-rc1' into x86/core
Conflicts:

	include/asm-x86/dma-mapping.h
	include/asm-x86/namei.h
	include/asm-x86/uaccess.h

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30 19:33:48 +02:00
Ingo Molnar b2d9d33412 Merge branch 'x86/fpu' into x86/core 2008-07-30 19:32:39 +02:00
FUJITA Tomonori 8978b74253 generic, x86: fix add iommu_num_pages helper function
This IOMMU helper function doesn't work for some architectures:

  http://marc.info/?l=linux-kernel&m=121699304403202&w=2

It also breaks POWER and SPARC builds:

  http://marc.info/?l=linux-kernel&m=121730388001890&w=2

Currently, only x86 IOMMUs use this so let's move it to x86 for
now.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-29 12:12:48 +02:00
Ingo Molnar 3825c9e8d0 Merge commit 'v2.6.27-rc1' into x86/microcode
Conflicts:

	arch/x86/kernel/microcode.c

Manual resolutions:

	arch/x86/kernel/microcode_amd.c
	arch/x86/kernel/microcode_intel.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-29 11:54:24 +02:00
Andrea Arcangeli e930bffe95 KVM: Synchronize guest physical memory map to host virtual memory map
Synchronize changes to host virtual addresses which are part of
a KVM memory slot to the KVM shadow mmu.  This allows pte operations
like swapping, page migration, and madvise() to transparently work
with KVM.

Signed-off-by: Andrea Arcangeli <andrea@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-29 12:33:53 +03:00
Ingo Molnar cb28a1bbdb Merge branch 'linus' into core/generic-dma-coherent
Conflicts:

	arch/x86/Kconfig

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-29 00:07:55 +02:00
Peter Oruba 8d86f390d9 x86: major refactoring
Refactored code by introducing a two-module solution.

There is one general module in which vendor specific modules can hook into.
However, that is exclusive, there is only one vendor specific module
allowed at a time. A CPU vendor check makes sure only the correct
module for the underlying system gets called.

Functinally in terms of patch loading itself there are no changes. This
refactoring provides a basis for future implementations of other vendors'
patch loaders.

Signed-off-by: Peter Oruba <peter.oruba@amd.com>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28 19:57:57 +02:00
Peter Oruba 26bf7a48c3 x86: first step of refactoring, introducing microcode_ops
Refactoring with the goal of having one general module and separate
vendor specific modules that hook into the general one.

Microcode_ops is a function pointer structure in which vendor
specific modules will enter all functions that differ between
vendors and that need to be accessed from the general module.

Signed-off-by: Peter Oruba <peter.oruba@amd.com>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28 19:57:57 +02:00
Peter Oruba 9835fd4ad9 x86: add AMD specific declarations
Added AMD specific declarations to header file.

Signed-off-by: Peter Oruba <peter.oruba@amd.com>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28 19:57:56 +02:00
Peter Oruba d4ee366868 x86: structure declaration renaming
Renamed common structures to vendor specific naming scheme
so other vendors will be able to use the same naming
convention.

Signed-off-by: Peter Oruba <peter.oruba@amd.com>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28 19:57:55 +02:00
Peter Oruba c3b71bcec0 x86: move per CPU microcode structure declaration to header file
This structure will be later used by other modules as well and
needs therfore to be moved out to a header file.

Signed-off-by: Peter Oruba <peter.oruba@amd.com>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28 19:57:54 +02:00
Peter Oruba 8e61028dfd x86: typedef removal
Removed typedefs. No functional changes to the code.

Signed-off-by: Peter Oruba <peter.oruba@amd.com>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28 19:57:53 +02:00
Peter Oruba 9a56a0f80b x86: moved Intel microcode patch loader declarations to seperate header file
Intel specific microcode declarations have been moved to a seperate header file.
There are no code changes to the code itself and no side effects to other parts.

Signed-off-by: Peter Oruba <peter.oruba@amd.com>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28 19:57:52 +02:00
Ingo Molnar 71998e83c5 Merge branch 'x86-tracehook' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace into x86/tracehook 2008-07-28 17:03:43 +02:00
Ingo Molnar b7d0b67845 Merge branch 'linus' into x86/cpu
Conflicts:

	arch/x86/kernel/cpu/intel_cacheinfo.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28 16:26:31 +02:00
Ingo Molnar a403e45c3b Merge core/lib: pick up memparse() change.
Merge branch 'core/lib' into x86/xen
2008-07-28 15:07:54 +02:00
Jeremy Fitzhardinge 64f53a0492 x86: fix initialization of 'l' bit in ldt descriptors
Make sure that fill_ldt() initializes the 'l' bit in the descriptor.
It always sets it to 0, ignoring 'lm' in user_desc, preserving original
x86_64 behaviour.

Previously it was leaving 'l' uninitialized.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Glauber de Oliveira Costa <gcosta@redhat.com>
2008-07-28 14:26:26 +02:00
Joerg Roedel 5f4cb662a0 KVM: SVM: allow enabling/disabling NPT by reloading only the architecture module
If NPT is enabled after loading both KVM modules on AMD and it should be
disabled, both KVM modules must be reloaded. If only the architecture module is
reloaded the behavior is undefined. With this patch it is possible to disable
NPT only by reloading the kvm_amd module.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 11:34:09 +03:00
Al Viro 7f2da1e7d0 [PATCH] kill altroot
long overdue...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:20 -04:00
Roland McGrath 59e52130f0 x86: tracehook: TIF_NOTIFY_RESUME
This adds TIF_NOTIFY_RESUME support for x86, both 64-bit and 32-bit.
When set, we call tracehook_notify_resume() on the way to user mode.

Signed-off-by: Roland McGrath <roland@redhat.com>
2008-07-26 14:38:05 -07:00
Roland McGrath 68bd0f4ef7 x86: tracehook: asm/syscall.h
Add asm/syscall.h for x86 with all the required entry points.
This will allow arch-independent tracing code for system calls.

Signed-off-by: Roland McGrath <roland@redhat.com>
2008-07-26 14:38:01 -07:00
Linus Torvalds fb3b806144 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, AMD IOMMU: include amd_iommu_last_bdf in device initialization
  x86: fix IBM Summit based systems' phys_cpu_present_map on 32-bit kernels
  x86, RDC321x: remove gpio.h complications
  x86, RDC321x: add to mach-default
  crashdump: fix undefined reference to `elfcorehdr_addr'
  flag parameters: fix compile error of sys_epoll_create1
2008-07-26 13:25:05 -07:00
Nick Piggin 8174c430e4 x86: lockless get_user_pages_fast()
Implement get_user_pages_fast without locking in the fastpath on x86.

Do an optimistic lockless pagetable walk, without taking mmap_sem or any
page table locks or even mmap_sem.  Page table existence is guaranteed by
turning interrupts off (combined with the fact that we're always looking
up the current mm, means we can do the lockless page table walk within the
constraints of the TLB shootdown design).  Basically we can do this
lockless pagetable walk in a similar manner to the way the CPU's pagetable
walker does not have to take any locks to find present ptes.

This patch (combined with the subsequent ones to convert direct IO to use
it) was found to give about 10% performance improvement on a 2 socket 8
core Intel Xeon system running an OLTP workload on DB2 v9.5

 "To test the effects of the patch, an OLTP workload was run on an IBM
  x3850 M2 server with 2 processors (quad-core Intel Xeon processors at
  2.93 GHz) using IBM DB2 v9.5 running Linux 2.6.24rc7 kernel.  Comparing
  runs with and without the patch resulted in an overall performance
  benefit of ~9.8%.  Correspondingly, oprofiles showed that samples from
  __up_read and __down_read routines that is seen during thread contention
  for system resources was reduced from 2.8% down to .05%.  Monitoring the
  /proc/vmstat output from the patched run showed that the counter for
  fast_gup contained a very high number while the fast_gup_slow value was
  zero."

(fast_gup is the old name for get_user_pages_fast, fast_gup_slow is a
counter we had for the number of times the slowpath was invoked).

The main reason for the improvement is that DB2 has multiple threads each
issuing direct-IO.  Direct-IO uses get_user_pages, and thus the threads
contend the mmap_sem cacheline, and can also contend on page table locks.

I would anticipate larger performance gains on larger systems, however I
think DB2 uses an adaptive mix of threads and processes, so it could be
that thread contention remains pretty constant as machine size increases.
In which case, we stuck with "only" a 10% gain.

The downside of using get_user_pages_fast is that if there is not a pte
with the correct permissions for the access, we end up falling back to
get_user_pages and so the get_user_pages_fast is a bit of extra work.
However this should not be the common case in most performance critical
code.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: Kconfig fix]
[akpm@linux-foundation.org: Makefile fix/cleanup]
[akpm@linux-foundation.org: warning fix]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Zach Brown <zach.brown@oracle.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:06 -07:00
Nick Piggin a0a8f5364a x86: implement pte_special
Implement the pte_special bit for x86.  This is required to support
lockless get_user_pages, because we need to know whether or not we can
refcount a particular page given only its pte (and no vma).

[hugh@veritas.com: fix a BUG]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Zach Brown <zach.brown@oracle.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:05 -07:00
Huang Ying 3ab8352137 kexec jump
This patch provides an enhancement to kexec/kdump.  It implements the
following features:

- Backup/restore memory used by the original kernel before/after
  kexec.

- Save/restore CPU state before/after kexec.

The features of this patch can be used as a general method to call program in
physical mode (paging turning off).  This can be used to call BIOS code under
Linux.

kexec-tools needs to be patched to support kexec jump. The patches and
the precompiled kexec can be download from the following URL:

       source: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-src_git_kh10.tar.bz2
       patches: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-patches_git_kh10.tar.bz2
       binary: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec_git_kh10

Usage example of calling some physical mode code and return:

1. Compile and install patched kernel with following options selected:

CONFIG_X86_32=y
CONFIG_KEXEC=y
CONFIG_PM=y
CONFIG_KEXEC_JUMP=y

2. Build patched kexec-tool or download the pre-built one.

3. Build some physical mode executable named such as "phy_mode"

4. Boot kernel compiled in step 1.

5. Load physical mode executable with /sbin/kexec. The shell command
   line can be as follow:

   /sbin/kexec --load-preserve-context --args-none phy_mode

6. Call physical mode executable with following shell command line:

   /sbin/kexec -e

Implementation point:

To support jumping without reserving memory.  One shadow backup page (source
page) is allocated for each page used by kexeced code image (destination
page).  When do kexec_load, the image of kexeced code is loaded into source
pages, and before executing, the destination pages and the source pages are
swapped, so the contents of destination pages are backupped.  Before jumping
to the kexeced code image and after jumping back to the original kernel, the
destination pages and the source pages are swapped too.

C ABI (calling convention) is used as communication protocol between
kernel and called code.

A flag named KEXEC_PRESERVE_CONTEXT for sys_kexec_load is added to
indicate that the loaded kernel image is used for jumping back.

Now, only the i386 architecture is supported.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:04 -07:00
Alexis Bruemmer 1956a96de4 x86 calgary: fix handling of devices that aren't behind the Calgary
The calgary code can give drivers addresses above 4GB which is very bad
for hardware that is only 32bit DMA addressable.

With this patch, the calgary code sets the global dma_ops to swiotlb or
nommu properly, and the dma_ops of devices behind the Calgary/CalIOC2
to calgary_dma_ops.  So the calgary code can handle devices safely that
aren't behind the Calgary/CalIOC2.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Alexis Bruemmer <alexisb@us.ibm.com>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:03 -07:00
FUJITA Tomonori 8d8bb39b9e dma-mapping: add the device argument to dma_mapping_error()
Add per-device dma_mapping_ops support for CONFIG_X86_64 as POWER
architecture does:

This enables us to cleanly fix the Calgary IOMMU issue that some devices
are not behind the IOMMU (http://lkml.org/lkml/2008/5/8/423).

I think that per-device dma_mapping_ops support would be also helpful for
KVM people to support PCI passthrough but Andi thinks that this makes it
difficult to support the PCI passthrough (see the above thread).  So I
CC'ed this to KVM camp.  Comments are appreciated.

A pointer to dma_mapping_ops to struct dev_archdata is added.  If the
pointer is non NULL, DMA operations in asm/dma-mapping.h use it.  If it's
NULL, the system-wide dma_ops pointer is used as before.

If it's useful for KVM people, I plan to implement a mechanism to register
a hook called when a new pci (or dma capable) device is created (it works
with hot plugging).  It enables IOMMUs to set up an appropriate
dma_mapping_ops per device.

The major obstacle is that dma_mapping_error doesn't take a pointer to the
device unlike other DMA operations.  So x86 can't have dma_mapping_ops per
device.  Note all the POWER IOMMUs use the same dma_mapping_error function
so this is not a problem for POWER but x86 IOMMUs use different
dma_mapping_error functions.

The first patch adds the device argument to dma_mapping_error.  The patch
is trivial but large since it touches lots of drivers and dma-mapping.h in
all the architecture.

This patch:

dma_mapping_error() doesn't take a pointer to the device unlike other DMA
operations.  So we can't have dma_mapping_ops per device.

Note that POWER already has dma_mapping_ops per device but all the POWER
IOMMUs use the same dma_mapping_error function.  x86 IOMMUs use device
argument.

[akpm@linux-foundation.org: fix sge]
[akpm@linux-foundation.org: fix svc_rdma]
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix bnx2x]
[akpm@linux-foundation.org: fix s2io]
[akpm@linux-foundation.org: fix pasemi_mac]
[akpm@linux-foundation.org: fix sdhci]
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix sparc]
[akpm@linux-foundation.org: fix ibmvscsi]
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:03 -07:00
Ingo Molnar c3cc99ff5d Merge branch 'linus' into x86/xen 2008-07-26 17:48:49 +02:00
Suresh Siddha 6ffac1e90a x64, fpu: fix possible FPU leakage in error conditions
On Thu, Jul 24, 2008 at 03:43:44PM -0700, Linus Torvalds wrote:
> So how about this patch as a starting point? This is the RightThing(tm) to
> do regardless, and if it then makes it easier to do some other cleanups,
> we should do it first. What do you think?

restore_fpu_checking() calls init_fpu() in error conditions.

While this is wrong(as our main intention is to clear the fpu state of
the thread), this was benign before commit 92d140e21f ("x86: fix taking
DNA during 64bit sigreturn").

Post commit 92d140e21f, live FPU registers may not belong to this
process at this error scenario.

In the error condition for restore_fpu_checking() (especially during the
64bit signal return), we are doing init_fpu(), which saves the live FPU
register state (possibly belonging to some other process context) into
the thread struct (through unlazy_fpu() in init_fpu()). This is wrong
and can leak the FPU data.

For the signal handler restore error condition in restore_i387(), clear
the fpu state present in the thread struct(before ultimately sending a
SIGSEGV for badframe).

For the paranoid error condition check in math_state_restore(), send a
SIGSEGV, if we fail to restore the state.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: <stable@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-26 16:37:04 +02:00