Commit Graph

9338 Commits

Author SHA1 Message Date
Joerg Roedel be83129771 x86/amd-iommu: attach devices to pre-allocated domains early
For some devices the ACPI table may define unity map
requirements which must be met when the IOMMU is enabled. So
we need to attach devices to their domains as early as
possible so that these mappings are in place when needed.
This patch assigns the domains right after they are
allocated. Otherwise this can result in I/O page faults
before a driver binds to a device, while the BIOS is still using
it.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-11-23 12:54:17 +01:00
Joerg Roedel 9f800de38b x86/amd-iommu: un__init iommu_setup_msi
This function may be called on the resume path and can not
be dropped after booting.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-11-23 12:45:25 +01:00
Jan Beulich 0e7810be30 x86: Suppress stack overrun message for init_task
init_task doesn't get its stack end location set to
STACK_END_MAGIC, and hence the message is confusing
rather than helpful in this case.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <4B06AEFE02000078000211F4@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-23 11:45:34 +01:00
Ingo Molnar 6e3d8330ae perf events: Do not generate function trace entries in perf code
Decreases perf overhead when function tracing is enabled,
by about 50%.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-23 10:19:20 +01:00
Yinghai Lu 163d3866cf x86: apic: Print out SRAT table APIC id in hex
Make it consistent with APIC MADT print out,
for big systems APIC id in hex is more readable.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4B07A739.3030104@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-23 09:56:30 +01:00
Yinghai Lu 37ef2a3029 x86: Re-get cfg_new in case reuse/move irq_desc
When irq_desc is moved, we need to make sure to use the right cfg_new.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4B07A739.3030104@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-23 09:56:05 +01:00
Yinghai Lu e670761f12 x86: apic: Remove not needed #ifdef
Suresh made dmar_table_init() already have that protection.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4B07A739.3030104@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-23 09:54:15 +01:00
Ingo Molnar 96200591a3 Merge branch 'tracing/hw-breakpoints' into perf/core
Conflicts:
	arch/x86/kernel/kprobes.c
	kernel/trace/Makefile

Merge reason: hw-breakpoints perf integration is looking
              good in testing and in reviews, plus conflicts
              are mounting up - so merge & resolve.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-21 14:07:23 +01:00
Masami Hiramatsu 6f5f67267d x86: insn decoder test checks objdump version
Check objdump version before using it for insn decoder build test,
because some older objdump can't decode AVX code correctly.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Jim Keniston <jkenisto@us.ibm.com>
LKML-Reference: <20091120171314.6715.30390.stgit@dhcp-100-2-132.bos.redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-11-20 23:01:04 -08:00
Masami Hiramatsu 80509e27e4 x86: Fix insn decoder test typos
Fix postest_verbose to posttest_verbose, and add posttest_64bit option
for CONFIG_64BIT != y, since old command just passed '-' instead
of '-n' when CONFIG_64BIT is not set.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Jim Keniston <jkenisto@us.ibm.com>
LKML-Reference: <20091120171307.6715.66099.stgit@dhcp-100-2-132.bos.redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-11-20 22:59:36 -08:00
Masami Hiramatsu ce64c62074 x86: Instruction decoder test should generate build warning
Since some instructions are not decoded correctly by older
versions of objdump, they may cause false positive errors in the
insn decoder posttest.

This changes the build error of the insn decoder test to a build warning.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
LKML-Reference: <20091116230631.5250.41579.stgit@harusame>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-19 21:40:13 +01:00
David S. Miller 3505d1a9fd Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/sfc/sfe4001.c
	drivers/net/wireless/libertas/cmd.c
	drivers/staging/Kconfig
	drivers/staging/Makefile
	drivers/staging/rtl8187se/Kconfig
	drivers/staging/rtl8192e/Kconfig
2009-11-18 22:19:03 -08:00
Thomas Gleixner 6dbfe5a57d x86: Fixup last users of irq_chip->typename
The typename member of struct irq_chip was kept for migration purposes
and has been obsolete for more than 2 years. Fix up the leftovers.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-11-18 11:45:29 +01:00
Rusty Russell 8dca15e408 [CPUFREQ] speedstep-ich: fix error caused by 394122ab14
"[CPUFREQ] cpumask: avoid playing with cpus_allowed in speedstep-ich.c"
changed the code to mistakenly pass the current cpu as the "processor"
argument of speedstep_get_frequency(), whereas it should be the type of
the processor.

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=14340

Based on a patch by Dave Mueller.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Dominik Brodowski <linux@brodo.de>
Reported-by: Dave Mueller <dave.mueller@gmx.ch>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-11-17 23:15:04 -05:00
John Villalovos 293afe44d7 [CPUFREQ] acpi-cpufreq: blacklist Intel 0f68: Fix HT detection and put in notification message
Removing the SMT/HT check, since the Errata doesn't mention
Hyper-Threading.

Adding in a printk, so that the user knows why acpi-cpufreq refuses to
load.  Also, once system is blacklisted, don't repeat checks to see if
blacklisted.  This also causes the message to only be printed once,
rather than for each CPU.

Signed-off-by: John L. Villalovos <john.l.villalovos@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-11-17 23:15:03 -05:00
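The "check once, print once" behaviour described in the acpi-cpufreq commit above can be sketched in plain C. This is only an illustration under stated assumptions: the names `sketch_blacklisted`, `sketch_warnings` and `sketch_cpu_init()`, the boolean `cpu_has_erratum` input and the message text are all hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical state: set once the system has been found to be blacklisted. */
static bool sketch_blacklisted;
/* Hypothetical counter so the "printed once" property can be observed. */
static int sketch_warnings;

/*
 * Called once per CPU. Returns 0 if the driver may load, -1 otherwise.
 * cpu_has_erratum stands in for the real CPU identification check.
 */
static int sketch_cpu_init(bool cpu_has_erratum)
{
	/* Already decided: do not repeat the check or the message. */
	if (sketch_blacklisted)
		return -1;

	if (cpu_has_erratum) {
		sketch_blacklisted = true;
		sketch_warnings++;
		fprintf(stderr, "sketch: CPU is blacklisted, refusing to load\n");
		return -1;
	}
	return 0;
}
```

Calling `sketch_cpu_init(true)` for every CPU of a system refuses each of them but emits the message only for the first one.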
Roel Kluin c53614ec17 [CPUFREQ] powernow-k8: Fix test in get_transition_latency()
The logical not (!) makes it a bool before the comparison.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-11-17 23:15:03 -05:00
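The class of bug fixed by the powernow-k8 commit above comes from operator precedence: `!` binds tighter than `==`. A minimal sketch, assuming hypothetical helper names and arbitrary values (this is not the driver's code):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Buggy form. The parentheses show how the compiler parses
 * "!value == wanted": the negation happens first and yields 0 or 1,
 * which is then compared against wanted.
 */
static bool sketch_differs_buggy(unsigned int value, unsigned int wanted)
{
	return (!value) == wanted;
}

/* Intended form: compare the two values, then negate the result. */
static bool sketch_differs_fixed(unsigned int value, unsigned int wanted)
{
	return value != wanted;
}
```

For any `wanted` other than 0 or 1 the buggy form can never be true, so a test such as "is this value different from 0x11" silently becomes dead code.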
Krzysztof Helt f7f3cad060 [CPUFREQ] longhaul: select Longhaul version 2 for capable CPUs
There is a typo in the longhaul detection code so only Longhaul v1 or Longhaul v3
is selected. Longhaul v2 is not selected even for CPUs which are capable of it.

Tested on PCChips Giga Pro board. Frequency changes work and the Longhaul v2
detects that the board is not capable of changing CPU voltage.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-11-17 23:15:03 -05:00
Ingo Molnar a7b63425a4 Merge branch 'perf/core' into perf/probes
Resolved merge conflict in tools/perf/Makefile

Merge reason: we want to queue up a dependent patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-17 10:17:47 +01:00
Eric W. Biederman bb9074ff58 Merge commit 'v2.6.32-rc7'
Resolve the conflict between v2.6.32-rc7 where dn_def_dev_handler
gets a small bug fix and the sysctl tree where I am removing all
sysctl strategy routines.
2009-11-17 01:01:34 -08:00
Ingo Molnar 123bf0e2ed x86: gart: Clean up the code a bit
Clean up various small stylistic details in the GART code. No
functionality changed.

Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: muli@il.ibm.com
Cc: joerg.roedel@amd.com
LKML-Reference: <1258287594-8777-2-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-17 07:57:00 +01:00
FUJITA Tomonori 1f7564ca83 x86: Calgary: Remove unnecessary DMA_ERROR_CODE usage
This cleans up iommu_alloc() a bit and removes unnecessary
DMA_ERROR_CODE usage.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: muli@il.ibm.com
Cc: joerg.roedel@amd.com
LKML-Reference: <1258287594-8777-4-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-17 07:53:21 +01:00
FUJITA Tomonori 8fd524b355 x86: Kill bad_dma_address variable
This kills bad_dma_address variable, the old mechanism to enable
IOMMU drivers to make dma_mapping_error() work in IOMMU's
specific way.

bad_dma_address variable was introduced to enable IOMMU drivers
to make dma_mapping_error() work in IOMMU's specific way.
However, it can't handle systems that use both swiotlb and HW
IOMMU. So we introduced dma_map_ops->mapping_error to solve that
case.

Intel VT-d, GART, and swiotlb already use
dma_map_ops->mapping_error. Calgary, AMD IOMMU, and nommu use
zero for an error dma address. This adds DMA_ERROR_CODE and
converts them to use it (as SPARC and POWER do).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: muli@il.ibm.com
Cc: joerg.roedel@amd.com
LKML-Reference: <1258287594-8777-3-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-17 07:53:21 +01:00
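The two error-reporting mechanisms named in the commit above (a per-backend ->mapping_error callback, and a shared DMA_ERROR_CODE of zero for backends without one) can be sketched in userspace C. Everything prefixed `sketch_`/`SKETCH_` is hypothetical, including the sentinel address used by the callback backend; the real kernel types and signatures are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sketch_dma_addr_t;

/* Shared error value for backends without their own callback (zero, as in the commit text). */
#define SKETCH_DMA_ERROR_CODE ((sketch_dma_addr_t)0)

struct sketch_dma_map_ops {
	/* Optional backend-specific check; NULL means "use the shared error code". */
	int (*mapping_error)(sketch_dma_addr_t addr);
};

/* Generic check: prefer the backend callback, fall back to the shared code. */
static int sketch_dma_mapping_error(const struct sketch_dma_map_ops *ops,
				    sketch_dma_addr_t addr)
{
	if (ops->mapping_error)
		return ops->mapping_error(addr);
	return addr == SKETCH_DMA_ERROR_CODE;
}

/* Hypothetical backend with its own sentinel address. */
#define SKETCH_BACKEND_BAD_ADDR ((sketch_dma_addr_t)0x1000)

static int sketch_backend_mapping_error(sketch_dma_addr_t addr)
{
	return addr == SKETCH_BACKEND_BAD_ADDR;
}

static const struct sketch_dma_map_ops sketch_callback_ops = {
	.mapping_error = sketch_backend_mapping_error,
};

static const struct sketch_dma_map_ops sketch_plain_ops = {
	.mapping_error = NULL,
};
```

Because the decision is made per ops table, two backends with different notions of a bad address can coexist, which a single global variable cannot express.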
FUJITA Tomonori 42109197eb x86: gart: Add own dma_mapping_error function
GART IOMMU is the only user of bad_dma_address variable.

This patch converts GART to use the newer mechanism, fill in
->mapping_error() in struct dma_map_ops, to make
dma_mapping_error() work in IOMMU specific way.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: muli@il.ibm.com
Cc: joerg.roedel@amd.com
LKML-Reference: <1258287594-8777-2-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-17 07:53:20 +01:00
Ingo Molnar 99f4c9de2b Merge commit 'v2.6.32-rc7' into core/iommu
Merge reason: Add fixes we'll depend on.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-17 07:51:07 +01:00
Masami Hiramatsu 35039eb6b1 x86: Show symbol name if insn decoder test failed
Show symbol name if insn decoder test finds a difference.
This will help us to find out where the issue is.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
LKML-Reference: <20091116230624.5250.49813.stgit@harusame>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-17 07:16:50 +01:00
Masami Hiramatsu d65ff75fbe x86: Add verbose option to insn decoder test
Add verbose option to insn decoder test. This dumps decoded
instruction when building kernel with V=1.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
LKML-Reference: <20091116230618.5250.18762.stgit@harusame>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-17 07:16:48 +01:00
Cyrill Gorcunov e79c65a97c x86: io-apic: IO-APIC MMIO should not fail on resource insertion
If the IO-APIC base address is 1K aligned, we should not fail
during the resource insertion procedure. To that end we define the
IO_APIC_SLOT_SIZE constant, which should cover all directly
accessible IO-APIC registers.

An example of such a configuration is here:

	http://marc.info/?l=linux-kernel&m=118114792006520

 |
 | Quoting the message
 |
 | IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
 | IOAPIC[1]: apic_id 3, version 32, address 0xfec80000, GSI 24-47
 | IOAPIC[2]: apic_id 4, version 32, address 0xfec80400, GSI 48-71
 | IOAPIC[3]: apic_id 5, version 32, address 0xfec84000, GSI 72-95
 | IOAPIC[4]: apic_id 8, version 32, address 0xfec84400, GSI 96-119
 |

Reported-by: "Maciej W. Rozycki" <macro@linux-mips.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20091116151426.GC5653@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-16 16:37:10 +01:00
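Why the window size matters for the addresses quoted in the commit above can be checked with a small overlap test. This is a sketch: the 1 KiB slot size is inferred from the "1K aligned" wording, the 4 KiB comparison value is purely hypothetical, and `sketch_ranges_overlap()` is not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: one IO-APIC register window occupies 1 KiB. */
#define SKETCH_IO_APIC_SLOT_SIZE 1024u

/* True if [a, a + size) and [b, b + size) share at least one byte. */
static bool sketch_ranges_overlap(uint64_t a, uint64_t b, uint64_t size)
{
	return a < b + size && b < a + size;
}
```

With the quoted bases 0xfec80000 and 0xfec80400, `sketch_ranges_overlap(0xfec80000, 0xfec80400, SKETCH_IO_APIC_SLOT_SIZE)` is false, so both resources can be inserted. Reserving a larger window per IO-APIC, for example 4096 bytes, would make the two ranges collide.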
Frederic Weisbecker 3c93ca00ee x86: Add missing might_fault() checks to copy_{to,from}_user()
On x86-64, copy_{to,from}_user() rely on assembly routines that
never call might_fault(), making us miss various lockdep
checks.

This doesn't apply to __copy_{from,to}_user(), which explicitly
handle these calls, nor is it a problem on x86-32, where
copy_{to,from}_user() rely on the "__" prefixed versions that
also call might_fault().

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1258382538-30979-1-git-send-email-fweisbec@gmail.com>
[ v2: fix module export ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-16 16:09:52 +01:00
Hiroshi Shimamoto 62ad33f670 x86: Don't put iommu_shutdown_noop() in init section
It causes kernel panic on shutdown or reboot.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
LKML-Reference: <4B00BC8E.50801@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-16 08:58:51 +01:00
Ingo Molnar 39dc78b651 Merge commit 'v2.6.32-rc7' into perf/core
Merge reason: pick up perf fixlets

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-15 09:50:41 +01:00
Jan Beulich 1472248583 x86-64: __copy_from_user_inatomic() adjustments
This v2.6.26 commit:

    ad2fc2c: x86: fix copy_user on x86

rendered __copy_from_user_inatomic() identical to
copy_user_generic(), yet didn't make the former just call the
latter from an inline function.

Furthermore, this v2.6.19 commit:

    b885808: [PATCH] Add proper sparse __user casts to __copy_to_user_inatomic

converted the return type of __copy_to_user_inatomic() from
unsigned long to int, but didn't do the same to
__copy_from_user_inatomic().

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: <v.mayatskih@gmail.com>
LKML-Reference: <4AFD5778020000780001F8F4@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-15 09:29:47 +01:00
FUJITA Tomonori f4131c6259 x86: Make calgary_iommu_init() static
This makes calgary_iommu_init() static and moves it to remove
the forward declaration.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: muli@il.ibm.com
LKML-Reference: <20091114212603U.fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-15 09:04:14 +01:00
FUJITA Tomonori 6959450e56 swiotlb: Remove duplicate swiotlb_force extern declarations
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: tony.luck@intel.com
LKML-Reference: <1258199198-16657-4-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-15 09:03:10 +01:00
FUJITA Tomonori 94a15564ac x86: Move iommu_shutdown_noop to x86_init.c
iommu_init_noop() is in arch/x86/kernel/x86_init.c but
iommu_shutdown_noop() in arch/x86/include/asm/iommu.h.

This moves iommu_shutdown_noop() to x86_init.c for consistency.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
LKML-Reference: <1258199198-16657-3-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-15 09:03:10 +01:00
FUJITA Tomonori a3b28ee109 x86: Set dma_ops to nommu_dma_ops by default
We set dma_ops to nommu_dma_ops at two different places for
x86_32 and x86_64. This unifies them by setting dma_ops to
nommu_dma_ops by default.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
LKML-Reference: <1258199198-16657-2-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-15 09:03:09 +01:00
Ingo Molnar 68efa37df7 hw-breakpoints, x86: Fix modular KVM build
This build error:

arch/x86/kvm/x86.c:3655: error: implicit declaration of function 'hw_breakpoint_restore'

Happens because in the CONFIG_KVM=m case there's no 'CONFIG_KVM' define
in the kernel - it's CONFIG_KVM_MODULE in that case.

Make the prototype available unconditionally.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Prasad <prasad@linux.vnet.ibm.com>
LKML-Reference: <1258114575-32655-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-14 15:32:53 +01:00
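The preprocessor behaviour behind the modular KVM build error above can be reproduced in isolation. The sketch below simulates a CONFIG_KVM=m build by defining only CONFIG_KVM_MODULE; the `SKETCH_` macros and the two helper functions are hypothetical and exist only to make the result observable.

```c
#include <assert.h>

/* Simulate CONFIG_KVM=m: only the _MODULE variant is defined. */
#define CONFIG_KVM_MODULE 1

/* A declaration guarded like this is invisible in the modular case. */
#ifdef CONFIG_KVM
#define SKETCH_VISIBLE_PLAIN 1
#else
#define SKETCH_VISIBLE_PLAIN 0
#endif

/* Testing both symbols covers built-in and modular configurations. */
#if defined(CONFIG_KVM) || defined(CONFIG_KVM_MODULE)
#define SKETCH_VISIBLE_EITHER 1
#else
#define SKETCH_VISIBLE_EITHER 0
#endif

static int sketch_visible_plain(void)
{
	return SKETCH_VISIBLE_PLAIN;
}

static int sketch_visible_either(void)
{
	return SKETCH_VISIBLE_EITHER;
}
```

The commit itself takes the simpler route of making the prototype available unconditionally, which avoids the guard altogether.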
Ingo Molnar 31c997cac7 x86: Fix cpu_devs[] initialization in early_cpu_init()
Yinghai Lu noticed that this commit:

  0388423: x86: Minimise printk spew from per-vendor init code

mistakenly left out the initialization of cpu_devs[] in the
!PROCESSOR_SELECT case. Fix it.

Reported-by: Yinghai Lu <yinghai@kernel.org>
Cc: Dave Jones <davej@redhat.com>
LKML-Reference: <20091113203000.GA19160@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-14 10:36:50 +01:00
Roland Dreier b01c845f0f x86: Remove CPU cache size output for non-Intel too
As Dave Jones said about the output in intel_cacheinfo.c: "They
aren't useful, and pollute the dmesg output a lot (especially on
machines with many cores).  Also the same information can be
trivially found out from userspace."

Give the generic display_cacheinfo() function the same treatment.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Dave Jones <davej@redhat.com>
Cc: Mike Travis <travis@sgi.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <adaocn6dp99.fsf_-_@roland-alpha.cisco.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-14 01:51:18 +01:00
Dave Jones 0388423dba x86: Minimise printk spew from per-vendor init code
In the default case where the kernel supports all CPU vendors,
we currently print out a bunch of not useful messages on every
system.

32-bit:
KERNEL supported cpus:
  Intel GenuineIntel
  AMD AuthenticAMD
  NSC Geode by NSC
  Cyrix CyrixInstead
  Centaur CentaurHauls
  Transmeta GenuineTMx86
  Transmeta TransmetaCPU
  UMC UMC UMC UMC

64-bit:
KERNEL supported cpus:
  Intel GenuineIntel
  AMD AuthenticAMD
  Centaur CentaurHauls

Given that "what CPUs does the kernel support" isn't useful for
the "support everything" case, we can suppress these printk's.

Signed-off-by: Dave Jones <davej@redhat.com>
LKML-Reference: <20091113203000.GA19160@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-14 01:18:05 +01:00
Dave Jones 15cd8812ab x86: Remove the CPU cache size printk's
They aren't really useful, and they pollute the dmesg output a lot
(especially on machines with many cores).

Also the same information can be trivially found out from
userspace.

Reported-by: Mike Travis <travis@sgi.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Roland Dreier <rdreier@cisco.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091112231542.GA7129@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-13 09:14:55 +01:00
Eric W. Biederman 24a065624d sysctl x86: Remove dead binary sysctl support
Now that sys_sysctl is a generic wrapper around /proc/sys  .ctl_name
and .strategy members of sysctl tables are dead code.  Remove them.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2009-11-12 02:05:04 -08:00
Hiroshi Shimamoto db48cccc7c perf_event, x86: Annotate init functions and data
Annotate init functions and data with __init and __initconst.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@gmail.com>
LKML-Reference: <4AFB721E.8070203@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-12 09:18:36 +01:00
Hidetoshi Seto cffd377e58 x86, mce: Fix __init annotations
intel_init_thermal() is called from the resume path, so it
cannot be marked as __init.

OTOH mce_banks_init() is only called from
__mcheck_cpu_cap_init(), which is marked as __cpuinit, so it can
also be marked as __cpuinit.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Yong Wang <yong.y.wang@linux.intel.com>
LKML-Reference: <4AFBB0B8.2070501@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-12 09:17:11 +01:00
Linus Torvalds 55871bdd03 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  x86/PCI: Adjust GFP mask handling for coherent allocations
  PCI ASPM: fix oops on root port removal
2009-11-11 11:34:14 -08:00
Linus Torvalds 605f37504f Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, amd-ucode: Check UCODE_MAGIC before loading the container file
  x86: Fix error return sequence in __ioremap_caller()
  x86: Add Phoenix/MSC BIOSes to lowmem corruption list
2009-11-11 11:29:10 -08:00
FUJITA Tomonori b18485e7ac swiotlb: Remove the swiotlb variable usage
POWERPC doesn't expect it to be used.

This fixes the linux-next build failure reported by
Stephen Rothwell:

  lib/swiotlb.c: In function 'setup_io_tlb_npages':
  lib/swiotlb.c:114: error: 'swiotlb' undeclared (first use in this function)

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: peterz@infradead.org
LKML-Reference: <20091112000258F.fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-11 16:51:18 +01:00
Yong Wang ce6b5d768c x86: Mark the thermal init functions __init
Mark the thermal init functions __init so that the init memory
can be freed.

Signed-off-by: Yong Wang <yong.y.wang@intel.com>
LKML-Reference: <20091111075125.GA17900@ywang-moblin2.bj.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-11 12:33:32 +01:00
Dimitri Sivanich 200a9ae280 x86: Remove asm/apicnum.h
arch/x86/include/asm/apicnum.h is not referenced anywhere
anymore. Its definitions appear in apicdef.h. Remove it.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Mike Travis <travis@sgi.com>
LKML-Reference: <20091110195835.GA4393@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10 22:07:35 +01:00
Ingo Molnar b4941a9a60 x86: Add iommu_init to x86_init_ops, fix build
Most of the time x86_init.h is included in pci-dma.c - but not always,
leading to this rare build failure:

arch/x86/kernel/pci-dma.c:296: error: 'x86_init' undeclared (first use in this function)

So include asm/x86_init.h explicitly.

Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: chrisw@sous-sol.org
Cc: dwmw2@infradead.org
Cc: joerg.roedel@amd.com
Cc: muli@il.ibm.com
LKML-Reference: <1257849980-22640-2-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10 14:37:58 +01:00
FUJITA Tomonori 72d03802b8 x86, 32-bit: Fix swiotlb boot crash
Ingo Molnar reported this boot crash:

[    8.655620] pata_amd 0000:00:06.0: version 0.4.1
[    8.660286] BUG: unable to handle kernel NULL pointer dereference at 00000034
[    8.663572] IP: [<c100617b>] dma_supported+0x3b/0xa4
[    8.663572] *pde = 00000000

Initialize dma_ops properly in the 32-bit case.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10 14:11:32 +01:00
FUJITA Tomonori 75f1cdf1dd x86: Handle HW IOMMU initialization failure gracefully
If HW IOMMU initialization fails (Intel VT-d often does this,
typically due to BIOS bugs), we fall back to nommu. It doesn't
work for the majority since nowadays we have more than 4GB
memory so we must use swiotlb instead of nommu.

The problem is that it's too late to initialize swiotlb when HW
IOMMU initialization fails. We need to allocate swiotlb memory
earlier from bootmem allocator. Chris explained the issue in
detail:

  http://marc.info/?l=linux-kernel&m=125657444317079&w=2

The current x86 IOMMU initialization sequence is too complicated
and handling the above issue makes it more hacky.

This patch changes x86 IOMMU initialization sequence to handle
the above issue cleanly.

The new x86 IOMMU initialization sequence is:

1. we initialize the swiotlb (and set swiotlb to 1) in the case
   of (max_pfn > MAX_DMA32_PFN && !no_iommu). dma_ops is set to
   swiotlb_dma_ops or nommu_dma_ops. If swiotlb usage is forced by
   the boot option, we finish here.

2. we call the detection functions of all the IOMMUs.

3. the detection function sets x86_init.iommu.iommu_init to the
   IOMMU initialization function (so we can avoid calling the
   initialization functions of all the IOMMUs needlessly).

4. if the IOMMU initialization function doesn't need swiotlb,
   it sets swiotlb to zero (e.g. when the initialization is
   successful).

5. if we find that swiotlb is set to zero, we free the swiotlb
   resources.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: chrisw@sous-sol.org
Cc: dwmw2@infradead.org
Cc: joerg.roedel@amd.com
Cc: muli@il.ibm.com
LKML-Reference: <1257849980-22640-10-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10 12:32:07 +01:00
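The five-step sequence in the commit above can be modelled as a small state machine. This is a userspace sketch under several assumptions: no_iommu is taken to be unset, `big_memory` stands in for the max_pfn > MAX_DMA32_PFN test, a forced swiotlb is assumed to allocate the bounce buffer as well, and every `sketch_` name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_dma_ops { SKETCH_NOMMU, SKETCH_SWIOTLB, SKETCH_HW_IOMMU };

struct sketch_boot {
	bool swiotlb;                 /* models the "swiotlb" flag */
	bool bounce_buf;              /* bounce memory taken from the early allocator */
	enum sketch_dma_ops dma_ops;
	int (*iommu_init)(struct sketch_boot *s);
};

/* Successful HW init: swiotlb is no longer needed. */
static int sketch_hw_init_ok(struct sketch_boot *s)
{
	s->swiotlb = false;
	s->dma_ops = SKETCH_HW_IOMMU;
	return 0;
}

/* Failed HW init: leave the swiotlb fallback in place. */
static int sketch_hw_init_fail(struct sketch_boot *s)
{
	(void)s;
	return -1;
}

static struct sketch_boot sketch_iommu_boot(bool big_memory, bool force_swiotlb,
					    bool hw_present, bool hw_works)
{
	struct sketch_boot s = { .dma_ops = SKETCH_NOMMU, .iommu_init = NULL };

	/* Step 1: set up swiotlb early, while early memory is still available. */
	if (big_memory || force_swiotlb) {
		s.swiotlb = true;
		s.bounce_buf = true;
		s.dma_ops = SKETCH_SWIOTLB;
	}
	if (force_swiotlb)
		return s;

	/* Steps 2 and 3: detection only records which init function to run. */
	if (hw_present)
		s.iommu_init = hw_works ? sketch_hw_init_ok : sketch_hw_init_fail;

	/* Step 4: run the one recorded init function, if any. */
	if (s.iommu_init)
		s.iommu_init(&s);

	/* Step 5: release the bounce buffer when swiotlb turned out to be unneeded. */
	if (!s.swiotlb)
		s.bounce_buf = false;

	return s;
}
```

The case this ordering is meant to handle is `sketch_iommu_boot(true, false, true, false)`: the hardware IOMMU is present but fails to initialize, and the machine still ends up with working swiotlb operations instead of falling back to nommu.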
FUJITA Tomonori ad32e8cb86 swiotlb: Defer swiotlb init printing, export swiotlb_print_info()
This enables us to avoid printing swiotlb memory info when we
initialize swiotlb. After swiotlb initialization, we could find
that we don't need swiotlb.

This patch removes the code to print swiotlb memory info in
swiotlb_init() and exports the function to do that.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: chrisw@sous-sol.org
Cc: dwmw2@infradead.org
Cc: joerg.roedel@amd.com
Cc: muli@il.ibm.com
Cc: tony.luck@intel.com
Cc: benh@kernel.crashing.org
LKML-Reference: <1257849980-22640-9-git-send-email-fujita.tomonori@lab.ntt.co.jp>
[ -v2: merge up conflict ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10 12:32:00 +01:00
FUJITA Tomonori 9d5ce73a64 x86: intel-iommu: Convert detect_intel_iommu to use iommu_init hook
This changes detect_intel_iommu() to set intel_iommu_init() to
iommu_init hook if detect_intel_iommu() finds the IOMMU.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: chrisw@sous-sol.org
Cc: dwmw2@infradead.org
Cc: joerg.roedel@amd.com
Cc: muli@il.ibm.com
LKML-Reference: <1257849980-22640-6-git-send-email-fujita.tomonori@lab.ntt.co.jp>
[ -v2: build fix for the !CONFIG_DMAR case ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10 12:31:36 +01:00
FUJITA Tomonori ea1b0d3945 x86: amd_iommu: Convert amd_iommu_detect() to use iommu_init hook
This changes amd_iommu_detect() to set amd_iommu_init to
iommu_init hook if amd_iommu_detect() finds the AMD IOMMU.

We can kill the code to check if we found the IOMMU in
amd_iommu_init() since amd_iommu_detect() sets amd_iommu_init()
only when it found the IOMMU.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: chrisw@sous-sol.org
Cc: dwmw2@infradead.org
Cc: joerg.roedel@amd.com
Cc: muli@il.ibm.com
LKML-Reference: <1257849980-22640-5-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10 12:31:30 +01:00
FUJITA Tomonori de957628ce x86: GART: Convert gart_iommu_hole_init() to use iommu_init hook
This changes gart_iommu_hole_init() to set gart_iommu_init() to
iommu_init hook if gart_iommu_hole_init() finds the GART IOMMU.

We can kill the code to check if we found the IOMMU in
gart_iommu_init() since gart_iommu_hole_init() sets
gart_iommu_init() only when it found the IOMMU.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: chrisw@sous-sol.org
Cc: dwmw2@infradead.org
Cc: joerg.roedel@amd.com
Cc: muli@il.ibm.com
LKML-Reference: <1257849980-22640-4-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10 12:31:23 +01:00
FUJITA Tomonori d7b9f7be21 x86: Calgary: Convert detect_calgary() to use iommu_init hook
This changes detect_calgary() to set init_calgary() to
iommu_init hook if detect_calgary() finds the Calgary IOMMU.

We can kill the code to check if we found the IOMMU in
init_calgary() since detect_calgary() sets init_calgary() only
when it found the IOMMU.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: chrisw@sous-sol.org
Cc: dwmw2@infradead.org
Cc: joerg.roedel@amd.com
LKML-Reference: <1257849980-22640-3-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10 12:31:15 +01:00
FUJITA Tomonori d07c1be069 x86: Add iommu_init to x86_init_ops
We call the detection functions of all the IOMMUs, then all
their initialization functions. The latter is pointless since we
don't detect multiple different IOMMUs. What we need to do is
call the initialization function of the detected IOMMU.

This adds an iommu_init hook to x86_init_ops so that an IOMMU
detection function can set its initialization function to the
hook.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: chrisw@sous-sol.org
Cc: dwmw2@infradead.org
Cc: joerg.roedel@amd.com
Cc: muli@il.ibm.com
LKML-Reference: <1257849980-22640-2-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10 12:31:07 +01:00
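The hook pattern introduced by the commit above can be sketched as a function pointer that defaults to a no-op and is overwritten by whichever detection routine finds its hardware. The two IOMMU types "a" and "b", the call counter and all `sketch_` names are hypothetical.

```c
#include <assert.h>

/* Counts how many real init functions ran, so the effect is observable. */
static int sketch_init_calls;

static int sketch_init_a(void) { sketch_init_calls++; return 0; }
static int sketch_init_b(void) { sketch_init_calls++; return 0; }
static int sketch_init_noop(void) { return 0; }

/* Models the iommu_init member of the init ops structure. */
struct sketch_init_ops {
	int (*iommu_init)(void);
};

static struct sketch_init_ops sketch_ops = { .iommu_init = sketch_init_noop };

/* Each detection routine installs its own init function only on success. */
static void sketch_detect_a(int present)
{
	if (present)
		sketch_ops.iommu_init = sketch_init_a;
}

static void sketch_detect_b(int present)
{
	if (present)
		sketch_ops.iommu_init = sketch_init_b;
}

/* Run every detector, then exactly one init call through the hook. */
static int sketch_boot(int a_present, int b_present)
{
	sketch_init_calls = 0;
	sketch_ops.iommu_init = sketch_init_noop;

	sketch_detect_a(a_present);
	sketch_detect_b(b_present);

	sketch_ops.iommu_init();
	return sketch_init_calls;
}
```

One consequence of this design is that an init function no longer has to re-check whether its hardware was found, since it is only installed after successful detection.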
Frederic Weisbecker 59d8eb53ea hw-breakpoints: Wrap in the KVM breakpoint active state check
Wrap in the cpu dr7 check that tells if we have active
breakpoints that need to be restored in the cpu.

This wrapper makes the check more self-explanatory and also
reusable for further uses.

Reported-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: "K. Prasad" <prasad@linux.vnet.ibm.com>
2009-11-10 11:23:43 +01:00
Frederic Weisbecker 9f6b3c2c30 hw-breakpoints: Fix broken a.out format dump
Fix the broken a.out format dump. For now we only dump the ptrace
breakpoints.

TODO: Dump all perf breakpoints for the current thread, not only
ptrace based ones.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: "K. Prasad" <prasad@linux.vnet.ibm.com>
2009-11-10 11:23:05 +01:00
Joe Perches 41855b7754 x86: GART: pci-gart_64.c: Use correct length in strncmp
Signed-off-by: Joe Perches <joe@perches.com>
Cc: <stable@kernel.org> # .3x.x
LKML-Reference: <1257818330.12852.72.camel@Joe-Laptop.home>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10 06:05:39 +01:00
Yong Wang a2202aa292 x86: Under BIOS control, restore AP's APIC_LVTTHMR to the BSP value
On platforms where the BIOS handles the thermal monitor interrupt,
APIC_LVTTHMR on each logical CPU is programmed to generate a SMI
and OS must not touch it.

Unfortunately AP bringup sequence using INIT-SIPI-SIPI clears all
the LVT entries except the mask bit. Essentially this results in
all LVT entries including the thermal monitoring interrupt set
to masked (clearing the bios programmed value for APIC_LVTTHMR).

This leads to the kernel taking over the thermal monitoring
interrupt on the APs but not on the BSP (leaving the bios programmed
value only on the BSP).

As a result of this, we have seen system hangs when the thermal
monitoring interrupt is generated.

Fix this by reading the initial value of thermal LVT entry on
BSP and if bios has taken over the control, then program the
same value on all AP's and leave the thermal monitoring
interrupt control on all the logical cpu's to the bios.

Signed-off-by: Yong Wang <yong.y.wang@intel.com>
Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Arjan van de Ven <arjan@infradead.org>
LKML-Reference: <20091110013824.GA24940@ywang-moblin2.bj.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: stable@kernel.org
2009-11-10 05:57:55 +01:00
Cyrill Gorcunov 7abc075313 x86: apic: Do not use stacked physid_mask_t
We should not use physid_mask_t as a stack based
variable in apic code. This type depends on the MAX_APICS
parameter, which may be very large.

It became a problem especially with the apic NOOP driver, which
is portable between 32 bit and 64 bit environments
(where we have a really huge MAX_APICS).

So the apic driver should operate on pointers, and a caller
in turn should take care of allocating the physid_mask_t variable.

As a side (but positive) effect -- we may use already
implemented physid_set_mask_of_physid function eliminating
default_apicid_to_cpu_present completely.

Note that physids_coerce and physids_promote turned into static
inline from macro (since macro hides the fact that parameter is
being interpreted as unsigned long, make it explicit).

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
LKML-Reference: <20091109220659.GA5568@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10 05:52:07 +01:00
Borislav Petkov 506f90eeae x86, amd-ucode: Check UCODE_MAGIC before loading the container file
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
LKML-Reference: <20091029134552.GC30802@alberich.amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10 05:46:09 +01:00
Cyrill Gorcunov f4a70c5537 x86, apic: Get rid of apicid_to_cpu_present assign on 64-bit
In fact it never gets used on x86-64 (on 64 bit platforms
we use a different technique to enumerate io-units).

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20091108131645.GD5300@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 19:46:17 +01:00
Cyrill Gorcunov 4343fe1024 x86, ioapic: Use snrpintf while set names for IO-APIC resourses
We should be ready for MAX_IO_APICS to be raised one
day. To prevent memory overwrites, use the safe
snprintf when setting the IO-APIC resource name.
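
The bounded formatting can be illustrated with standard C snprintf. The
helper name format_ioapic_name and the "IOAPIC %u" format string are
assumptions made for this sketch, not taken from the patch.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Format a resource name into buf without writing past len bytes.
 * snprintf always NUL-terminates (for len > 0) and returns the length
 * the full string would have had, so truncation can be detected.
 */
int format_ioapic_name(char *buf, size_t len, unsigned int idx)
{
	return snprintf(buf, len, "IOAPIC %u", idx);
}
```

If the index grows more digits than the buffer was sized for, the name is
truncated rather than written past the end of the buffer.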

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20091108155431.GC25940@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 17:06:23 +01:00
Cyrill Gorcunov 46dc281b1b x86, apic: Use PAGE_SIZE instead of numbers
The whole page is reserved for IO-APIC fixmap
due to non-cacheable requirement. So lets note
this explicitly instead of playing with numbers.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
LKML-Reference: <20091108155356.GB25940@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 17:06:22 +01:00
Jan Beulich eb647138ac x86/PCI: Adjust GFP mask handling for coherent allocations
Rather than forcing GFP flags and DMA mask to be inconsistent,
GFP flags should be determined even for the fallback device
through dma_alloc_coherent_mask()/dma_alloc_coherent_gfp_flags().

This restores 64-bit behavior as it was prior to commits
8965eb1938 and
4a367f3a9d (not sure why there are
two of them), where GFP_DMA was forced on for 32-bit, but not
for 64-bit, with the slight adjustment that afaict even 32-bit
doesn't need this without CONFIG_ISA.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Takashi Iwai <tiwai@suse.de>
LKML-Reference: <4AF18187020000780001D8AA@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-11-08 07:44:30 -08:00
Frederic Weisbecker 24f1e32c60 hw-breakpoints: Rewrite the hw-breakpoints layer on top of perf events
This patch rebases the implementation of the breakpoints API on top of
perf events instances.

Each breakpoint is now a perf event that handles the
register scheduling, thread/cpu attachment, etc.

The new layering is now made as follows:

       ptrace       kgdb      ftrace   perf syscall
          \          |          /         /
           \         |         /         /
                                        /
            Core breakpoint API        /
                                      /
                     |               /
                     |              /

              Breakpoints perf events

                     |
                     |

               Breakpoints PMU ---- Debug Register constraints handling
                                    (Part of core breakpoint API)
                     |
                     |

             Hardware debug registers

Reasons for this rewrite:

- Use the centralized/optimized pmu registers scheduling,
  implying an easier arch integration
- More powerful register handling: perf attributes (pinned/flexible
  events, exclusive/non-exclusive, tunable period, etc...)

Impact:

- New perf ABI: the hardware breakpoints counters
- Ptrace breakpoints setting remains tricky and still needs some per
  thread breakpoints references.

Todo (in the order):

- Support breakpoints perf counter events for perf tools (ie: implement
  perf_bpcounter_event())
- Support from perf tools

Changes in v2:

- Follow the perf "event" rename
- The ptrace regression has been fixed (ptrace breakpoint perf events
  weren't released when a task ended)
- Drop the struct hw_breakpoint and store generic fields in
  perf_event_attr.
- Separate core and arch specific headers, drop
  asm-generic/hw_breakpoint.h and create linux/hw_breakpoint.h
- Use new generic len/type for breakpoint
- Handle off case: when breakpoints api is not supported by an arch

Changes in v3:

- Fix broken CONFIG_KVM, we need to propagate the breakpoint api
  changes to kvm when we exit the guest and restore the bp registers
  to the host.

Changes in v4:

- Drop the hw_breakpoint_restore() stub as it is only used by KVM
- EXPORT_SYMBOL_GPL hw_breakpoint_restore() as KVM can be built as a
  module
- Restore the breakpoints unconditionally on kvm guest exit:
  TIF_DEBUG_THREAD no longer covers every case of running
  breakpoints and vcpu->arch.switch_db_regs might not always be
  set when the guest used debug registers.
  (Waiting for a reliable optimization)

Changes in v5:

- Split-up the asm-generic/hw-breakpoint.h moving to
  linux/hw_breakpoint.h into a separate patch
- Optimize the breakpoints restoring while switching from kvm guest
  to host. We only want to restore the state if we have active
  breakpoints to the host, otherwise we don't care about messed-up
  address registers.
- Add asm/hw_breakpoint.h to Kbuild
- Fix bad breakpoint type in trace_selftest.c

Changes in v6:

- Fix wrong header inclusion in trace.h (triggered a build
  error with CONFIG_FTRACE_SELFTEST)

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jan Kiszka <jan.kiszka@web.de>
Cc: Jiri Slaby <jirislaby@gmail.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Paul Mundt <lethal@linux-sh.org>
2009-11-08 15:34:42 +01:00
Tejun Heo 2ae8bb75db x86: Fix iommu=nodac parameter handling
iommu=nodac should forbid dac instead of enabling it. Fix it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Matteo Frigo <athena@fftw.org>
Cc: <stable@kernel.org> # .32.x and older
LKML-Reference: <4AE5B52A.4050408@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 13:19:05 +01:00
FUJITA Tomonori 338bac527e x86: Use x86_platform for iommu_shutdown
This patch cleans up pci_iommu_shutdown() a bit to use
x86_platform (similar to how IA64 initializes an IOMMU driver).

This adds iommu_shutdown() to x86_platform to avoid calling
every IOMMUs' shutdown functions in pci_iommu_shutdown() in
order. The IOMMU shutdown functions are platform specific (we
don't have multiple different IOMMU hardware) so the current way
is pointless.

An IOMMU driver sets x86_platform.iommu_shutdown to the shutdown
function if necessary.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: joerg.roedel@amd.com
LKML-Reference: <20091027163358F.fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 13:12:26 +01:00
Xiaotian Feng de2a47cf2b x86: Fix error return sequence in __ioremap_caller()
The kernel failed to free the memtype if get_vm_area_caller failed in
__ioremap_caller.

This patch introduces an error path to fix this and cleans up the
repetitive error return sequences that contributed to the
creation of the bug.
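
The single error path can be sketched with the usual goto-unwind idiom.
All names below (reserve_range, get_area, map_region, live_reservations)
are hypothetical stand-ins for the memtype reservation and the vm area
lookup; the counter exists only to make the leak observable.

```c
/* Counts reservations currently held; for illustration only. */
int live_reservations;

/* Stand-in for reserving a memory type for a range. */
static int reserve_range(void)
{
	live_reservations++;
	return 0;
}

/* Stand-in for releasing that reservation. */
static void free_range(void)
{
	live_reservations--;
}

/* Stand-in for a later step that can fail. */
static int get_area(int should_fail)
{
	return should_fail ? -1 : 0;
}

int map_region(int area_fails)
{
	int ret;

	ret = reserve_range();
	if (ret)
		return ret;

	ret = get_area(area_fails);
	if (ret)
		goto err_free_range;	/* undo the reservation */

	return 0;	/* on success the reservation stays in place */

err_free_range:
	free_range();
	return ret;
}
```

Routing every later failure through one label makes it harder to forget
the release on a newly added failure branch.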

Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <1257389031-20429-1-git-send-email-dfeng@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 12:48:58 +01:00
Rusty Russell 0d0fbbddcc x86, msr, cpumask: Use struct cpumask rather than the deprecated cpumask_t
This makes the declarations match the definitions, which already
use 'struct cpumask'.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <200911052245.41803.rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 11:58:38 +01:00
Masami Hiramatsu c12a229bc5 x86: Remove unused thread_return label from switch_to()
Remove unused thread_return label from switch_to() macro on
x86-64. Since this symbol cuts into schedule(), backtrace at the
latter half of schedule() was always shown as thread_return().

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
LKML-Reference: <20091105160359.5181.26225.stgit@harusame>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-08 11:57:13 +01:00
Simon Kagstrom f1b291d4c4 x86: Add Phoenix/MSC BIOSes to lowmem corruption list
We have a board with a Phoenix/MSC BIOS which also corrupts the low
64KB of RAM, so add an entry to the table.

Signed-off-by: Simon Kagstrom <simon.kagstrom@netinsight.net>
LKML-Reference: <20091106154404.002648d9@marrow.netinsight.se>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-11-06 14:49:39 -08:00
Eric W. Biederman c3359fbce4 sysctl: x86 Use the compat_sys_sysctl
Now that we have a generic 32bit compatibility implementation
there is no need for x86 to implement its own.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2009-11-06 03:53:58 -08:00
Frederic Weisbecker 2da3e160cb hw-breakpoint: Move asm-generic/hw_breakpoint.h to linux/hw_breakpoint.h
We plan to make the breakpoints parameters generic among architectures.
For that it's better to move the asm-generic header to a generic linux
header.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-11-05 23:48:01 +01:00
Linus Torvalds 7c9abfb884 Merge branch 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: get_tss_base_addr() should return a gpa_t
  KVM: x86: Catch potential overrun in MCE setup
2009-11-05 13:24:15 -08:00
Chris Lalancette 2c75910f1a x86: Make sure get_user_desc() doesn't sign extend.
The current implementation of get_user_desc() sign extends the return
value because of integer promotion rules.  For the most part, this
doesn't matter, because the top bit of base2 is usually 0.  If, however,
that bit is 1, then the entire value will be 0xffff...  which is
probably not what the caller intended.

This patch casts the entire thing to unsigned before returning, which
generates almost the same assembly as the current code but replaces the
final "cltq" (sign extend) with a "mov %eax %eax" (zero-extend).  This
fixes booting certain guests under KVM.
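
The effect can be reproduced in user space. The sketch below is not the
kernel's get_user_desc(); the helper names are hypothetical, and the
signed case is modelled with an explicit int32_t conversion (which behaves
as expected on the usual two's complement compilers) instead of relying on
integer promotion of the descriptor fields.

```c
#include <stdint.h>

/* Assemble the 32-bit base from its three descriptor fields. */
static uint32_t combine_base(uint16_t base0, uint8_t base1, uint8_t base2)
{
	return (uint32_t)base0 |
	       ((uint32_t)base1 << 16) |
	       ((uint32_t)base2 << 24);
}

/*
 * Models the bug: the value passes through a signed 32-bit type, so
 * widening it to 64 bits copies bit 31 into the upper half.
 */
uint64_t base_sign_extended(uint16_t base0, uint8_t base1, uint8_t base2)
{
	int32_t as_signed = (int32_t)combine_base(base0, base1, base2);

	return (uint64_t)(int64_t)as_signed;
}

/* Models the fix: the value stays unsigned, so the upper half is zero. */
uint64_t base_zero_extended(uint16_t base0, uint8_t base1, uint8_t base2)
{
	return (uint64_t)combine_base(base0, base1, base2);
}
```

The two functions differ only when the top bit of base2 is set, which
matches the description above.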

Signed-off-by: Chris Lalancette <clalance@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-11-05 13:22:18 -08:00
Linus Torvalds 9a6fc8d0f8 Merge branch 'bugfix' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen
* 'bugfix' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
  xen: mask extended topology info in cpuid
  xen/hvc: make sure console output is always emitted, with explicit polling
2009-11-05 10:58:07 -08:00
Linus Torvalds 608221fdf9 Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Fix kthread_bind() by moving the body of kthread_bind() to sched.c
  sched: Disable SD_PREFER_LOCAL at node level
  sched: Fix boot crash by zalloc()ing most of the cpu masks
  sched: Strengthen buddies and mitigate buddy induced latencies
2009-11-05 10:56:47 -08:00
Linus Torvalds 411094acb7 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, fs: Fix x86 procfs stack information for threads on 64-bit
  x86: Add reboot quirk for 3 series Mac mini
  x86: Fix printk message typo in mtrr cleanup code
  dma-debug: Fix compile warning with PAE enabled
  x86/amd-iommu: Un__init function required on shutdown
  x86/amd-iommu: Workaround for erratum 63
2009-11-05 10:54:08 -08:00
Gleb Natapov abb3911965 KVM: get_tss_base_addr() should return a gpa_t
If the TSS we are switching to resides in high memory, the task switch will fail
since the address will be truncated. Windows2k3 does this sometimes when
running with more than 4G.
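
A small sketch of the truncation, assuming a 64-bit guest physical address
type and 4 KiB pages. gpa_sketch_t and both helpers are hypothetical names,
not KVM's own.

```c
#include <stdint.h>

typedef uint64_t gpa_sketch_t;	/* stand-in for a guest physical address */

/* Old shape: a 32-bit return type silently drops address bits above 4G. */
uint32_t tss_base_truncated(uint64_t gfn)
{
	return (uint32_t)(gfn << 12);
}

/* New shape: the full 64-bit address is returned. */
gpa_sketch_t tss_base_full(uint64_t gfn)
{
	return (gpa_sketch_t)(gfn << 12);
}
```

For a page frame at exactly 4 GiB (gfn 0x100000) the truncated variant
yields 0 while the full variant yields 0x100000000.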

Cc: stable@kernel.org
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-11-04 12:42:36 -02:00
Jan Kiszka a9e38c3e01 KVM: x86: Catch potential overrun in MCE setup
We only allocate memory for 32 MCE banks (KVM_MAX_MCE_BANKS) but we
allow user space to fill up to 255 on setup (mcg_cap & 0xff), corrupting
kernel memory. Catch these overflows.
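
The kind of check involved can be sketched as follows. The function and
macro names are hypothetical, and the exact boundary condition used by the
actual patch is not stated here, so rejecting only counts above the
allocated 32 banks is an assumption.

```c
#include <stdint.h>
#include <errno.h>

#define SKETCH_MAX_MCE_BANKS 32	/* banks for which memory is allocated */

/* Validate the bank count that user space encodes in the low byte. */
int sketch_mce_setup(uint64_t mcg_cap)
{
	unsigned int bank_num = mcg_cap & 0xff;

	if (bank_num > SKETCH_MAX_MCE_BANKS)
		return -EINVAL;	/* would index past the allocation */

	return 0;
}
```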

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-11-04 12:42:35 -02:00
Stefani Seibold 89240ba059 x86, fs: Fix x86 procfs stack information for threads on 64-bit
This patch fixes two issues in the procfs stack information on
x86-64 linux.

The 32 bit loader compat_do_execve did not store stack
start. (this was figured out by Alexey Dobriyan).

The stack information on an x86_64 kernel always shows 0 kbyte
stack usage, because of a missing implementation of the KSTK_ESP
macro which always returned -1.

The new implementation now returns the right value.

Signed-off-by: Stefani Seibold <stefani@seibold.net>
Cc: Americo Wang <xiyou.wangcong@gmail.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <1257240160.4889.24.camel@wall-e>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-04 13:25:03 +01:00
Rusty Russell ce7c42710e cpumask: Avoid cpumask_t in arch/x86/kernel/apic/nmi.c
Ingo wants the certainty of a static cpumask (rather than a
cpumask_var_t), but cpumask_t will some day be undefined to
avoid on-stack declarations.

This is what DECLARE_BITMAP/to_cpumask() is for.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <200911031453.52394.rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-04 13:17:53 +01:00
Hiroshi Shimamoto 09879b99d4 x86: Gitignore: arch/x86/lib/inat-tables.c
Ignore generated file arch/x86/lib/inat-tables.c.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
LKML-Reference: <4AF0FBD7.7000501@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-04 13:11:28 +01:00
Ingo Molnar a2e7127153 Merge commit 'v2.6.32-rc6' into perf/core
Conflicts:
	tools/perf/Makefile

Merge reason: Resolve the conflict, merge to upstream and merge in
              perf fixes so we can add a dependent patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-04 11:59:45 +01:00
Brian Gerst 97829de5a3 x86, 64-bit: Fix bstep_iret jump
This jump should be unconditional.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1257274925-15713-1-git-send-email-brgerst@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-03 20:50:02 +01:00
Jeremy Fitzhardinge 82d6469916 xen: mask extended topology info in cpuid
A Xen guest never needs to know about extended topology, and knowing
would just confuse it.

This patch just zeros ebx in leaf 0xb which indicates no topology info,
preventing a crash under Xen on cpus which support this leaf.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
2009-11-03 11:09:12 -08:00
Paul Mundt 41a48d14f6 x86/hw-breakpoints: Actually flush thread breakpoints in flush_thread().
flush_thread() tries to do a TIF_DEBUG check before calling in to
flush_thread_hw_breakpoint() (which subsequently clears the thread flag),
but for some reason, the x86 code is manually clearing TIF_DEBUG
immediately before the test, so this path will never be taken.

This kills off the erroneous clear_tsk_thread_flag() and lets
flush_thread_hw_breakpoint() actually get invoked.

Presumably folks were getting lucky with testing and the
free_thread_info() -> free_thread_xstate() path was taking care of the
flush there.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: "K.Prasad" <prasad@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alan Stern <stern@rowland.harvard.edu>
LKML-Reference: <20091005102306.GA7889@linux-sh.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-11-03 18:05:44 +01:00
Ingo Molnar 1d87cff407 Merge branch 'iommu/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into x86/urgent
2009-11-03 16:54:14 +01:00
Arjan van de Ven a489ca355e x86: Make sure we also print a Code: line for show_regs()
show_regs() is called as a mini BUG() equivalent in some places,
specifically for the "scheduling while atomic" case.

Unfortunately right now it does not print a Code: line unlike
a real bug/oops.

This patch changes the x86 implementation of show_regs() so that
it calls the same function as oopses do to print the registers
as well as the Code: line.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
LKML-Reference: <20091102165915.4a980fc0@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-03 16:50:22 +01:00
Joerg Roedel 342688f9db Merge branches 'amd-iommu/fixes' and 'dma-debug/fixes' into iommu/fixes
2009-11-03 12:05:40 +01:00
Mike Galbraith 6b9de613ae sched: Disable SD_PREFER_LOCAL at node level
Yanmin Zhang reported that SD_PREFER_LOCAL induces an order of
magnitude increase in select_task_rq_fair() overhead while
running heavy wakeup benchmarks (tbench and vmark).

Since SD_BALANCE_WAKE is off at node level, turn SD_PREFER_LOCAL
off as well pending further investigation.

Reported-by: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-03 07:24:07 +01:00
Linus Torvalds efcd9e0b91 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Make EFI RTC function depend on 32bit again
  x86-64: Fix register leak in 32-bit syscall audting
  x86: crash_dump: Fix non-pae kdump kernel memory accesses
  x86: Side-step lguest problem by only building cmpxchg8b_emu for pre-Pentium
  x86: Remove STACKPROTECTOR_ALL
2009-11-02 09:45:17 -08:00
Suresh Siddha 5231a68614 x86: Remove local_irq_enable()/local_irq_disable() in fixup_irqs()
To ensure that we handle all the pending interrupts (destined
for this cpu that is going down) in the interrupt subsystem
before the cpu goes offline, fixup_irqs() does:

	local_irq_enable();
	mdelay(1);
	local_irq_disable();

Enabling interrupts is not a good thing as this cpu is already
offline. So this patch replaces that logic with,

	mdelay(1);
	check APIC_IRR bits
	Retrigger the irq at the new destination if any interrupt has arrived
	via IPI.

For IO-APIC level triggered interrupts, this retrigger IPI will
appear as an edge interrupt. ack_apic_level() will detect this
condition and IO-APIC RTE's remoteIRR is cleared using directed
EOI(using IO-APIC EOI register) on Intel platforms and for
others it uses the existing mask+edge logic followed by
unmask+level.

We can also remove mdelay() and then send spurious interrupts
to new cpu targets for all the irqs that were handled previously
by this cpu that is going offline. While it works, I have seen
spurious interrupt messages (nothing wrong but still annoying
messages during cpu offline, which can be seen during
suspend/resume etc)

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Gary Hade <garyhade@us.ibm.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <20091026230002.043281924@sbs-t61.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-02 15:56:37 +01:00
Suresh Siddha b3ec0a37a7 x86: Use EOI register in io-apic on intel platforms
IO-APICs in intel chipsets support the EOI register starting from
IO-APIC version 2. Use that whenever we need to clear the
IO-APIC RTE's RemoteIRR bit explicitly.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Gary Hade <garyhade@us.ibm.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <20091026230001.947855317@sbs-t61.sc.intel.com>
[ Marked use_eio_reg as __read_mostly, fixed small details ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-02 15:56:36 +01:00
Suresh Siddha a5e74b8419 x86: Force irq complete move during cpu offline
When a cpu goes offline, fixup_irqs() tries to move irqs
currently destined to the offline cpu to a new cpu. But this
attempt will fail if the irq was recently moved to this cpu and
the irq still hasn't arrived at this cpu (for non intr-remapping
platforms this is when we free the vector allocation at the
previous destination) that is about to go offline.

This will end up with the interrupt subsystem still pointing the
irq to the offline cpu, causing that irq to not work any more.

Fix this by forcing the irq to complete its move (it has been a
long time since we moved the irq to this cpu which we are offlining
now) and then move this irq to a new cpu before this cpu goes
offline.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Gary Hade <garyhade@us.ibm.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <20091026230001.848830905@sbs-t61.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-02 15:56:36 +01:00
Suresh Siddha 23359a88e7 x86: Remove move_cleanup_count from irq_cfg
move_cleanup_count for each irq in irq_cfg is keeping track of
the total number of cpus that need to free the corresponding
vectors associated with the irq which has now been migrated to
new destination. As long as this move_cleanup_count is non-zero
(i.e., as long as we haven't freed the vector allocations on
the old destinations) we were preventing the irq's further
migration.

This cleanup count is unnecessary and it is enough to not allow
the irq migration till we send the cleanup vector to the
previous irq destination, for which we already have irq_cfg's
move_in_progress.  All we need to make sure is that we free the
vector at the old destination but we don't need to wait till
that gets freed.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Gary Hade <garyhade@us.ibm.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <20091026230001.752968906@sbs-t61.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-02 15:56:35 +01:00
Suresh Siddha 84e21493a3 x86, intr-remap: Avoid irq_chip mask/unmask in fixup_irqs() for intr-remapping
In the presence of interrupt-remapping, irqs will be migrated in
the process context and we don't do (and there is no need to)
irq_chip mask/unmask while migrating the interrupt.

Similarly fix the fixup_irqs() that gets called during cpu
offline and avoid calling irq_chip mask/unmask for irqs that are
ok to be migrated in the process context.

While we didn't observe any race condition with the existing
code, this change takes complete advantage of
interrupt-remapping in the newer generation platforms and avoids
any potential HW lockup's (that often worry Eric :)

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: garyhade@us.ibm.com
LKML-Reference: <20091026230001.661423939@sbs-t61.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-02 15:56:35 +01:00
Suresh Siddha 7a7732bc0f x86: Unify fixup_irqs() for 32-bit and 64-bit kernels
There is no reason to have different fixup_irqs() for 32-bit and
64-bit kernels. Unify by using the superior 64-bit version for
both the kernels.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Gary Hade <garyhade@us.ibm.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
LKML-Reference: <20091026230001.562512739@sbs-t61.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-02 15:56:34 +01:00
Gottfried Haider 05154752cf x86: Add reboot quirk for 3 series Mac mini
Reboot does not work out of the box on my "Early 2009" Mac mini
(3,1). Detect this machine via DMI as we do for recent MacBooks.

Signed-off-by: Gottfried Haider <gottfried.haider@gmail.com>
Cc: Ozan Çağlayan <ozan@pardus.org.tr>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-02 15:46:17 +01:00
Dave Jones 16121d70fd x86: Fix printk message typo in mtrr cleanup code
Trivial typo.

Signed-off-by: Dave Jones <davej@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-02 08:36:18 +01:00
Linus Torvalds 2e2ec95235 Merge branch 'bugfix' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen
* 'bugfix' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
  xen: set up mmu_ops before trying to set any ptes
2009-10-29 15:03:36 -07:00
Linus Torvalds 6e958d73c2 Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Do less agressive buddy clearing
  sched: Disable SD_PREFER_LOCAL for MC/CPU domains
2009-10-29 08:10:38 -07:00
Linus Torvalds 7811a32407 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, UV: Set DELIVERY_MODE=4 for vector=NMI_VECTOR in uv_hub_send_ipi()
  x86, UV: Fix and clean up bau code to use uv_gpa_to_pnode()
  x86: Don't print number of MCE banks for every CPU
  x86, UV: Fix information in __uv_hub_info structure
  x86: Document linker script ASSERT() quirk
2009-10-29 08:10:26 -07:00
Ingo Molnar 9de09ace8d Merge branch 'tracing/urgent' into tracing/core
Merge reason: Pick up fixes and move base from -rc1 to -rc5.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-29 09:02:20 +01:00
Masami Hiramatsu 3f7e454af1 x86: Add Intel FMA instructions to x86 opcode map
Add Intel FMA(FUSED-MULTIPLY-ADD) instructions to x86 opcode map
for x86 instruction decoder.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
LKML-Reference: <20091027204235.30545.33997.stgit@harusame>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-29 08:47:47 +01:00
Masami Hiramatsu e0e492e99b x86: AVX instruction set decoder support
Add Intel AVX(Advanced Vector Extensions) instruction set
support to x86 instruction decoder. This adds insn.vex_prefix
field for storing VEX prefixes, and introduces some original
tags for expressing opcodes attributes.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
LKML-Reference: <20091027204226.30545.23451.stgit@harusame>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-29 08:47:46 +01:00
Masami Hiramatsu 82cb57028c x86: Add pclmulq to x86 opcode map
Add pclmulq opcode to x86 opcode map.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
LKML-Reference: <20091027204219.30545.82039.stgit@harusame>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-29 08:47:46 +01:00
Masami Hiramatsu 04d46c1b13 x86: Merge INAT_REXPFX into INAT_PFX_*
Merge INAT_REXPFX into INAT_PFX_* macro and rename it to
INAT_PFX_REX.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
LKML-Reference: <20091027204211.30545.58090.stgit@harusame>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-29 08:47:45 +01:00
Masami Hiramatsu 7f387d3f24 x86: Fix SSE opcode map bug
Fix the position of superscripts, because some superscripts of
SSE opcodes are not in the correct position.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
LKML-Reference: <20091027204204.30545.97296.stgit@harusame>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-29 08:47:45 +01:00
Joerg Roedel ca0207114f x86/amd-iommu: Un__init function required on shutdown
The function iommu_feature_disable is required on system
shutdown to disable the IOMMU but it is marked as __init.
This may result in a panic if the memory is reused. This
patch fixes this bug.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-10-28 18:02:26 +01:00
Jeremy Fitzhardinge 973df35ed9 xen: set up mmu_ops before trying to set any ptes
xen_setup_stackprotector() ends up trying to set page protections,
so we need to have vm_mmu_ops set up before trying to do so.
Failing to do so causes an early boot crash.

[ Impact: Fix early crash under Xen. ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-10-27 16:54:19 -07:00
Andreas Herrmann 6f9b41006a x86, apic: Clear APIC Timer Initial Count Register on shutdown
Commit a98f8fd24f (x86: apic reset
counter on shutdown) set the counter to max to avoid spurious
interrupts when the timer is re-enabled.

(In theory) you'll still get a spurious interrupt if spending
more than 344 seconds with this interrupt disabled and then
unmasking it.

The right thing to do is to clear the register. This disables
the interrupt from happening (at least it does on AMD hardware).

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
LKML-Reference: <20091027100138.GB30802@alberich.amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-27 14:54:21 +01:00
Feng Tang 772be899bc x86: Make EFI RTC function depend on 32bit again
The EFI RTC functions are only available on 32 bit. commit 7bd867df
(x86: Move get/set_wallclock to x86_platform_ops) removed the 32bit
dependency which leads to boot crashes on 64bit EFI systems.

Add the dependency back.
Solves: http://bugzilla.kernel.org/show_bug.cgi?id=14466

Tested-by: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Feng Tang <feng.tang@intel.com>
LKML-Reference: <20091020125402.028d66d5@feng-desktop>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-10-27 12:35:48 +01:00
Jan Beulich 81766741fe x86-64: Fix register leak in 32-bit syscall auditing
Restoring %ebp after the call to audit_syscall_exit() is not
only unnecessary (because the register didn't get clobbered),
but in the sysenter case wasn't even doing the right thing: It
loaded %ebp from a location below the top of stack (RBP <
ARGOFFSET), i.e. arbitrary kernel data got passed back to user
mode in the register.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Roland McGrath <roland@redhat.com>
Cc: <stable@kernel.org>
LKML-Reference: <4AE5CC4D020000780001BD13@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-26 16:23:26 +01:00
Jiri Slaby 72ed7de74e x86: crash_dump: Fix non-pae kdump kernel memory accesses
Non-PAE 32-bit dump kernels may wrap an address around 4G and
poke unwanted space. ptes there are 32 bits long, and since
pfn << PAGE_SHIFT may exceed this limit, high pfn bits are
cropped and the wrong address is mapped by kmap_atomic_pfn in
copy_oldmem_page.

Don't allow this behavior in non-PAE kdump kernels by checking
pfns passed into copy_oldmem_page. In the case of failure,
userspace process gets EFAULT.
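
The wrap-around described above can be illustrated with a minimal
user-space sketch. It is not the code from this patch: the helper names
pfn_fits_nonpae_pte() and cropped_phys_addr() are hypothetical, and a
4 KiB page size (PAGE_SHIFT of 12) is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT	12	/* assumption: 4 KiB pages */
#define NONPAE_PTE_BITS	32	/* a non-PAE pte is 32 bits wide */

/*
 * Hypothetical check: true if the physical address of page frame
 * `pfn` still fits in a 32-bit pte after shifting by PAGE_SHIFT.
 */
static inline bool pfn_fits_nonpae_pte(uint64_t pfn)
{
	return pfn < (UINT64_C(1) << (NONPAE_PTE_BITS - PAGE_SHIFT));
}

/*
 * What a 32-bit pte would end up holding: the bits above bit 31 are
 * cropped, so a high pfn silently aliases a low physical address.
 */
static inline uint32_t cropped_phys_addr(uint64_t pfn)
{
	return (uint32_t)(pfn << PAGE_SHIFT);
}
```

With these assumptions, pfn 0x100001 (just above 4G) is rejected by the
check, and without the check it would be cropped to address 0x1000.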

[v2]
- fix comments
- move ifdefs inside the function

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Simon Horman <horms@verge.net.au>
Cc: Paul Mundt <lethal@linux-sh.org>
LKML-Reference: <1256551903-30567-1-git-send-email-jirislaby@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-26 12:38:59 +01:00
Rusty Russell ae1b22f6e4 x86: Side-step lguest problem by only building cmpxchg8b_emu for pre-Pentium
Commit 79e1dd05d1 "x86: Provide an alternative() based
cmpxchg64()" broke lguest, even on systems which have cmpxchg8b
support.  The emulation code gets used until alternatives get
run, but it contains native instructions, not their paravirt
alternatives.

The simplest fix is to turn this code off except for 386 and 486
builds.

Reported-by: Johannes Stezenbach <js@sig21.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: lguest@ozlabs.org
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <200910261426.05769.rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-26 12:33:02 +01:00
Arjan van de Ven 14a3f40aaf x86: Remove STACKPROTECTOR_ALL
STACKPROTECTOR_ALL has a really high overhead (runtime and stack
footprint) and is not really worth it protection wise (the
normal STACKPROTECTOR is in effect for all functions with
buffers already), so let's just remove the option entirely.

Reported-by: Dave Jones <davej@redhat.com>
Reported-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Eric Sandeen <sandeen@redhat.com>
LKML-Reference: <20091023073101.3dce4ebb@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-23 16:35:23 +02:00
Ingo Molnar 4331595650 Merge branch 'perf/core' into perf/probes
Conflicts:
	tools/perf/Makefile

Merge reason:

 - fix the conflict
 - pick up the pr_*() infrastructure to queue up dependent patch

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-23 08:23:20 +02:00
Linus Torvalds 422b42fa79 Merge branch 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: Prevent kvm_init from corrupting debugfs structures
  KVM: MMU: fix pointer cast
  KVM: use proper hrtimer function to retrieve expiration time
2009-10-22 08:26:15 +09:00
Linus Torvalds 4fe71dba2f Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: aesni-intel - Fix irq_fpu_usable usage
  crypto: padlock-sha - Fix stack alignment
2009-10-22 08:16:01 +09:00
Ingo Molnar 9bf4e7fba8 x86, instruction decoder: Fix test_get_len build rules
Add the kernel source include file as well to the include files
search path, to fix this build bug:

 In file included from arch/x86/tools/test_get_len.c:28:
   arch/x86/lib/insn.c:21:26: error: linux/string.h: No such file or directory

Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091020165531.4145.21872.stgit@dhcp-100-2-132.bos.redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-21 14:42:56 +02:00
Robin Holt 02dd0a0613 x86, UV: Set DELIVERY_MODE=4 for vector=NMI_VECTOR in uv_hub_send_ipi()
When sending a NMI_VECTOR IPI using the UV_HUB_IPI_INT register,
we need to ensure the delivery mode field of that register has
NMI delivery selected.

This makes those IPIs true NMIs, instead of flat IPIs. It
matters to reboot sequences and KGDB, both of which use NMI
IPIs.

Signed-off-by: Robin Holt <holt@sgi.com>
Acked-by: Jack Steiner <steiner@sgi.com>
Cc: Martin Hicks <mort@sgi.com>
Cc: <stable@kernel.org>
LKML-Reference: <20091020193620.877322000@alcatraz.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-21 13:31:13 +02:00
Masami Hiramatsu 9983d60d74 x86: Add AES opcodes to opcode map
Add Intel AES opcodes to x86 opcode map. These opcodes are
used in arch/x86/crypto/aesni-intel_asm.S.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091020165531.4145.21872.stgit@dhcp-100-2-132.bos.redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-21 13:25:29 +02:00
Masami Hiramatsu 06ed6ba5ec x86: Fix group attribute decoding bug
Fix a typo in inat_get_group_attribute(), which should refer
to inat_group_tables, not inat_escape_tables.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: systemtap <systemtap@sources.redhat.com>
Cc: DLE <dle-develop@lists.sourceforge.net>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091020165524.4145.97333.stgit@dhcp-100-2-132.bos.redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-21 13:25:28 +02:00
Huang Ying 13b79b9715 crypto: aesni-intel - Fix irq_fpu_usable usage
When renaming kernel_fpu_using to irq_fpu_usable, the semantics of the
function changed too, from measuring whether the kernel is using the FPU,
that is, the FPU is NOT available, to measuring whether the FPU is usable,
that is, the FPU is available.

But the usage of irq_fpu_usable in aesni-intel_glue.c is not changed
accordingly. This patch fixes this.
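
The inverted meaning can be sketched as follows. This is only an
illustration, not the aesni-intel_glue.c code: fpu_in_use(),
fpu_usable(), the `busy` parameter and the aes_path enum are all
hypothetical stand-ins for the old and new predicates and for the
choice between the AES-NI path and a fallback.

```c
#include <stdbool.h>

enum aes_path { AES_PATH_FALLBACK, AES_PATH_AESNI };

/* Old-style predicate: true means the FPU is busy, so NOT available. */
static inline bool fpu_in_use(bool busy)
{
	return busy;
}

/* New-style predicate: true means the FPU is available. */
static inline bool fpu_usable(bool busy)
{
	return !busy;
}

/* Caller written against the old predicate. */
static inline enum aes_path pick_path_old(bool busy)
{
	return fpu_in_use(busy) ? AES_PATH_FALLBACK : AES_PATH_AESNI;
}

/* Same caller after the rename: the condition has to be negated. */
static inline enum aes_path pick_path_new(bool busy)
{
	return !fpu_usable(busy) ? AES_PATH_FALLBACK : AES_PATH_AESNI;
}
```

Both callers pick the same path for the same FPU state; keeping the old
condition with the new predicate would pick the opposite one.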

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-10-20 16:20:47 +09:00
Frederic Weisbecker 0f8f86c7bd Merge commit 'perf/core' into perf/hw-breakpoint
Conflicts:
	kernel/Makefile
	kernel/trace/Makefile
	kernel/trace/trace.h
	samples/Makefile

Merge reason: We need to be up to date with the perf events development
branch because we plan to rewrite the breakpoints API on top of
perf events.
2009-10-18 01:12:33 +02:00
Ingo Molnar bb3c3e8071 Merge commit 'v2.6.32-rc5' into perf/probes
Conflicts:
	kernel/trace/trace_event_profile.c

Merge reason: update to -rc5 and resolve conflict.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-17 09:58:25 +02:00
Masami Hiramatsu d1baf5a5a6 x86: Add AMD prefetch and 3DNow! opcodes to opcode map
Add AMD prefetch and 3DNow! opcodes, including FEMMS. Since 3DNow!
uses the last immediate byte as an opcode extension byte, the x86
insn decoder just treats the extension byte as an immediate byte
instead of as part of the opcode (insn_get_opcode() decodes the
first "0x0f 0x0f" bytes).

Users who are interested in analyzing 3DNow! opcode still can
decode it by analyzing the immediate byte.
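
As a rough sketch of that kind of post-processing (not part of this
patch), a user could map the immediate byte to a mnemonic once the
decoder has reported the "0x0f 0x0f" opcode. The function names and
parameters below are hypothetical, and only a handful of 3DNow! suffix
bytes are listed.

```c
#include <stddef.h>
#include <stdint.h>

/* A few 3DNow! suffix bytes; the table is deliberately incomplete. */
static inline const char *amd3dnow_mnemonic(uint8_t suffix)
{
	switch (suffix) {
	case 0x0d: return "pi2fd";
	case 0x1d: return "pf2id";
	case 0x9a: return "pfsub";
	case 0x9e: return "pfadd";
	case 0xb4: return "pfmul";
	default:   return NULL;	/* not in this sketch's table */
	}
}

/*
 * Hypothetical helper: `opcode` holds the two opcode bytes reported by
 * the decoder and `imm8` the trailing immediate byte. Returns NULL if
 * the instruction is not a 3DNow! one or the suffix is not listed.
 */
static inline const char *classify_3dnow(const uint8_t opcode[2],
					 uint8_t imm8)
{
	if (opcode[0] != 0x0f || opcode[1] != 0x0f)
		return NULL;
	return amd3dnow_mnemonic(imm8);
}
```

For the byte sequence 0f 0f c1 9e, the opcode is "0x0f 0x0f", 0xc1 is
the ModRM byte and the trailing 0x9e selects pfadd.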

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20091017000744.16556.27881.stgit@dhcp-100-2-132.bos.redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-17 09:53:59 +02:00
Masami Hiramatsu 8c95bc3e20 x86: Add MMX/SSE opcode groups to opcode map
Add missing MMX/SSE opcode groups to x86 opcode map.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20091017000736.16556.29061.stgit@dhcp-100-2-132.bos.redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-17 09:53:58 +02:00
Frederik Deweerdt 8a8365c560 KVM: MMU: fix pointer cast
On a 32-bit compile, commit 3da0dd433d
introduced the following warnings:

arch/x86/kvm/mmu.c: In function ‘kvm_set_pte_rmapp’:
arch/x86/kvm/mmu.c:770: warning: cast to pointer from integer of different size
arch/x86/kvm/mmu.c: In function ‘kvm_set_spte_hva’:
arch/x86/kvm/mmu.c:849: warning: cast from pointer to integer of different size

The following patch uses 'unsigned long' instead of u64 to match the
pointer size on both arches.

Signed-off-by: Frederik Deweerdt <frederik.deweerdt@xprog.eu>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-10-16 12:30:26 -03:00
Marcelo Tosatti ace1546487 KVM: use proper hrtimer function to retrieve expiration time
hrtimer->base can be temporarily NULL due to racing hrtimer_start.
See switch_hrtimer_base/lock_hrtimer_base.

Use hrtimer_get_remaining which is robust against it.

CC: stable@kernel.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-10-16 12:30:25 -03:00
Robin Holt 1d21e6e3ff x86, UV: Fix and clean up bau code to use uv_gpa_to_pnode()
Create an inline function to extract the pnode from a global
physical address and then convert the broadcast assist unit to
use the newly created uv_gpa_to_pnode function.

The open-coded code was wrong as well - it might explain a
few of our unexplained bau hangs.
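
A helper of this kind might look like the sketch below. It assumes a
layout in which the low m_val bits of a global physical address are the
offset within a node and the next n_val bits hold the node number; the
struct hub_layout type and the gpa_to_pnode() name are hypothetical and
do not reproduce the kernel's uv_hub_info or uv_gpa_to_pnode().

```c
#include <stdint.h>

/* Hypothetical description of how an address is split up. */
struct hub_layout {
	unsigned int m_val;	/* bits of per-node memory offset */
	unsigned int n_val;	/* bits of node number */
};

/*
 * Extract the node number: drop the offset bits, then mask off
 * everything above the n_val node bits so that higher address bits
 * cannot leak into the result.
 */
static inline unsigned int gpa_to_pnode(const struct hub_layout *hub,
					uint64_t gpa)
{
	uint64_t n_mask = (UINT64_C(1) << hub->n_val) - 1;

	return (unsigned int)((gpa >> hub->m_val) & n_mask);
}
```

Keeping the shift and the mask in one inline function means every
caller gets the same treatment of the upper bits, which is the usual
argument against open-coding such extractions.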

Signed-off-by: Robin Holt <holt@sgi.com>
Acked-by: Cliff Whickman <cpw@sgi.com>
Cc: linux-mm@kvack.org
Cc: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20091016112920.GZ8903@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-16 14:51:53 +02:00
Borislav Petkov b33a636364 x86, mce: Add a global MCE init helper
Add an early initcall (pre SMP) which sets up global MCE
functionality.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <1255689093-26921-2-git-send-email-borislav.petkov@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-16 14:46:50 +02:00
Borislav Petkov 5e09954a9a x86, mce: Fix up MCE naming nomenclature
Prefix global/setup routines with "mcheck_" thus differentiating
from the internal facilities prefixed with "mce_". Also, prefix
the per cpu calls with mcheck_cpu and rename them to reflect the
MCE setup hierarchy of calls better.

There should be no functionality change resulting from this
patch.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <1255689093-26921-1-git-send-email-borislav.petkov@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-16 14:46:49 +02:00
Ingo Molnar 6b50f5c7c7 Merge branches 'x86/mce' and 'x86/urgent' into perf/mce
Merge reason: Put all MCE changes into this branch, we are
              queueing up a dependent patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-16 14:42:25 +02:00
Roland Dreier 93ae5012a7 x86: Don't print number of MCE banks for every CPU
The MCE initialization code explicitly says it doesn't handle
asymmetric configurations where different CPUs support different
numbers of MCE banks, and it prints a big warning in that case.

Therefore, printing the "mce: CPU supports <x> MCE banks"
message into the kernel log for every CPU is pure redundancy
that clutters the log significantly for systems with lots of
CPUs.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
LKML-Reference: <adaeip473qt.fsf@cisco.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-16 09:20:03 +02:00
Robin Holt 036ed8ba61 x86, UV: Fix information in __uv_hub_info structure
A few parts of the uv_hub_info structure are initialized
incorrectly.

 - n_val is being loaded with m_val.
 - gpa_mask is initialized with a byte instead of an unsigned long.
 - Handle the case where none of the alias registers are used.

Lastly I converted the bau over to using the uv_hub_info->m_val
which is the correct value.

Without this patch, booting a large configuration hits a
problem where the upper bits of the gnode affect the pnode
and the bau will not operate.

Signed-off-by: Robin Holt <holt@sgi.com>
Acked-by: Jack Steiner <steiner@sgi.com>
Cc: Cliff Whickman <cpw@sgi.com>
Cc: stable@kernel.org
LKML-Reference: <20091015224946.396355000@alcatraz.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-16 08:18:34 +02:00
Ingo Molnar a5912f6b3e x86: Document linker script ASSERT() quirk
Older binutils breaks if ASSERT() is used without a sink
for the output.

For example 2.14.90.0.6 is known to be broken, the link
fails with:

  LD      .tmp_vmlinux1
  ld:arch/x86/kernel/vmlinux.lds:678: parse error

Document this quirk in all three files that use it.

  See:    http://marc.info/?l=linux-kbuild&m=124930110427870&w=2
  See[2]: d2ba8b2 ("x86: Fix assert syntax in vmlinux.lds.S")

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
LKML-Reference: <4AD6523D.5030909@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-16 07:18:46 +02:00
Cyrill Gorcunov f88f2b4fdb x86: apic: Allow noop operations to be called almost at any time
Since only the apic noop driver is in use, we allow the caller to
use almost any operation it wants (among those the noop driver
supports, of course).

Initially it was reported by Ingo Molnar that apic noop
issues a warning for pkg id (which is actually a false positive
and should be eliminated).

So we keep the checking (and the warning) for read/write
operations while allowing any other ops to be used freely.

Also:
 - fix noop_cpu_to_logical_apicid, it should be 0.
 - rename noop_default_phys_pkg_id to noop_phys_pkg_id
   (we use default_ prefix for more general routines
    in apic subsystem).

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
LKML-Reference: <20091015150416.GC5331@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 17:26:53 +02:00
Ingo Molnar 713490e02e Merge branch 'tracing/core' into perf/core
Merge reason: to add event filter support we need the following
commits from the tracing tree:

 3f6fe06: tracing/filters: Unify the regex parsing helpers
 1889d20: tracing/filters: Provide basic regex support
 737f453: tracing/filters: Cleanup useless headers

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 11:34:00 +02:00
Ingo Molnar b226f744d4 Merge branch 'linus' into perf/core
Merge reason: pick up tools/perf/ changes from upstream.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 08:44:44 +02:00
Ingo Molnar db8590f504 Revert "x86: linker script syntax nits"
This reverts commit e9a63a4e55.

This breaks older binutils, where sink-less asserts are broken.

See this commit for further details:

  d2ba8b2: x86: Fix assert syntax in vmlinux.lds.S

Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <4AD6523D.5030909@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 08:09:55 +02:00
Ingo Molnar a0738a688d Merge branch 'linus' into x86/urgent
Merge reason: pull in latest, to be able to revert a patch there.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 08:07:30 +02:00
Linus Torvalds 601adfedba Merge branch 'topic/x86-lds-nits' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland
* 'topic/x86-lds-nits' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland:
  x86: linker script syntax nits
2009-10-14 15:33:05 -07:00
Linus Torvalds f061d83a2b Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Fix missing kernel-doc notation
  Revert "x86, timers: Check for pending timers after (device) interrupts"
  sched: Update the clock of runqueue select_task_rq() selected
2009-10-14 15:25:04 -07:00
Linus Torvalds ea87644105 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86/paravirt: Use normal calling sequences for irq enable/disable
  x86: fix kernel panic on 32 bits when profiling
  x86: Fix Suspend to RAM freeze on Acer Aspire 1511Lmi laptop
  x86, vmi: Mark VMI deprecated and schedule it for removal
2009-10-14 15:24:32 -07:00
Roland McGrath e9a63a4e55 x86: linker script syntax nits
The linker scripts grew some use of weirdly wrong linker script syntax.
It happens to work, but it's not what the syntax is documented to be.
Clean it up to use the official syntax.

Signed-off-by: Roland McGrath <roland@redhat.com>
CC: Ian Lance Taylor <iant@google.com>
2009-10-14 14:16:38 -07:00
Thomas Gleixner 05d86412ea x86: Remove BKL from apm_32
The lock/unlock kernel pair in do_open() got there with the BKL push
down and protects nothing. Remove it.

Replace the lock/unlock kernel in the ioctl code with a mutex to
protect standbys_pending and suspends_pending.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <20091010153349.365236337@linutronix.de>
2009-10-14 17:04:48 +02:00
Thomas Gleixner ac06ea2cd0 x86: Remove BKL from microcode
cycle_lock_kernel() in microcode_open() is a worthless exercise as
there is nothing to wait for. Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <20091010153349.196074920@linutronix.de>
2009-10-14 17:04:48 +02:00
Ingo Molnar 7ec13187ef x86, apic: Fix prototype in hw_irq.h
This warning:

 In file included from arch/x86/include/asm/ipi.h:23,
                  from arch/x86/kernel/apic/apic_noop.c:27:
 arch/x86/include/asm/hw_irq.h:105: warning: ‘struct irq_desc’ declared inside parameter list
 arch/x86/include/asm/hw_irq.h:105: warning: its scope is only this definition or declaration, which is probably not what you want

triggers because irq_desc is defined after hw_irq.h is included
in irq.h. Since it's pointer reference only, a forward declaration
of the type will solve the problem.
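
The pattern can be shown in a few lines of standalone C. The prototype
irq_number_of() and the single `irq` member are hypothetical and only
serve to make the sketch compile; the real struct irq_desc is much
larger.

```c
/*
 * Header-style part: only a pointer to the type is used, so an
 * incomplete (forward) declaration at file scope is enough. Without
 * it, the struct would be declared inside the parameter list and be
 * a different type from the one defined later.
 */
struct irq_desc;

int irq_number_of(struct irq_desc *desc);	/* hypothetical */

/* Later, the full definition becomes visible. */
struct irq_desc {
	int irq;	/* hypothetical member for the sketch */
};

int irq_number_of(struct irq_desc *desc)
{
	return desc->irq;
}
```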

LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 15:06:42 +02:00
Peter Zijlstra 799e2205ec sched: Disable SD_PREFER_LOCAL for MC/CPU domains
Yanmin reported that both tbench and hackbench were significantly
hurt by trying to keep tasks local on these domains, esp on small
cache machines.

So disable it in order to promote spreading outside of the cache
domains.

Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1255083400.8802.15.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 15:02:34 +02:00
Li Hong 89ccf465ab x86, perf_event: Rename 'performance counter interrupt'
In 'cdd6c482c9ff9c55475ee7392ec8f672eddb7be6', we renamed
Performance Counters -> Performance Events.

The name shown in /proc/interrupts also needs a change. I use
PMI (Performance monitoring interrupt) here, since it is the
official name used in Intel's documents.

Signed-off-by: Li Hong <lihong.hi@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20091014105039.GA22670@uhli>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 14:37:24 +02:00
Frederic Weisbecker c44fc77084 tracing: Move syscalls metadata handling from arch to core
Most of the syscalls metadata processing is done from arch.
But these operations are mostly generic across archs. Especially now
that we have a common variable name that expresses the number of
syscalls supported by an arch: NR_syscalls, the only remaining bits
that need to reside in arch is the syscall nr to addr translation.

v2: Compare syscalls symbols only after the "sys" prefix so that we
    avoid spurious mismatches with archs that have syscalls wrappers,
    in which case syscalls symbols have "SyS" prefixed aliases.
    (Reported by: Heiko Carstens)
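
The comparison described in the v2 note can be sketched in user space
as below. syscall_name_matches() is a hypothetical helper name, and the
length guard is an addition for the sketch, not something taken from
the patch.

```c
#include <stdbool.h>
#include <string.h>

#define SYS_PREFIX_LEN	3	/* length of both "sys" and "SyS" */

/*
 * Compare two syscall symbol names while ignoring the first three
 * characters, so that "SyS_read" and "sys_read" are treated as equal.
 */
static inline bool syscall_name_matches(const char *symbol,
					const char *name)
{
	if (strlen(symbol) < SYS_PREFIX_LEN || strlen(name) < SYS_PREFIX_LEN)
		return false;

	return strcmp(symbol + SYS_PREFIX_LEN, name + SYS_PREFIX_LEN) == 0;
}
```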

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
2009-10-14 09:53:56 +02:00
Dimitri Sivanich 9338ad6ffb x86, apic: Move SGI UV functionality out of generic IO-APIC code
Move UV specific functionality out of the generic IO-APIC code.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
LKML-Reference: <20091013203236.GD20543@sgi.com>
[ Cleaned up the code some more in their new places. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 09:17:09 +02:00
Dimitri Sivanich 6c2c502910 x86: SGI UV: Fix irq affinity for hub based interrupts
This patch fixes handling of uv hub irq affinity.  IRQs with ALL or
NODE affinity can be routed to cpus other than their originally
assigned cpu.  Those with CPU affinity cannot be rerouted.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
LKML-Reference: <20090930160259.GA7822@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 09:17:01 +02:00
Cyrill Gorcunov 2626eb2b2f x86, apic: Limit apic dumping, introduce new show_lapic= setup option
If a system has a large number of cpus, printing the contents of
all apics may take a long time.

We limit such output to 1 apic by default. But to be able to
see all apics or some part of them we introduce the
"show_lapic" setup option, which allows us to limit/unlimit the
number of APICs being dumped.

Example: apic=debug show_lapic=5, or apic=debug show_lapic=all
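
How such an option value could be interpreted is sketched below in
user-space C. This is not the kernel's parser: parse_show_lapic() is a
hypothetical name, and both falling back to the default on malformed
input and clamping to the number of cpus are assumptions of the sketch.

```c
#include <stdlib.h>
#include <string.h>

#define DEFAULT_SHOW_LAPIC	1	/* dump one apic by default */

/*
 * Turn the text after "show_lapic=" into a number of apics to dump.
 * "all" selects every cpu, a non-negative number selects that many,
 * and anything else keeps the default.
 */
static inline unsigned int parse_show_lapic(const char *arg,
					    unsigned int nr_cpus)
{
	char *end;
	long num;

	if (strcmp(arg, "all") == 0)
		return nr_cpus;

	num = strtol(arg, &end, 10);
	if (end == arg || *end != '\0' || num < 0)
		return DEFAULT_SHOW_LAPIC;	/* malformed: keep default */

	if ((unsigned long)num > nr_cpus)
		return nr_cpus;			/* assumption: clamp */

	return (unsigned int)num;
}
```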

Also move the apic_verbosity check further up so that helper routines
do not need to inspect it at all.

Suggested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: yinghai@kernel.org
Cc: macro@linux-mips.org
LKML-Reference: <20091013201022.926793122@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 09:17:01 +02:00
Cyrill Gorcunov a933c61829 x86, apic: Use apic noop driver
If the apic was disabled we may use the whole apic NOOP driver
instead of sparsely poking some functions in the apic driver.

Also NOOP would catch any inappropriate apic operation calls (not
just read/write).

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: yinghai@kernel.org
Cc: macro@linux-mips.org
LKML-Reference: <20091013201022.747817361@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 09:17:00 +02:00
Cyrill Gorcunov 9844ab11c7 x86, apic: Introduce the NOOP apic driver
Introduce the NOOP APIC driver. We should use it if the apic was
disabled due to hardware or software/firmware problems (including
the case where the user requested to disable it).

The driver attempts to catch any inappropriate apic operation
call and issues a warning.

It is also possible to use some apic operations, such as IPI calls
and read/write, without checking for apic presence, which should
make callers' code easier.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: yinghai@kernel.org
Cc: macro@linux-mips.org
LKML-Reference: <20091013201022.534682104@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 09:17:00 +02:00
Steven Rostedt 194ec34184 function-graph/x86: Replace unbalanced ret with jmp
The function graph tracer replaces the return address with a hook
to trace the exit of the function call. This hook will finish by
returning to the real location the function should return to.

But the current implementation uses a ret to jump to the real
return location. This causes an imbalance between calls and rets.
That is, the original function does a call, the ret goes to the
handler and then the handler does a ret without a matching call.

Although the function graph tracer itself still breaks the branch
predictor by replacing the original ret, by using a second ret and
causing an imbalance, it breaks the predictor even more.

This patch replaces the ret with a jmp to keep the calls and ret
balanced. I tested this on one box and it showed a 1.7% increase in
performance. Another box only showed a small 0.3% increase. But no
box that I tested this on showed a decrease in performance by
making this change.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091013203425.042034383@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14 08:13:53 +02:00
Linus Torvalds 80fa680d22 Merge git://git.infradead.org/~dwmw2/iommu-2.6.32
* git://git.infradead.org/~dwmw2/iommu-2.6.32:
  x86: Move pci_iommu_init to rootfs_initcall()
  Run pci_apply_final_quirks() sooner.
  Mark pci_apply_final_quirks() __init rather than __devinit
  Rename pci_init() to pci_apply_final_quirks(), move it to quirks.c
  intel-iommu: Yet another BIOS workaround: Isoch DMAR unit with no TLB space
  intel-iommu: Decode (and ignore) RHSA entries
  intel-iommu: Make "Unknown DMAR structure" message more informative
2009-10-13 10:04:40 -07:00
Hidetoshi Seto 8968f9d3dc perf_event, x86, mce: Use TRACE_EVENT() for MCE logging
This approach is the first baby step towards solving many of the
structural problems the x86 MCE logging code is having today:

 - It has a private ring-buffer implementation that has a number
   of limitations and has been historically fragile and buggy.

 - It is using a quirky /dev/mcelog ioctl driven ABI that is MCE
   specific. /dev/mcelog is not part of any larger logging
   framework and hence has remained on the fringes for many years.

 - The MCE logging code is still very unclean partly due to its ABI
   limitations. Fields are being reused for multiple purposes, and
   the whole message structure is limited and x86 specific to begin
   with.

All in all, the x86 tree would like to move away from this private
implementation of an event logging facility to a broader framework.

By using perf events we gain the following advantages:

 - Multiple user-space agents can access MCE events. We can have an
   mcelog daemon running but also a system-wide tracer capturing
   important events in flight-recorder mode.

 - Sampling support: the kernel and the user-space call-chain of MCE
   events can be stored and analyzed as well. This way actual patterns
   of bad behavior can be matched to precisely what kind of activity
   happened in the kernel (and/or in the app) around that moment in
   time.

 - Coupling with other hardware and software events: the PMU can track a
   number of other anomalies - monitoring software might choose to
   monitor those plus the MCE events as well - in one coherent stream of
   events.

 - Discovery of MCE sources - tracepoints are enumerated and tools can
   act upon the existence (or non-existence) of various channels of MCE
   information.

 - Filtering support: we just subscribe to and act upon the events we
   are interested in. Then even on a per event source basis there's
   in-kernel filter expressions available that can restrict the amount
   of data that hits the event channel.

 - Arbitrarily deep per cpu buffering of events - we can buffer 32
   entries or we can buffer as much as we want, as long as we have
   the RAM.

 - An NMI-safe ring-buffer implementation - mappable to user-space.

 - Built-in support for timestamping of events, PID markers, CPU
   markers, etc.

 - A rich ABI accessible over system call interface. Per cpu, per task
   and per workload monitoring of MCE events can be done this way. The
   ABI itself has a nice, meaningful structure.

 - Extensible ABI: new fields can be added without breaking tooling.
   New tracepoints can be added as the hardware side evolves. There
   are various parsers that can be used.

 - Lots of scheduling/buffering/batching modes of operation for MCE
   events. poll() support. mmap() support. read() support. You name it.

 - Rich tooling support: even without any MCE specific extensions added
   the 'perf' tool today offers various views of MCE data: perf report,
   perf stat, perf trace can all be used to view logged MCE events and
   perhaps correlate them to certain user-space usage patterns. But it
   can be used directly as well, for user-space agents and policy action
   in mcelog, etc.

With this we hope to achieve significant code cleanup and feature
improvements in the MCE code, and we hope to be able to drop the
/dev/mcelog facility in the end.

This patch is just a plain dumb dump of mce_log() records to
the tracepoints / perf events framework - a first proof of
concept step.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
LKML-Reference: <4AD42A0D.7050104@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-13 09:43:38 +02:00
Ingo Molnar 9dbdd6c41c Merge commit 'v2.6.32-rc4' into perf/core
Merge reason: we were on an -rc1 base, merge up to -rc4.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-13 09:31:34 +02:00
Ingo Molnar 2c96c142e9 Merge branch 'tracing/urgent' into tracing/core
Merge reason: Pick up tracing/filters fix from the urgent queue,
              we will queue up dependent patches.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-13 09:24:59 +02:00
Jeremy Fitzhardinge 71999d9862 x86/paravirt: Use normal calling sequences for irq enable/disable
Bastian Blank reported a boot crash with stackprotector enabled,
and debugged it back to edx register corruption.

For historical reasons irq enable/disable/save/restore had special
calling sequences to make them more efficient.  With the more
recent introduction of higher-level and more general optimisations
this is no longer necessary so we can just use the normal PVOP_
macros.

This fixes some residual bugs in the old implementations which left
edx liable to inadvertent clobbering. Also, fix some bugs in
__PVOP_VCALLEESAVE which were revealed by actual use.

Reported-by: Bastian Blank <bastian@waldi.eu.org>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
Cc: Xen-devel <xen-devel@lists.xensource.com>
LKML-Reference: <4AD3BC9B.7040501@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-13 09:22:01 +02:00
Arnaldo Carvalho de Melo a2e2725541 net: Introduce recvmmsg socket syscall
Meaning receive multiple messages, reducing the number of syscalls and
net stack entry/exit operations.

Next patches will introduce mechanisms where protocols that want to
optimize this operation will provide an unlocked_recvmsg operation.

This takes into account comments made by:

. Paul Moore: sock_recvmsg is called only for the first datagram,
  sock_recvmsg_nosec is used for the rest.

. Caitlin Bestler: recvmmsg now has a struct timespec timeout, that
  works in the same fashion as the ppoll one.

  If the underlying protocol returns a datagram with MSG_OOB set, this
  will make recvmmsg return right away with as many datagrams (+ the OOB
  one) as it has received so far.

. Rémi Denis-Courmont & Steven Whitehouse: If we receive N < vlen
  datagrams and then recvmsg returns an error, recvmmsg will return
  the successfully received datagrams, store the error and return it
  in the next call.

This paves the way for a subsequent optimization, sk_prot->unlocked_recvmsg,
where we will be able to acquire the lock only at batch start and end, not at
every underlying recvmsg call.
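
For illustration only, here is a minimal user-space sketch of how the
new call can be used. It assumes the interface as later exposed by
glibc (recvmmsg() with struct mmsghdr, enabled via _GNU_SOURCE); the
helper name recv_batch_demo() and the BATCH/BUFSZ sizes are
hypothetical and not part of this patch. It queues a few datagrams on
a local socket pair and drains them with one syscall:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* recvmmsg() and struct mmsghdr need this on glibc */
#endif
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <time.h>
#include <unistd.h>

#define BATCH 4		/* illustrative vlen limit, not from the patch */
#define BUFSZ 64	/* illustrative per-datagram buffer size */

/*
 * Queue `count` one-byte datagrams on a local socket pair, then drain
 * them with a single recvmmsg() call.  The first byte of each datagram
 * is copied to out[].  Returns the number of datagrams received, or -1.
 */
static int recv_batch_demo(int count, char *out)
{
	struct timespec timeout = { .tv_sec = 1, .tv_nsec = 0 };
	struct mmsghdr msgs[BATCH];
	struct iovec iov[BATCH];
	char bufs[BATCH][BUFSZ];
	int sv[2], i, got = -1;

	assert(count >= 1 && count <= BATCH && out != NULL);

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;

	for (i = 0; i < count; i++) {
		char byte = (char)('a' + i);

		if (send(sv[0], &byte, 1, 0) != 1)
			goto done;
	}

	/* One mmsghdr (with one iovec) per datagram we are willing to take. */
	memset(msgs, 0, sizeof(msgs));
	for (i = 0; i < count; i++) {
		iov[i].iov_base = bufs[i];
		iov[i].iov_len = BUFSZ;
		msgs[i].msg_hdr.msg_iov = &iov[i];
		msgs[i].msg_hdr.msg_iovlen = 1;
	}

	/* One syscall for the whole batch instead of `count` recvmsg() calls. */
	got = recvmmsg(sv[1], msgs, (unsigned int)count, 0, &timeout);
	for (i = 0; i < got; i++)
		out[i] = msgs[i].msg_len ? bufs[i][0] : '\0';
done:
	close(sv[0]);
	close(sv[1]);
	return got;
}
```

The per-message msg_len field reports how many bytes each datagram
carried. The deferred-error behaviour described above is not exercised
by this sketch.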

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-12 23:40:10 -07:00
Ingo Molnar 7a693d3f0d perf_events, x86: Fix event constraints code
There was namespace overlap due to a rename I did - this caused
the following build warning, reported by Stephen Rothwell against
linux-next x86_64 allmodconfig:

  arch/x86/kernel/cpu/perf_event.c: In function 'intel_get_event_idx':
  arch/x86/kernel/cpu/perf_event.c:1445: warning: 'event_constraint' is used uninitialized in this function

This is a real bug not just a warning: fix it by renaming the
global event-constraints table pointer to 'event_constraints'.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091013144223.369d616d.sfr@canb.auug.org.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-13 08:19:53 +02:00
H. Peter Anvin 98272ed0d2 x86: use kernel_stack_pointer() in kprobes.c
The way to obtain a kernel-mode stack pointer from a struct pt_regs in
32-bit mode is "subtle": the stack doesn't actually contain the stack
pointer, but rather the location where it would have been marks the
actual previous stack frame.  For clarity, use kernel_stack_pointer()
instead of coding this weirdness explicitly.
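
As a rough illustration of the subtlety, the following is a user-space
model, not the kernel's code. The struct regs_model layout and the
function name kernel_stack_pointer_model() are made up for this sketch.
It assumes, as the text above describes, that for a trap taken in
kernel mode the sp/ss slots are never written, so the address of the
sp slot is what marks the interrupted stack:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down model of the tail of a 32-bit pt_regs. */
struct regs_model {
	unsigned long ip;
	unsigned long cs;
	unsigned long flags;
	unsigned long sp;	/* only saved on a privilege-level change */
	unsigned long ss;	/* likewise */
};

/*
 * Model of the helper for the kernel-mode case: the value stored in
 * regs->sp is meaningless, but the location of that slot is where the
 * previous stack frame begins.
 */
static uintptr_t kernel_stack_pointer_model(const struct regs_model *regs)
{
	assert(regs != NULL);
	return (uintptr_t)&regs->sp;
}
```

Reading regs->sp directly in this situation would pick up whatever
happens to live at that stack location, which is the kind of mistake
the helper is meant to hide.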

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
2009-10-12 14:19:35 -07:00
H. Peter Anvin 5ca6c0ca5d x86: use kernel_stack_pointer() in kgdb.c
The way to obtain a kernel-mode stack pointer from a struct
pt_regs in 32-bit mode is "subtle": the stack doesn't actually
contain the stack pointer, but rather the location where it would
have been marks the actual previous stack frame.  For clarity, use
kernel_stack_pointer() instead of coding this weirdness
explicitly.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
2009-10-12 14:19:35 -07:00
H. Peter Anvin a343c75d33 x86: use kernel_stack_pointer() in dumpstack.c
The way to obtain a kernel-mode stack pointer from a struct pt_regs in
32-bit mode is "subtle": the stack doesn't actually contain the stack
pointer, but rather the location where it would have been marks the
actual previous stack frame.  For clarity, use kernel_stack_pointer()
instead of coding this weirdness explicitly.

Furthermore, user_mode() is only valid when the process is known to
not run in V86 mode.  Use the safer user_mode_vm() instead.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-10-12 14:19:34 -07:00
H. Peter Anvin def3c5d0a3 x86: use kernel_stack_pointer() in process_32.c
The way to obtain a kernel-mode stack pointer from a struct pt_regs in
32-bit mode is "subtle": the stack doesn't actually contain the stack
pointer, but rather the location where it would have been marks the
actual previous stack frame.  For clarity, use kernel_stack_pointer()
instead of coding this weirdness explicitly.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-10-12 14:19:34 -07:00
Arjan van de Ven ad8f4356af x86: Don't use the strict copy checks when branch profiling is in use
The branch profiling creates very complex code for each if
statement, to the point that gcc has trouble even analyzing
something as simple as

  if (count > 5)
      count = 5;

This then means that causing an error on code that gcc cannot
analyze for copy_from_user() and co is not very productive.

This patch excludes the strict copy checks in the case of branch
profiling being enabled.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <20091006070452.5e1fc119@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12 22:29:51 +02:00
H. Peter Anvin d1705c558c x86: fix kernel panic on 32 bits when profiling
The latest kernel panics during boot on i386 machines when profile=2
is set on the kernel command line.  It is due to 'sp' being incorrect
in profile_pc().

BUG: unable to handle kernel NULL pointer dereference at 00000246
IP: [<c01288b6>] profile_pc+0x2a/0x48
*pde = 00000000
Oops: 0000 [#1] SMP

This differs from the original version by Alex Shi in that we use the
kernel_stack_pointer() inline already defined in <asm/ptrace.h> for
this purpose, instead of #ifdef.

Originally-by: Alex Shi <alex.shi@intel.com>
Cc: "Chen, Tim C" <tim.c.chen@intel.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-10-12 11:53:51 -07:00
Brian Gerst ae24ffe5ec x86, 64-bit: Move K8 B step iret fixup to fault entry asm
Move the handling of truncated %rip from an iret fault to the fault
entry path.

This allows x86-64 to use the standard search_extable() function.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <1255357103-5418-1-git-send-email-brgerst@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12 18:29:46 +02:00
Jan Beulich 7a4b7e5e74 x86: Fix Suspend to RAM freeze on Acer Aspire 1511Lmi laptop
Move the trampoline and accessors back out of .cpuinit.* for the
case of 64-bits+ACPI_SLEEP.

This solves s2ram hangs reported in:

  http://bugzilla.kernel.org/show_bug.cgi?id=14279

Reported-and-bisected-by: Christian Casteyde <casteyde.christian@free.fr>
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: <bugzilla-daemon@bugzilla.kernel.org>
Cc: "Andrew Morton" <akpm@linux-foundation.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12 18:06:48 +02:00
David Woodhouse 9a821b2316 x86: Move pci_iommu_init to rootfs_initcall()
We want this to happen after the PCI quirks, which are now running at
the very end of the fs_initcalls.

This works around the BIOS problems which were originally addressed by
commit db8be50c43 ('USB: Work around BIOS
bugs by quiescing USB controllers earlier'), which was reverted in
commit d93a8f829f.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2009-10-12 14:42:11 +01:00
Borislav Petkov fb2531953f mce, edac: Use an atomic notifier for MCEs decoding
Add an atomic notifier which ensures proper locking when conveying
MCE info to EDAC for decoding. The actual notifier call overrides a
default, negative priority notifier.

Note: make sure we register the default decoder only once since
mcheck_init() runs on each CPU.
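
The priority-override idea can be sketched in plain C as follows. This
is a simplified user-space model and makes several assumptions: the
struct notifier type, the MODEL_* return codes and the two decoder
callbacks are invented for illustration, and the locking that makes
the real chain "atomic" is not modelled at all. It only shows how a
default handler with negative priority is shadowed once a
higher-priority decoder registers:

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NOTIFY_DONE 0	/* keep walking the chain */
#define MODEL_NOTIFY_STOP 1	/* handled, stop walking the chain */

struct notifier {
	int (*call)(struct notifier *nb, void *data);
	int priority;		/* higher priority runs first */
	struct notifier *next;
};

/* Insert nb so that the chain stays sorted by descending priority. */
static void chain_register(struct notifier **head, struct notifier *nb)
{
	assert(head != NULL && nb != NULL && nb->call != NULL);
	while (*head && (*head)->priority >= nb->priority)
		head = &(*head)->next;
	nb->next = *head;
	*head = nb;
}

/* Call handlers in priority order until one of them asks to stop. */
static int chain_call(struct notifier *head, void *data)
{
	int ret = MODEL_NOTIFY_DONE;

	for (; head; head = head->next) {
		ret = head->call(head, data);
		if (ret == MODEL_NOTIFY_STOP)
			break;
	}
	return ret;
}

/* Fallback decoder: registered with negative priority. */
static int default_decode(struct notifier *nb, void *data)
{
	(void)nb;
	*(int *)data = 1;
	return MODEL_NOTIFY_STOP;
}

/* Stand-in for a real decoder that overrides the fallback. */
static int edac_decode(struct notifier *nb, void *data)
{
	(void)nb;
	*(int *)data = 2;
	return MODEL_NOTIFY_STOP;
}
```

With only the default decoder registered, it handles the event; once
the second decoder is registered at a higher priority, it runs first
and stops the walk before the default is reached.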

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <20091003065752.GA8935@liondog.tnic>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12 12:24:45 +02:00
Joe Perches 3c355863fb testmmiotrace.c: Add and use pr_fmt(fmt)
- Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt.
- Strip MODULE_NAME from pr_<level>s.
- Remove MODULE_NAME definition.
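
A small user-space sketch of the pr_fmt() pattern, under stated
assumptions: KBUILD_MODNAME is normally supplied by the build system
and is hard-coded here, and model_pr_info() is a hypothetical stand-in
that formats into a buffer instead of writing to the kernel log. The
point is that string-literal concatenation adds the prefix at compile
time, so call sites no longer spell out the module name:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define KBUILD_MODNAME "testmmiotrace"	/* set by kbuild in a real build */
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

/* Stand-in for pr_info(): every format string passes through pr_fmt(). */
#define model_pr_info(buf, size, fmt, ...) \
	snprintf(buf, size, pr_fmt(fmt), ##__VA_ARGS__)

/* A call site: note that the module name does not appear here. */
static int format_demo(char *buf, size_t size, unsigned long addr)
{
	assert(buf != NULL && size > 0);
	return model_pr_info(buf, size, "ioremap at 0x%lx\n", addr);
}

/* Returns 1 if msg starts with the "<modname>: " prefix. */
static int has_module_prefix(const char *msg)
{
	const char *prefix = KBUILD_MODNAME ": ";

	return strncmp(msg, prefix, strlen(prefix)) == 0;
}
```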

Signed-off-by: Joe Perches <joe@perches.com>
LKML-Reference: <3bb66cc7f85f77b9416902e1be7076f7e3f4ad48.1254701151.git.joe@perches.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12 08:05:41 +02:00
Joe Perches 3bb258bf43 ftrace.c: Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
- Remove prefixes from pr_<level>, use pr_fmt(fmt).

No change in output.

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <9b377eefae9e28c599dd4a17bdc81172965e9931.1254701151.git.joe@perches.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12 08:05:40 +02:00
Yinghai Lu 15b812f1d0 pci: increase alignment to make more space for hidden code
As reported in

	http://bugzilla.kernel.org/show_bug.cgi?id=13940

on some systems, when ACPI is enabled, ACPI clears some BARs for some
devices without reason, and the kernel then needs to assign resources
to them.  It then apparently hits some undocumented resource conflict,
resulting in non-working devices.

Try increasing the alignment to get a safer range for unassigned devices.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-11 14:43:36 -07:00
Alexey Dobriyan d43c36dc6b headers: remove sched.h from interrupt.h
Now that m68k's task_thread_info() no longer refers to current,
it's possible to remove sched.h from interrupt.h and not break m68k!
Many thanks to Heiko Carstens for allowing this.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-10-11 11:20:58 -07:00
Joerg Roedel c5cca146aa x86/amd-iommu: Workaround for erratum 63
There is an erratum for IOMMU hardware which documents
undefined behavior when forwarding SMI requests from
peripherals and the DTE of that peripheral has a sysmgt
value of 01b. This problem caused weird IO_PAGE_FAULTS in my
case.
This patch implements the suggested workaround for that
erratum in the AMD IOMMU driver.  The erratum is
documented with number 63.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-10-09 18:37:46 +02:00
Ingo Molnar e7ab0f7b50 Revert "x86, timers: Check for pending timers after (device) interrupts"
This reverts commit 9bcbdd9c58.

The real bug producing LatencyTop latencies has been fixed in:

  f5dc375: sched: Update the clock of runqueue select_task_rq() selected

And the commit being reverted here triggers local timer processing
from every device IRQ. If device IRQs come in at a high frequency,
this could cause a performance regression.

The commit being reverted here purely 'fixed' the reported latency
as a side effect, because CPUs were being moved out of idle more
often.

Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Frans Pop <elendil@planet.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <20091008064041.67219b13@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-09 15:58:20 +02:00
Peter Zijlstra f3834b9ef6 x86: Generate cmpxchg build failures
Rework the x86 cmpxchg() implementation to generate build failures
when used on improper types.
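
A hedged user-space sketch of the general idea (rejecting unsupported
operand sizes at build time) is shown below. It deliberately swaps in
different tools from the kernel's: C11 _Static_assert for the size
check and the GCC/Clang __sync_val_compare_and_swap() builtin for the
operation itself. The names model_cmpxchg() and try_claim() are
hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compare-and-swap that refuses to compile for operand sizes other
 * than 1, 2, 4 or 8 bytes.  Uses a GNU statement expression so the
 * macro can both check and return a value.
 */
#define model_cmpxchg(ptr, old, new)					\
({									\
	_Static_assert(sizeof(*(ptr)) == 1 || sizeof(*(ptr)) == 2 ||	\
		       sizeof(*(ptr)) == 4 || sizeof(*(ptr)) == 8,	\
		       "cmpxchg used on an unsupported operand size");	\
	__sync_val_compare_and_swap((ptr), (old), (new));		\
})

/* Example user: take ownership of a slot if it is free (0 means free). */
static int try_claim(uint32_t *owner, uint32_t me)
{
	assert(owner != 0 && me != 0);
	return model_cmpxchg(owner, 0u, me) == 0u;
}
```

Pointing model_cmpxchg() at, say, a 16-byte struct would stop the
build with the message above instead of producing code that silently
does the wrong thing.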

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <1254771187.21044.22.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-09 15:57:00 +02:00
Peter Zijlstra fe9081cc9b perf, x86: Add simple group validation
Refuse to add events when the group wouldn't fit onto the PMU
anymore.

Naive implementation.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@gmail.com>
LKML-Reference: <1254911461.26976.239.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-09 15:56:14 +02:00
Stephane Eranian b690081d4d perf_events: Add event constraints support for Intel processors
On some Intel processors, not all events can be measured in all
counters. Some events can only be measured in one particular
counter, for instance. Assigning an event to the wrong counter does
not crash the machine, but it yields bogus counts, i.e., a silent
error.

This patch changes the event to counter assignment logic to take
into account event constraints for Intel P6, Core and Nehalem
processors. There are no constraints on Intel Atom. There are
constraints on Intel Yonah (Core Duo) but they are not provided in
this patch given that this processor is not yet supported by
perf_events.

As a result of the constraints, it is possible for some event
groups to never actually be loaded onto the PMU if they contain two
events which can only be measured on a single counter. That
situation can be detected with the scaling information extracted
with read().
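
To make the assignment problem concrete, here is a minimal sketch
assuming a simple bitmask representation of constraints. The
struct event_model type, NUM_COUNTERS and assign_counter() are
hypothetical and much simpler than the actual patch. Each event
carries a mask of the counters it may use, and the allocator picks the
lowest free counter allowed by that mask:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_COUNTERS 4	/* hypothetical number of generic counters */

struct event_model {
	uint32_t allowed;	/* bit i set: event may run on counter i */
};

/*
 * Assign ev to the lowest-numbered free counter permitted by its
 * constraint mask.  `used` tracks counters that are already taken.
 * Returns the counter index, or -1 if no permitted counter is free.
 */
static int assign_counter(const struct event_model *ev, uint32_t *used)
{
	int i;

	assert(ev != NULL && used != NULL);

	for (i = 0; i < NUM_COUNTERS; i++) {
		uint32_t bit = 1u << i;

		if ((ev->allowed & bit) && !(*used & bit)) {
			*used |= bit;
			return i;
		}
	}
	return -1;
}
```

Two events that are both restricted to the same single counter can
never be scheduled together in this model: the second assignment
fails, which corresponds to the group that is never loaded onto the
PMU.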

Signed-off-by: Stephane Eranian <eranian@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1254840129-6198-3-git-send-email-eranian@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-09 15:56:12 +02:00
Stephane Eranian 04a705df47 perf_events: Check for filters on fixed counter events
Intel fixed counters do not support all the filters possible with a
generic counter. Thus, if a fixed counter event is passed but with
certain filters set, then the fixed_mode_idx() function must fail
and the event must be measured in a generic counter instead.

Rejected filters are: inv, edge, cnt-mask.
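
The check can be sketched as a simple mask test. The bit positions
below follow the architectural event-select register layout (edge
detect at bit 18, invert at bit 23, counter mask in bits 24-31); the
macro names and the helper fixed_counter_allowed() are hypothetical
and are not taken from the patch:

```c
#include <assert.h>
#include <stdint.h>

#define EVTSEL_EDGE	(1ULL << 18)	/* edge detect */
#define EVTSEL_INV	(1ULL << 23)	/* invert counter-mask comparison */
#define EVTSEL_CMASK	(0xFFULL << 24)	/* counter mask (threshold) */

/*
 * Returns 1 if an event with this raw config may be placed on a fixed
 * counter, 0 if it uses a filter that only generic counters support.
 */
static int fixed_counter_allowed(uint64_t config)
{
	/* This model only looks at the low 32 bits of the config. */
	assert((config >> 32) == 0);

	return (config & (EVTSEL_EDGE | EVTSEL_INV | EVTSEL_CMASK)) == 0;
}
```

An event that fails this test is not rejected outright; as described
above, it falls back to being measured in a generic counter.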

Signed-off-by: Stephane Eranian <eranian@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1254840129-6198-2-git-send-email-eranian@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-09 15:56:10 +02:00
John Kacur 5a943617ef x86, cpuid: Simplify the code in cpuid_open
Peter picked up my patch for tip/x86/cpu that removes the bkl in
cpuid_open. Ingo subsequently merged that into tip/master.

This patch folds back in tglx's 55968ede164ae523692f00717f50cd926f1382a0
to my patch that removed the bkl.

This simplifies the code, and makes it consistent with the changes to
kill the bkl in msr.c as well.

Originally-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Kacur <jkacur@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-10-08 16:14:02 -07:00
Alok Kataria d0153ca35d x86, vmi: Mark VMI deprecated and schedule it for removal
Add text in feature-removal.txt indicating that VMI will be removed in
the 2.6.37 timeframe.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
LKML-Reference: <1254193238.13456.48.camel@ank32.eng.vmware.com>
[ removed a bogus Kconfig change, marked (DEPRECATED) in Kconfig ]
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-08 22:27:55 +02:00
Linus Torvalds 624235c5b3 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, pci: Correct spelling in a comment
  x86: Simplify bound checks in the MTRR code
  x86: EDAC: carve out AMD MCE decoding logic
  initcalls: Add early_initcall() for modules
  x86: EDAC: MCE: Fix MCE decoding callback logic
2009-10-08 12:06:36 -07:00
Arjan van de Ven 9bcbdd9c58 x86, timers: Check for pending timers after (device) interrupts
Now that range timers and deferred timers are common, I found a
problem with these using the "perf timechart" tool. Frans Pop also
reported high scheduler latencies via LatencyTop, when using
iwlagn.

It turns out that on x86, these two 'opportunistic' timers only get
checked when another "real" timer happens. These opportunistic
timers have the objective to save power by hitchhiking on other
wakeups, so as to avoid CPU wakeups by themselves as much as possible.

The change in this patch runs this check not only at timer
interrupts, but at all (device) interrupts. The effect is that:

 1) the deferred timers/range timers get delayed less

 2) the range timers cause fewer wakeups by themselves because
    the percentage of hitchhiking on existing wakeup events goes up.

I've verified that the patch works using "perf timechart"; the
bug originally exposed is gone with this patch. Frans also reported
success - the latencies are now down in the expected ~10 msec
range.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Tested-by: Frans Pop <elendil@planet.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <20091008064041.67219b13@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-08 17:27:27 +02:00
John Kacur 170a0bc380 x86, cpuid: Remove the bkl from cpuid_open()
Most of the variables are local to the function. It IS possible that,
for struct cpuinfo_x86 *c, c could point to the same area. However,
it is only read.

Signed-off-by: John Kacur <jkacur@redhat.com>
LKML-Reference: <alpine.LFD.2.00.0910072016190.15183@localhost.localdomain>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-10-07 15:41:21 -07:00
Frederic Weisbecker d6c304055b x86, msr: Remove the bkl from msr_open()
Remove the big kernel lock from msr_open() as it doesn't protect
anything there.

The only racy event that can happen here is a concurrent cpu shutdown.

So let's look at what could be racy during/after the above event:

- The cpu_online() check is racy, but the bkl doesn't help with
  that anyway: it disables preemption, but we may be checking a
  cpu other than the current one.
  Also, the cpu can still be offlined between the open and read calls.

- The cpu_data(cpu) returns a safe pointer too. It won't be released on
  cpu offlining. But some fields can be changed from
  arch/x86/kernel/smpboot.c:remove_siblinginfo() :

	- phys_proc_id
	- cpu_core_id

  Those are not read from msr_open(). What we are checking is the
  x86_capability that is left untouched on offlining.

So this removal looks safe.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Sven-Thorsten Dietrich <sdietrich@suse.de>
LKML-Reference: <1254944602-7382-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-10-07 13:47:19 -07:00
Linus Torvalds 19d031e052 Merge branch 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: add support for change_pte mmu notifiers
  KVM: MMU: add SPTE_HOST_WRITEABLE flag to the shadow ptes
  KVM: MMU: dont hold pagecount reference for mapped sptes pages
  KVM: Prevent overflow in KVM_GET_SUPPORTED_CPUID
  KVM: VMX: flush TLB with INVEPT on cpu migration
  KVM: fix LAPIC timer period overflow
  KVM: s390: fix memsize >= 4G
  KVM: SVM: Handle tsc in svm_get_msr/svm_set_msr correctly
  KVM: SVM: Fix tsc offset adjustment when running nested
2009-10-05 12:07:39 -07:00
Linus Torvalds 46302b46e5 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Don't leak 64-bit kernel register values to 32-bit processes
  x86, SLUB: Remove unused CONFIG FAST_CMPXCHG_LOCAL
  x86: earlyprintk: Fix regression to handle serial,ttySn as 1 arg
  x86: Don't generate cmpxchg8b_emu if CONFIG_X86_CMPXCHG64=y
  x86: Fix csum_ipv6_magic asm memory clobber
  x86: Optimize cmpxchg64() at build-time some more
2009-10-05 12:02:18 -07:00
Izik Eidus 3da0dd433d KVM: add support for change_pte mmu notifiers
This is needed for KVM if it wants KSM to directly map pages into its
shadow page tables.

[marcelo: cast pfn assignment to u64]

Signed-off-by: Izik Eidus <ieidus@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-10-04 17:04:53 +02:00
Izik Eidus 1403283acc KVM: MMU: add SPTE_HOST_WRITEABLE flag to the shadow ptes
This flag indicates that the host physical page we are pointing to from
the spte is write protected, and therefore we can't change its access
to writable unless we run get_user_pages(write = 1).

(this is needed for change_pte support in kvm)

Signed-off-by: Izik Eidus <ieidus@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-10-04 17:04:50 +02:00
Izik Eidus acb66dd051 KVM: MMU: dont hold pagecount reference for mapped sptes pages
When using mmu notifiers, we are allowed to remove the page count
reference taken by get_user_pages on a specific page that is mapped
inside the shadow page tables.

This is needed so we can balance the pagecount against mapcount
checking.

(Right now KVM increases the pagecount and does not increase the
mapcount when mapping a page into a shadow page table entry,
so when comparing pagecount against mapcount, you get no
reliable result.)

Signed-off-by: Izik Eidus <ieidus@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-10-04 17:04:48 +02:00