kmalloc for flush_words resulted in zero size allocation when no
k8_northbridges existed. Short circuit the code path for this case.
Also remove uneeded zeroing of num_k8_northbridges just after checking if
it is zero.
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fix:
mm/slab: fix section mismatch warning
mm: fix section mismatch warnings
init/main: use __init_refok to fix section mismatch
kbuild: introduce __init_refok/__initdata_refok to supress section mismatch warnings
all-archs: consolidate .data section definition in asm-generic
all-archs: consolidate .text section definition in asm-generic
kbuild: add "Section mismatch" warning whitelist for powerpc
kbuild: make better section mismatch reports on i386, arm and mips
kbuild: make modpost section warnings clearer
kconfig: search harder for curses library in check-lxdialog.sh
kbuild: include limits.h in sumversion.c for PATH_MAX
powerpc: Fix the MODALIAS generation in modpost for of devices
The vsyscall time() function basically returns the second portion of
xtime directly. This however means that there is about a ticks worth of
time each second where time() will return a second value less then what
gettimeofday() does.
Additionally, this window where vtime() is behind vgettimeofday() grows
when dynticks is enabled, so its probably good to get this in before
dynticks lands.
Big thanks to Sripathi for noticing this issue and creating a test case
to work with!
This patch changes the vtime() implemenation to call vgettimeofday(),
much as syscall time() implementation calls gettimeofday().
2.6.21 stable candidate too
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In
commit d358788f3f
Author: Russell King <rmk@dyn-67.arm.linux.org.uk>
Date: Mon Mar 20 20:00:09 2006 +0000
Glen Turner reported that writing LFCR rather than the more
traditional CRLF causes issues with some terminals.
Since this afflicts many serial drivers, extract the common code to a
library function (uart_console_write) and arrange for each driver to
supply a "putchar" function.
but early_printk is left out.
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
First thing mm.h does is including sched.h solely for can_do_mlock() inline
function which has "current" dereference inside. By dealing with can_do_mlock()
mm.h can be detached from sched.h which is good. See below, why.
This patch
a) removes unconditional inclusion of sched.h from mm.h
b) makes can_do_mlock() normal function in mm/mlock.c
c) exports can_do_mlock() to not break compilation
d) adds sched.h inclusions back to files that were getting it indirectly.
e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were
getting them indirectly
Net result is:
a) mm.h users would get less code to open, read, preprocess, parse, ... if
they don't need sched.h
b) sched.h stops being dependency for significant number of files:
on x86_64 allmodconfig touching sched.h results in recompile of 4083 files,
after patch it's only 3744 (-8.3%).
Cross-compile tested on
all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
alpha alpha-up
arm
i386 i386-up i386-defconfig i386-allnoconfig
ia64 ia64-up
m68k
mips
parisc parisc-up
powerpc powerpc-up
s390 s390-up
sparc sparc-up
sparc64 sparc64-up
um-x86_64
x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig
as well as my two usual configs.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This reverts commit f64da958df.
Andi Kleen is unhappy with the changes, and they really do not seem
worth it. IPMI could use DIE_NMI_IPI instead of the new callback, even
though that ends up having its own set of problems too, mainly because
the IPMI code cannot really know the NMI was from IPMI or not.
Manually fix up conflicts in arch/x86_64/kernel/traps.c and
drivers/char/ipmi/ipmi_watchdog.c.
Cc: Andi Kleen <ak@suse.de>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Corey Minyard <minyard@acm.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The earlier change to call the bp mtrr init from bugs.c broke
on some configurations due to missing includes. Noticed
by "Avuton Olrich" <avuton@gmail.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The code was ok, but triggered warnings for calling __init from
__cpuinit. Instead call it from check_bugs instead.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I'm using a custom BIOS to configure the northbridge GART at address
0x80000000, size 2G. Linux complains:
"Aperture from northbridge cpu 0 beyond 4GB. Ignoring."
I think there's an off-by-two error in arch/x86_64/kernel/aperture.c:
AK: use correct types for i386
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'audit.b38' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
[PATCH] Abnormal End of Processes
[PATCH] match audit name data
[PATCH] complete message queue auditing
[PATCH] audit inode for all xattr syscalls
[PATCH] initialize name osid
[PATCH] audit signal recipients
[PATCH] add SIGNAL syscall class (v3)
[PATCH] auditing ptrace
o x86_64 kernel needs to be compiled for 2MB aligned addresses. Currently
we are using BUILD_BUG_ON() to warn the user if he has not done so. But
looks like folks are not finding message very intutive and don't open
the respective c file to find problem source. (Bug 8439)
arch/x86_64/kernel/head64.c: In function 'x86_64_start_kernel':
arch/x86_64/kernel/head64.c:70: error: size of array 'type name' is negative
o Using preprocessor directive #error to print a better message if
CONFIG_PHYSICAL_START is not aligned to 2MB boundary.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When auditing syscalls that send signals, log the pid and security
context for each target process. Optimize the data collection by
adding a counter for signal-related rules, and avoiding allocating an
aux struct unless we have more than one target process. For process
groups, collect pid/context data in blocks of 16. Move the
audit_signal_info() hook up in check_kill_permission() so we audit
attempts where permission is denied.
Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (25 commits)
sound: convert "sound" subdirectory to UTF-8
MAINTAINERS: Add cxacru website/mailing list
include files: convert "include" subdirectory to UTF-8
general: convert "kernel" subdirectory to UTF-8
documentation: convert the Documentation directory to UTF-8
Convert the toplevel files CREDITS and MAINTAINERS to UTF-8.
remove broken URLs from net drivers' output
Magic number prefix consistency change to Documentation/magic-number.txt
trivial: s/i_sem /i_mutex/
fix file specification in comments
drivers/base/platform.c: fix small typo in doc
misc doc and kconfig typos
Remove obsolete fat_cvf help text
Fix occurrences of "the the "
Fix minor typoes in kernel/module.c
Kconfig: Remove reference to external mqueue library
Kconfig: A couple of grammatical fixes in arch/i386/Kconfig
Correct comments in genrtc.c to refer to correct /proc file.
Fix more "deprecated" spellos.
Fix "deprecated" typoes.
...
Fix trivial comment conflict in kernel/relay.c.
Recently a few direct accesses to the thread_info in the task structure snuck
back, so this wraps them with the appropriate wrapper.
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since nonboot CPUs are now disabled after tasks and devices have been
frozen and the CPU hotplug infrastructure is used for this purpose, we need
special CPU hotplug notifications that will help the CPU-hotplug-aware
subsystems distinguish normal CPU hotplug events from CPU hotplug events
related to a system-wide suspend or resume operation in progress. This
patch introduces such notifications and causes them to be used during
suspend and resume transitions. It also changes all of the
CPU-hotplug-aware subsystems to take these notifications into consideration
(for now they are handled in the same way as the corresponding "normal"
ones).
[oleg@tv-sign.ru: cleanups]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix the misspellings of "propogate", "writting" and (oh, the shame
:-) "kenrel" in the source tree.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Make x86 COM ports into platform devices and don't probe for them
if we have PNP.
This prevents double discovery, where a device was found both by
the legacy probe and by 8250_pnp, e.g.,
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:02: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
This also means IRDA devices without a UART PNP ID will no longer be
claimed by the serial driver, which might require changes in IRDA
drivers and administration.
In addition to this patch, you may need to configure a setserial init
script, e.g., /etc/init.d/setserial, so it doesn't poke legacy UART
stuff back in. On Debian, "dpkg-reconfigure setserial" with the "kernel"
option does this.
To force the old legacy probe behavior even when we have PNPBIOS or
ACPI, load the new legacy_serial module (or build 8250 static) with
the "legacy_serial.force" option.
[akpm@linux-foundation.org: fix makefiles]
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Keith Owens <kaos@ocs.com.au>
Cc: Len Brown <lenb@kernel.org>
Cc: Adam Belay <ambx1@neo.rr.com>
Cc: Matthieu CASTET <castet.matthieu@free.fr>
Cc: Jean Tourrilhes <jt@hpl.hp.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Ville Syrjala <syrjala@sci.fi>
Cc: Russell King <rmk+serial@arm.linux.org.uk>
Cc: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch provides a debugfs knob to turn kprobes on/off
o A new file /debug/kprobes/enabled indicates if kprobes is enabled or
not (default enabled)
o Echoing 0 to this file will disarm all installed probes
o Any new probe registration when disabled will register the probe but
not arm it. A message will be printed out in such a case.
o When a value 1 is echoed to the file, all probes (including ones
registered in the intervening period) will be enabled
o Unregistration will happen irrespective of whether probes are globally
enabled or not.
o Update Documentation/kprobes.txt to reflect these changes. While there
also update the doc to make it current.
We are also looking at providing sysrq key support to tie to the disabling
feature provided by this patch.
[akpm@linux-foundation.org: Use bool like a bool!]
[akpm@linux-foundation.org: add printk facility levels]
[cornelia.huck@de.ibm.com: Add the missing arch_trampoline_kprobe() for s390]
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- consolidate duplicate code in all arch_prepare_kretprobe instances
into common code
- replace various odd helpers that use hlist_for_each_entry to get
the first elemenet of a list with either a hlist_for_each_entry_save
or an opencoded access to the first element in the caller
- inline add_rp_inst into it's only remaining caller
- use kretprobe_inst_table_head instead of opencoding it
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In certain cases like when the real return address can't be found or when
the number of tracked calls to a kretprobed function is less than the
number of returns, we may not be able to find the correct return address
after processing a kretprobe. Currently we just do a BUG_ON, but no
information is provided about the actual failing kretprobe.
Print out details of the kretprobe before calling BUG().
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove includes of <linux/smp_lock.h> where it is not used/needed.
Suggested by Al Viro.
Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch moves the die notifier handling to common code. Previous
various architectures had exactly the same code for it. Note that the new
code is compiled unconditionally, this should be understood as an appel to
the other architecture maintainer to implement support for it aswell (aka
sprinkling a notify_die or two in the proper place)
arm had a notifiy_die that did something totally different, I renamed it to
arm_notify_die as part of the patch and made it static to the file it's
declared and used at. avr32 used to pass slightly less information through
this interface and I brought it into line with the other architectures.
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix vmalloc_sync_all bustage]
[bryan.wu@analog.com: fix vmalloc_sync_all in nommu]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: <linux-arch@vger.kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The console subsystem already has an idea of a boot console, using the
CON_BOOT flag. The implementation has some flaws though. The major
problem is that presence of a boot console makes register_console() ignore
any other console devices (unless explicitly specified on the kernel
command line).
This patch fixes the console selection code to *not* consider a boot
console a full-featured one, so the first non-boot console registering will
become the default console instead. This way the unregister call for the
boot console in the register_console() function actually triggers and the
handover from the boot console to the real console device works smoothly.
Added a printk for the handover, so you know which console device the
output goes to when the boot console stops printing messages.
The disable_early_printk() call is obsolete with that patch, explicitly
disabling the early console isn't needed any more as it works automagically
with that patch.
I've walked through the tree, dropped all disable_early_printk() instances
found below arch/ and tagged the consoles with CON_BOOT if needed. The
code is tested on x86, sh (thanks to Paul) and mips (thanks to Ralf).
Changes to last version: Rediffed against -rc3, adapted to mips cleanups by
Ralf, fixed "udbg-immortal" cmd line arg on powerpc.
Signed-off-by: Gerd Hoffmann <kraxel@exsuse.de>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Convert over to the new NMI handling for getting IPMI watchdog timeouts via an
NMI. This add config options to know if there is the ability to receive NMIs
and if it has an NMI post processing call. Then it modifies the IPMI watchdog
to take advantage of this so that it can know if an NMI comes in.
It also adds testing that the IPMI NMI watchdog works.
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make swsusp use memory bitmaps instead of page flags for marking 'nosave' and
free pages. This allows us to 'recycle' two page flags that can be used for
other purposes. Also, the memory needed to store the bitmaps is allocated
when necessary (ie. before the suspend) and freed after the resume which is
more reasonable.
The patch is designed to minimize the amount of changes and there are some
nice simplifications and optimizations possible on top of it. I am going to
implement them separately in the future.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Handle MAP_FIXED in x86_64 arch_get_unmapped_area(), simple case, just return
the address as passed in
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This was broken. It adds complexity, for no good reason. Rather than
separate __pa() and __pa_symbol(), we should deprecate __pa_symbol(),
and preferably __pa() too - and just use "virt_to_phys()" instead, which
is more readable and has nicer semantics.
However, right now, just undo the separation, and make __pa_symbol() be
the exact same as __pa(). That fixes the bugs this patch introduced,
and we can do the fairly obvious cleanups later.
Do the new __phys_addr() function (which is now the actual workhorse for
the unified __pa()/__pa_symbol()) as a real external function, that way
all the potential issues with compile/link-time optimizations of
constant symbol addresses go away, and we can also, if we choose to, add
more sanity-checking of the argument.
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (231 commits)
[PATCH] i386: Don't delete cpu_devs data to identify different x86 types in late_initcall
[PATCH] i386: type may be unused
[PATCH] i386: Some additional chipset register values validation.
[PATCH] i386: Add missing !X86_PAE dependincy to the 2G/2G split.
[PATCH] x86-64: Don't exclude asm-offsets.c in Documentation/dontdiff
[PATCH] i386: avoid redundant preempt_disable in __unlazy_fpu
[PATCH] i386: white space fixes in i387.h
[PATCH] i386: Drop noisy e820 debugging printks
[PATCH] x86-64: Fix allnoconfig error in genapic_flat.c
[PATCH] x86-64: Shut up warnings for vfat compat ioctls on other file systems
[PATCH] x86-64: Share identical video.S between i386 and x86-64
[PATCH] x86-64: Remove CONFIG_REORDER
[PATCH] x86-64: Print type and size correctly for unknown compat ioctls
[PATCH] i386: Remove copy_*_user BUG_ONs for (size < 0)
[PATCH] i386: Little cleanups in smpboot.c
[PATCH] x86-64: Don't enable NUMA for a single node in K8 NUMA scanning
[PATCH] x86: Use RDTSCP for synchronous get_cycles if possible
[PATCH] i386: Add X86_FEATURE_RDTSCP
[PATCH] i386: Implement X86_FEATURE_SYNC_RDTSC on i386
[PATCH] i386: Implement alternative_io for i386
...
Fix up trivial conflict in include/linux/highmem.h manually.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (59 commits)
PCI: Free resource files in error path of pci_create_sysfs_dev_files()
pci-quirks: disable MSI on RS400-200 and RS480
PCI hotplug: Use menuconfig objects
PCI: ZT5550 CPCI Hotplug driver fix
PCI: rpaphp: Remove semaphores
PCI: rpaphp: Ensure more pcibios_add/pcibios_remove symmetry
PCI: rpaphp: Use pcibios_remove_pci_devices() symmetrically
PCI: rpaphp: Document is_php_dn()
PCI: rpaphp: Document find_php_slot()
PCI: rpaphp: Rename rpaphp_register_pci_slot() to rpaphp_enable_slot()
PCI: rpaphp: refactor tail call to rpaphp_register_slot()
PCI: rpaphp: remove rpaphp_set_attention_status()
PCI: rpaphp: remove print_slot_pci_funcs()
PCI: rpaphp: Remove setup_pci_slot()
PCI: rpaphp: remove a call that does nothing but a pointer lookup
PCI: rpaphp: Remove another wrappered function
PCI: rpaphp: Remve another call that is a wrapper
PCI: rpaphp: remove a function that does nothing but wrap debug printks
PCI: rpaphp: Remove un-needed goto
PCI: rpaphp: Fix a memleak; slot->location string was never freed
...
set_irq_msi() currently connects an irq_desc to an msi_desc. The archs call
it at some point in their setup routine, and then the generic code sets up the
reverse mapping from the msi_desc back to the irq.
set_irq_msi() should do both connections, making it the one and only call
required to connect an irq with it's MSI desc and vice versa.
The arch code MUST call set_irq_msi(), and it must do so only once it's sure
it's not going to fail the irq allocation.
Given that there's no need for the arch to return the irq anymore, the return
value from the arch setup routine just becomes 0 for success and anything else
for failure.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix:
In file included from include2/asm/apic.h:5,
from include2/asm/smp.h:15,
from linux/arch/x86_64/kernel/genapic_flat.c:18:
linux/include/linux/pm.h: In function ‘call_platform_enable_wakeup’:
linux/include/linux/pm.h:331: error: ‘EIO’ undeclared (first use in this function)
linux/include/linux/pm.h:331: error: (Each undeclared identifier is reported only once
linux/include/linux/pm.h:331: error: for each function it appears in.)
Signed-off-by: Andi Kleen <ak@suse.de>
The option never worked well and functionlist wasn't well maintained.
Also it made the build very slow on many binutils version.
So just remove it.
Cc: arjan@linux.intel.com
Signed-off-by: Andi Kleen <ak@suse.de>
- Use 64bit TSC calculations to avoid handling overflow
- Use 32bit unsigned arithmetic for the APIC timer. This
way overflows are handled correctly.
- Fix exit check of loop to account for apic timer counting down
Signed-off-by: dpreed@reed.com
Signed-off-by: Andi Kleen <ak@suse.de>
Background:
We've found that MCEs (specifically DRAM SBEs) tend to come in bunches,
especially when we are trying really hard to stress the system out. The
current MCE poller uses a static interval which does not care whether it
has or has not found MCEs recently.
Description:
This patch makes the MCE poller adjust the polling interval dynamically.
If we find an MCE, poll 2x faster (down to 10 ms). When we stop finding
MCEs, poll 2x slower (up to check_interval seconds). The check_interval
tunable becomes the max polling interval. The "Machine check events
logged" printk() is rate limited to the check_interval, which should be
identical behavior to the old functionality.
Result:
If you start to take a lot of correctable errors (not exceptions), you
log them faster and more accurately (less chance of overflowing the MCA
registers). If you don't take a lot of errors, you will see no change.
Alternatives:
I considered simply reducing the polling interval to 10 ms immediately
and keeping it there as long as we continue to find errors. This felt a
bit heavy handed, but does perform significantly better for the default
check_interval of 5 minutes (we're using a few seconds when testing for
DRAM errors). I could be convinced to go with this, if anyone felt it
was not too aggressive.
Testing:
I used an error-injecting DIMM to create lots of correctable DRAM errors
and verified that the polling interval accelerates. The printk() only
happens once per check_interval seconds.
Patch:
This patch is against 2.6.21-rc7.
Signed-Off-By: Tim Hockin <thockin@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
WARNING: arch/x86_64/kernel/built-in.o - Section mismatch: reference to .init.data:cpu_llc_id from __ksymtab between '__ksymtab_cpu_llc_id' (at offset 0x4a0) and '__ksymtab_smp_num_siblings'
It is strange to export a __cpuinitdata symbols to modules, and no module
appears to use it anyway.
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
We apparently hit the 1024 limit of vsyscall_0 zone when some debugging
options are set, or if __vsyscall_gtod_data is 64 bytes larger.
In order to save 128 bytes from the vsyscall_0 zone, we move __vgetcpu_mode
& __jiffies to vsyscall_2 zone where they really belong, since they are
used only from vgetcpu() (which is in this vsyscall_2 area).
After patch is applied, new layout is :
ffffffffff600000 T vgettimeofday
ffffffffff60004e t vsysc2
ffffffffff600140 t vread_hpet
ffffffffff600150 t vread_tsc
ffffffffff600180 D __vsyscall_gtod_data
ffffffffff600400 T vtime
ffffffffff600413 t vsysc1
ffffffffff600800 T vgetcpu
ffffffffff600870 D __vgetcpu_mode
ffffffffff600880 D __jiffies
ffffffffff600c00 T venosys_1
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-05-02 19:27:18 +02:00
Fernando Luis [** ISO-8859-1 charset **] VzquezCao
Implement __send_IPI_dest_field which can be used to send IPIs when the
"destination shorthand" field of the ICR is set to 00 (destination
field). Use it whenever possible.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>
inquire_remote_apic is used for APIC debugging, so use
safe_apic_wait_icr_idle instead of apic_wait_icr_idle to avoid possible
lockups when APIC delivery fails.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>
The functionality provided by the new safe_apic_wait_icr_idle is being
open-coded all over "kernel/smpboot.c". Use safe_apic_wait_icr_idle
instead to consolidate code and ease maintenance.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>
apic_wait_icr_idle looks like this:
static __inline__ void apic_wait_icr_idle(void)
{
while (apic_read(APIC_ICR) & APIC_ICR_BUSY)
cpu_relax();
}
The busy loop in this function would not be problematic if the
corresponding status bit in the ICR were always updated, but that does
not seem to be the case under certain crash scenarios. Kdump uses an IPI
to stop the other CPUs in the event of a crash, but when any of the
other CPUs are locked-up inside the NMI handler the CPU that sends the
IPI will end up looping forever in the ICR check, effectively
hard-locking the whole system.
Quoting from Intel's "MultiProcessor Specification" (Version 1.4), B-3:
"A local APIC unit indicates successful dispatch of an IPI by
resetting the Delivery Status bit in the Interrupt Command
Register (ICR). The operating system polls the delivery status
bit after sending an INIT or STARTUP IPI until the command has
been dispatched.
A period of 20 microseconds should be sufficient for IPI dispatch
to complete under normal operating conditions. If the IPI is not
successfully dispatched, the operating system can abort the
command. Alternatively, the operating system can retry the IPI by
writing the lower 32-bit double word of the ICR. This “time-out”
mechanism can be implemented through an external interrupt, if
interrupts are enabled on the processor, or through execution of
an instruction or time-stamp counter spin loop."
Intel's documentation suggests the implementation of a time-out
mechanism, which, by the way, is already being open-coded in some parts
of the kernel that tinker with ICR.
Create a apic_wait_icr_idle replacement that implements the time-out
mechanism and that can be used to solve the aforementioned problem.
AK: moved both functions out of line
AK: Added improved loop from Keith Owens
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>