* 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu, x86: Add arch-specific this_cpu_cmpxchg_double() support
percpu: Generic support for this_cpu_cmpxchg_double()
alpha: use L1_CACHE_BYTES for cacheline size in the linker script
percpu: align percpu readmostly subsection to cacheline
Fix up trivial conflict in arch/x86/kernel/vmlinux.lds.S due to the
percpu alignment having changed ("x86: Reduce back the alignment of the
per-CPU data section")
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (62 commits)
posix-clocks: Check write permissions in posix syscalls
hrtimer: Remove empty hrtimer_init_hres_timer()
hrtimer: Update hrtimer->state documentation
hrtimer: Update base[CLOCK_BOOTTIME].offset correctly
timers: Export CLOCK_BOOTTIME via the posix timers interface
timers: Add CLOCK_BOOTTIME hrtimer base
time: Extend get_xtime_and_monotonic_offset() to also return sleep
time: Introduce get_monotonic_boottime and ktime_get_boottime
hrtimers: extend hrtimer base code to handle more then 2 clockids
ntp: Remove redundant and incorrect parameter check
mn10300: Switch do_timer() to xtimer_update()
posix clocks: Introduce dynamic clocks
posix-timers: Cleanup namespace
posix-timers: Add support for fd based clocks
x86: Add clock_adjtime for x86
posix-timers: Introduce a syscall for clock tuning.
time: Splitout compat timex accessors
ntp: Add ADJ_SETOFFSET mode bit
time: Introduce timekeeping_inject_offset
posix-timer: Update comment
...
Fix up new system-call-related conflicts in
arch/x86/ia32/ia32entry.S
arch/x86/include/asm/unistd_32.h
arch/x86/include/asm/unistd_64.h
arch/x86/kernel/syscall_table_32.S
(name_to_handle_at()/open_by_handle_at() vs clock_adjtime()), and some
due to movement of get_jiffies_64() in:
kernel/time.c
New helpers: user_statfs() and fd_statfs(), taking userland pathname and
descriptor resp. and filling struct kstatfs. Syscalls of statfs family
(native, compat and foreign - osf and hpux on alpha and parisc resp.)
switched to those. Removes some boilerplate code, simplifies cleanup
on errors...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The irq descriptors are initialized IRQ_DISABLED in the generic
code. No need to fiddle with them.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Matt Turner <mattst88@gmail.com>
xtime_update() takes the xtime_lock itself.
timer_interrupt() is only called on the boot cpu. See do_entInt(). So
"state" in timer_interrupt does not require protection by xtime_lock.
Signed-off-by: Torben Hohn <torbenh@gmx.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: johnstul@us.ibm.com
Cc: hch@infradead.org
Cc: yong.zhang0@gmail.com
LKML-Reference: <20110127145915.23248.20919.stgit@localhost>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Currently the linker script uses 64 for cacheline size which isn't
optimal for all cases. Include asm/cache.h and use L1_CACHE_BYTES
instead as suggested by Sam Ravnborg.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Currently percpu readmostly subsection may share cachelines with other
percpu subsections which may result in unnecessary cacheline bounce
and performance degradation.
This patch adds @cacheline parameter to PERCPU() and PERCPU_VADDR()
linker macros, makes each arch linker scripts specify its cacheline
size and use it to align percpu subsections.
This is based on Shaohua's x86 only patch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Shaohua Li <shaohua.li@intel.com>
Interrupts ought to be disabled _before_ irq_enter().
Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Matt Turner <mattst88@monolith.freenet-rz.de>
Remove the leftover from the commit 14e2acd868 ("select:
fix alpha OSF wrapper").
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Good riddance... Nuke a pile of redundant handlers that the
generic code takes care of as well.
Tested-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Stop touching irq_desc[irq] directly, instead use accessor
functions provided. Use irq_has_action instead of directly
testing the irq_desc.
Tested-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Also kill superfluous IRQ_DISABLED initialization, since that's the
default state of the irq_desc[i].status field.
Tested-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Occasionally the system gets into a state where the CMOS clock has gotten
slightly ahead of current time and the periodic update of RTC fails. The
message is a nuisance and repeats spamming the log.
See: http://www.ntp.org/ntpfaq/NTP-s-trbl-spec.htm#Q-LINUX-SET-RTC-MMSS
Rather than just removing the message, make it show only once and reduce
severity since it indicates a normal and non urgent condition.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Extend the perf_pmu_register() interface to allow for named and
dynamic pmu types.
Because we need to support the existing static types we cannot use
dynamic types for everything, hence provide a type argument.
If we want to enumerate the PMUs they need a name, provide one.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101117222056.259707703@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The perf hardware pmu got initialized at various points in the boot,
some before early_initcall() some after (notably arch_initcall).
The problem is that the NMI lockup detector is ran from early_initcall()
and expects the hardware pmu to be present.
Sanitize this by moving all architecture hardware pmu implementations to
initialize at early_initcall() and move the lockup detector to an explicit
initcall right after that.
Cc: paulus <paulus@samba.org>
Cc: davem <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1290707759.2145.119.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
dma_addr_t is always 64 bit on alpha. So let's use dma_addr_t instead.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix up the arguments to arch_ptrace() to take account of the fact that
@addr and @data are now unsigned long rather than long as of a preceding
patch in this series.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: <linux-arch@vger.kernel.org>
Acked-by: Roland McGrath <roland@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
T2 are the only alpha SMP systems that do HAE switching at runtime, which
is fundamentally racy on SMP. This patch limits MMIO space on T2 to HAE0
only, like we did on MCPCIA (rawhide) long ago. This leaves us with only
112 Mb of PCI MMIO (128 Mb HAE aperture minus 16 Mb reserved for EISA),
but since linux PCI allocations are reasonably tight, it should be enough
for sane hardware configurations.
Also, fix a typo in MCPCIA_FROB_MMIO macro which shouldn't call set_hae()
if MCPCIA_ONE_HAE_WINDOW is defined. It's more for correctness, as
set_hae() is a no-op anyway in that case.
Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Provide a mechanism that allows running code in IRQ context. It is
most useful for NMI code that needs to interact with the rest of the
system -- like wakeup a task to drain buffers.
Perf currently has such a mechanism, so extract that and provide it as
a generic feature, independent of perf so that others may also
benefit.
The IRQ context callback is generated through self-IPIs where
possible, or on architectures like powerpc the decrementer (the
built-in timer facility) is set to generate an interrupt immediately.
Architectures that don't have anything like this get to do with a
callback from the timer tick. These architectures can call
irq_work_run() at the tail of any IRQ handlers that might enqueue such
work (like the perf IRQ handler) to avoid undue latencies in
processing the work.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[ various fixes ]
Signed-off-by: Huang Ying <ying.huang@intel.com>
LKML-Reference: <1287036094.7768.291.camel@yhuang-dev>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Commit c52c2ddc1d ("alpha: switch osf_sigprocmask() to use of
sigprocmask()") had several problems. The more obvious compile issues
got fixed in commit 0f44fbd297 ("alpha: fix compile problem in
arch/alpha/kernel/signal.c"), but it also caused a regression.
Since _BLOCKABLE is already the set of signals that can be blocked, the
code should do "newmask & _BLOCKABLE" rather than inverting _BLOCKABLE
before masking.
Reported-by: Michael Cree <mcree@orcon.net.nz>
Patch-by: Al Viro <viro@zeniv.linux.org.uk>
Patch-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tssk. Apparently Al hadn't checked commit c52c2ddc1d ("alpha: switch
osf_sigprocmask() to use of sigprocmask()") at all. It doesn't compile.
Fixed as per suggestions from Michael Cree.
Reported-by: Michael Cree <mcree@orcon.net.nz>
Cc: Al Viro <viro@ftp.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
get rid of a useless wrapper, while we are at it
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
rdusp() gives us the right value only for the current thread...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We want interrupts disabled on all paths leading to RESTORE_ALL;
otherwise, we are risking an IRQ coming between the updates of
alpha_mv->hae_cache and *alpha_mv->hae_register and set_hae()
within the IRQ getting badly confused.
RESTORE_ALL used to play with disabling IRQ itself, but that got
removed back in 2002, without making sure we had them disabled
on all paths. It's cheaper to make sure we have them disabled than
to revert to original variant...
Remove the detritus left from that commit back in 2002; we used to
need a reload of $0 and $1 since swpipl would change those, but
doing that had become pointless when we stopped doing swpipl in
there...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Unlike the other targets, alpha sets _one_ sigframe and
buggers off until the next syscall/interrupt, even if
more signals are pending. It leads to quite a few unpleasant
inconsistencies, starting with SIGSEGV potentially arriving
not where it should and including e.g. mess with sigsuspend();
consider two pending signals blocked until sigsuspend()
unblocks them. We pick the first one; then, if we are hit
by interrupt while in the handler, we process the second one
as well. If we are not, and if no syscalls had been made,
we get out of the first handler and leave the second signal
pending; normally sigreturn() would've picked it anyway, but
here it starts with restoring the original mask and voila -
the second signal is blocked again. On everything else we
get both delivered consistently.
It's actually easy to fix; the only thing to watch out for
is prevention of double syscall restart. Fortunately, the
idea I've nicked from arm fix by rmk works just fine...
Testcase demonstrating the behaviour in question; on alpha
we get one or both flags set (usually one), on everything
else both are always set.
#include <signal.h>
#include <stdio.h>
int had1, had2;
void f1(int sig) { had1 = 1; }
void f2(int sig) { had2 = 1; }
main()
{
sigset_t set1, set2;
sigemptyset(&set1);
sigemptyset(&set2);
sigaddset(&set2, 1);
sigaddset(&set2, 2);
signal(1, f1);
signal(2, f2);
sigprocmask(SIG_SETMASK, &set2, NULL);
raise(1);
raise(2);
sigsuspend(&set1);
printf("had1:%d had2:%d\n", had1, had2);
}
Tested-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matt Turner <mattst88@gmail.com>
The way sigreturn() is implemented on alpha breaks PTRACE_SYSCALL,
all way back to 1.3.95 when alpha has grown PTRACE_SYSCALL support.
What happens is direct return to ret_from_syscall, in order to bypass
mangling of a3 (error indicator) and prevent other mutilations of
registers (e.g. by syscall restart). That's fine, but... the entire
TIF_SYSCALL_TRACE codepath is kept separate on alpha and post-syscall
stopping/notifying the tracer is after the syscall. And the normal
path we are forcibly switching to doesn't have it.
So we end up with *one* stop in traced sigreturn() vs. two in other
syscalls. And yes, strace is visibly broken by that; try to strace
the following
#include <signal.h>
#include <stdio.h>
void f(int sig) {}
main()
{
signal(SIGHUP, f);
raise(SIGHUP);
write(1, "eeeek\n", 6);
}
and watch the show. The
close(1) = 405
in the end of strace output is coming from return value of write() (6 ==
__NR_close on alpha) and syscall number of exit_group() (__NR_exit_group ==
405 there).
The fix is fairly simple - the only thing we end up missing is the call
of syscall_trace() and we can tell whether we'd been called from the
SYSCALL_TRACE path by checking ra value. Since we are setting the
switch_stack up (that's what sys_sigreturn() does), we have the right
environment for calling syscall_trace() - just before we call
undo_switch_stack() and return. Since undo_switch_stack() will overwrite
s0 anyway, we can use it to store the result of "has it been called from
SYSCALL_TRACE path?" check. The same thing applies in rt_sigreturn().
Tested-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Old code used to set regs->r0 and regs->r19 to force the right
return value. Leaving that after switch to ERESTARTNOHAND
was a Bad Idea(tm), since now that screws the restart - if we
hit the case when get_signal_to_deliver() returns 0, we will
step back to syscall insn, with v0 set to EINTR and a3 to 1.
The latter won't matter, since EINTR is 4, aka __NR_write.
Testcase:
#include <signal.h>
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
main()
{
sigset_t mask;
sigemptyset(&mask);
sigaddset(&mask, SIGCONT);
sigprocmask(SIG_SETMASK, &mask, NULL);
kill(0, SIGCONT);
syscall(__NR_sigsuspend, 1, "b0rken\n", 7);
}
results on alpha in immediate message to stdout...
Fix is obvious; moreover, since we don't need regs anymore, we can
switch to normal prototypes for these guys and lose the wrappers.
Even better, rt_sigsuspend() is identical to generic version in
kernel/signal.c now.
Tested-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matt Turner <mattst88@gmail.com>
same thing as had been done on other targets back in 2003 -
move setting ->restart_block.fn into {rt_,}sigreturn().
Tested-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matt Turner <mattst88@gmail.com>