Do not trace arch_local_save_flags(), arch_local_irq_*() and friends.
Although they are marked inline, gcc may still make a function out of
them and add it to the pool of functions that are traced by the function
tracer. This can cause undesirable results (kernel panic, triple faults,
etc).
Add the notrace notation to prevent them from ever being traced.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a NODE level to the generic cache events which is used to measure
local vs remote memory accesses. Like all other cache events, an
ACCESS is HIT+MISS, if there is no way to distinguish between reads
and writes do reads only etc..
The below needs filling out for !x86 (which I filled out with
unsupported events).
I'm fairly sure ARM can leave it like that since it doesn't strike me as
an architecture that even has NUMA support. SH might have something since
it does appear to have some NUMA bits.
Sparc64, PowerPC and MIPS certainly want a good look there since they
clearly are NUMA capable.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: David Miller <davem@davemloft.net>
Cc: Anton Blanchard <anton@samba.org>
Cc: David Daney <ddaney@caviumnetworks.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1303508226.4865.8.camel@laptop
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The nmi parameter indicated if we could do wakeups from the current
context, if not, we would set some state and self-IPI and let the
resulting interrupt do the wakeup.
For the various event classes:
- hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from
the PMI-tail (ARM etc.)
- tracepoint: nmi=0; since tracepoint could be from NMI context.
- software: nmi=[0,1]; some, like the schedule thing cannot
perform wakeups, and hence need 0.
As one can see, there is very little nmi=1 usage, and the down-side of
not using it is that on some platforms some software events can have a
jiffy delay in wakeup (when arch_irq_work_raise isn't implemented).
The up-side however is that we can remove the nmi parameter and save a
bunch of conditionals in fast paths.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Michael Cree <mcree@orcon.net.nz>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
commit 21a3c96 uses node_start/end_pfn(nid) for detection start/end
of nodes. But, it's not defined in linux/mmzone.h but defined in
/arch/???/include/mmzone.h which is included only under
CONFIG_NEED_MULTIPLE_NODES=y.
Then, we see
mm/page_cgroup.c: In function 'page_cgroup_init':
mm/page_cgroup.c:308: error: implicit declaration of function 'node_start_pfn'
mm/page_cgroup.c:309: error: implicit declaration of function 'node_end_pfn'
So, fixiing page_cgroup.c is an idea...
But node_start_pfn()/node_end_pfn() is a very generic macro and
should be implemented in the same manner for all archs.
(m32r has different implementation...)
This patch removes definitions of node_start/end_pfn() in each archs
and defines a unified one in linux/mmzone.h. It's not under
CONFIG_NEED_MULTIPLE_NODES, now.
A result of macro expansion is here (mm/page_cgroup.c)
for !NUMA
start_pfn = ((&contig_page_data)->node_start_pfn);
end_pfn = ({ pg_data_t *__pgdat = (&contig_page_data); __pgdat->node_start_pfn + __pgdat->node_spanned_pages;});
for NUMA (x86-64)
start_pfn = ((node_data[nid])->node_start_pfn);
end_pfn = ({ pg_data_t *__pgdat = (node_data[nid]); __pgdat->node_start_pfn + __pgdat->node_spanned_pages;});
Changelog:
- fixed to avoid using "nid" twice in node_end_pfn() macro.
Reported-and-acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Reported-and-tested-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Several fixes as well where the +1 was missing.
Done via coccinelle scripts like:
@@
struct resource *ptr;
@@
- ptr->end - ptr->start + 1
+ resource_size(ptr)
and some grep and typing.
Mostly uncompiled, no cross-compilers.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
During converting per-cpu ticker to genirq layer some
IRQ initialization code was removed by commit
2cf9530420 ("sparc32,leon:
per-cpu ticker use genirq per-cpu handler").
This patch reintroduces the code at the same place it was
removed from. IRQ12 - IRQ14 will crash on LEON SMP without
this patch because it will run the SUN4M IRQ trap handler.
Reported-by: Jan Andersson <jan@gaisler.com>
Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Three new IPIs were introduced by commit
ecbc42b70a ("sparc32, sun4m:
Implemented SMP IPIs support for SUN4M machines"), the
old handler was already prepared for IPIs but handled only
IRQ14 and IRQ13, this patch adds support for the new IPI at
IRQ12.
The IPI trap handler looks at the mask rather than the
pending IRQ/IPI, this bug may have masked the problem
above, introduced by the same commit.
Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All archs do more or less the same thing now, move it into
a single generic place.
I chose pci.h rather than of_pci.h to avoid having to change
all call-sites to include the later.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Michal Simek <monstr@monstr.eu>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
powerpc has two different ways of matching PCI devices to their
corresponding OF node (if any) for historical reasons. The ppc64 one
does a scan looking for matching bus/dev/fn, while the ppc32 one does a
scan looking only for matching dev/fn on each level in order to be
agnostic to busses being renumbered (which Linux does on some
platforms).
This removes both and instead moves the matching code to the PCI core
itself. It's the most logical place to do it: when a pci_dev is created,
we know the parent and thus can do a single level scan for the matching
device_node (if any).
The benefit is that all archs now get the matching for free. There's one
hook the arch might want to provide to match a PHB bus to its device
node. A default weak implementation is provided that looks for the
parent device device node, but it's not entirely reliable on powerpc for
various reasons so powerpc provides its own.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Michal Simek <monstr@monstr.eu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Semicolons are not necessary after switch/while/for/if braces
so remove them.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some devices that can generate interrupts are connected directly to the
CPU through the bootbus on sun4d. This patch allows IRQs to be allocated
for such devices. The information used for allocating interrupts for
sbus devices are present at the corresponding SBI node. For bootbus
devices this information is present in the bootbus node.
Signed-off-by: Kjetil Oftedal <oftedal@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During the introduction of genirq on sparc32 bugs were introduced in
the interrupt handler for sun4d. The interrupts handler checks the status
of the various sbus interfaces in the system and generates a virtual
interrupt, based upon the location of the interrupt source. This lookup
was broken by restructuring the code in such a way that index and shift
operations were performed prior to comparing this against the values
read from the interrupt controllers.
This could cause the handler to loop eternally as the interrupt source
could be skipped before any check was performed. Additionally
sun4d_encode_irq performs shifting internally, so it should not be performed
twice.
In sun4d_unmask interrupts were not correctly acknowledged, as the
corresponding bit it the interrupt mask was not actually cleared.
Signed-off-by: Kjetil Oftedal <oftedal@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sun4d_build_device_irq was called without a valid platform_device when
the system timer was initialized on sun4d systems. This caused a NULL
pointer crash.
Josip Rodin suggested that the current sun4d_build_device_irq should be
split into two functions. So that the timer initialization could skip
the slot and sbus interface detection code in sun4d_build_device_irq, as
this does not make sence due to the timer interrupts not being generated
from a device located on sbus.
Signed-off-by: Kjetil Oftedal <oftedal@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Config option GENERIC_HARDIRQS_NO_DEPRECATED was removed in commit
78c8982564 ("genirq: Remove the now obsolete
config options and select statements"), but the select was accidentally
reintroduced in commit 6baa9b20a6
("sparc32: genirq support")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The DMA region must be accessible in order for PCI peripheral
drivers to work, the sparc32 has DMA in the normal memory
zone which requires the GRPCI2 to PCI target BARs so that all
kernel low mem (192MB) can be mapped 1:1 to PCI address
space. The GRPCI2 has resizeable target BARs, by default the
first is made 256MB and all other BARs are disabled.
I/O space are always located on 0x1000-0x10000, but accessed
through the GRPCI2 PCI I/O Window memory mapped to virtual
address space.
Configuration space is accessed through the 64KB GRPCI2 PCI
CFG Window using LDA bypassing the MMU.
The GRPCI2 has a single PCI Window for prefetchable and non-
prefetchable address space, it is up to the AHB master
requesting PCI data to determine access type. Memory space
is mapped 1:1.
The GRPCI2 core can be configured in 4 different IRQ modes,
where PCI Interrupt, Error Interrupt and DMA Interrupt are
shared on a single IRQ line or at most 5 IRQs are used. The
GRPCI2 can mask/unmask PCI interrupts, Err and DMA in the control
and check status bits which tells us which IRQ really happended.
The GENIRQ layer is used to unmask/mask each individual IRQ
source by creating virtual IRQs and implementing a IRQ chip.
The optional DMA functionality of the GRPCI2 is not supported
by this patch.
Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The LEON architecture does not have a BIOS or bootloader that
initializes PCI for us, instead Linux generic PCI layer is used
to set up resources and IRQ.
Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
32bit and 64bit on x86 are tested and working. The rest I have looked
at closely and I can't find any problems.
setns is an easy system call to wire up. It just takes two ints so I
don't expect any weird architecture porting problems.
While doing this I have noticed that we have some architectures that are
very slow to get new system calls. cris seems to be the slowest where
the last system calls wired up were preadv and pwritev. avr32 is weird
in that recvmmsg was wired up but never declared in unistd.h. frv is
behind with perf_event_open being the last syscall wired up. On h8300
the last system call wired up was epoll_wait. On m32r the last system
call wired up was fallocate. mn10300 has recvmmsg as the last system
call wired up. The rest seem to at least have syncfs wired up which was
new in the 2.6.39.
v2: Most of the architecture support added by Daniel Lezcano <dlezcano@fr.ibm.com>
v3: ported to v2.6.36-rc4 by: Eric W. Biederman <ebiederm@xmission.com>
v4: Moved wiring up of the system call to another patch
v5: ported to v2.6.39-rc6
v6: rebased onto parisc-next and net-next to avoid syscall conflicts.
v7: ported to Linus's latest post 2.6.39 tree.
> arch/blackfin/include/asm/unistd.h | 3 ++-
> arch/blackfin/mach-common/entry.S | 1 +
Acked-by: Mike Frysinger <vapier@gentoo.org>
Oh - ia64 wiring looks good.
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
By the previous style change, CONFIG_GENERIC_FIND_NEXT_BIT,
CONFIG_GENERIC_FIND_BIT_LE, and CONFIG_GENERIC_FIND_LAST_BIT are not used
to test for existence of find bitops anymore.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Most arches define CONFIG_DEBUG_STACK_USAGE exactly the same way. Move it
to lib/Kconfig.debug so each arch doesn't have to define it. This
obviously makes the option generic, but that's fine because the config is
already used in generic code.
It's not obvious to me that sysrq-P actually does anything caution by
keeping the most inclusive wording.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Richard Weinberger <richard@nod.at>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fold all the mmu_gather rework patches into one for submission
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reported-by: Hugh Dickins <hughd@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rework the sparc mmu_gather usage to conform to the new world order :-)
Sparc mmu_gather does two things:
- tracks vaddrs to unhash
- tracks pages to free
Split these two things like powerpc has done and keep the vaddrs
in per-cpu data structures and flush them on context switch.
The remaining bits can then use the generic mmu_gather.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: David Miller <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Architectures that implement their own show_mem() function did not pass
the filter argument to show_free_areas() to appropriately avoid emitting
the state of nodes that are disallowed in the current context. This patch
now passes the filter argument to show_free_areas() so those nodes are now
avoided.
This patch also removes the show_free_areas() wrapper around
__show_free_areas() and converts existing callers to pass an empty filter.
ia64 emits additional information for each node, so skip_free_areas_zone()
must be made global to filter disallowed nodes and it is converted to use
a nid argument rather than a zone for this use case.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Helge Deller <deller@gmx.de>
Cc: James Bottomley <jejb@parisc-linux.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu: Unify input section names
percpu: Avoid extra NOP in percpu_cmpxchg16b_double
percpu: Cast away printk format warning
percpu: Always align percpu output section to PAGE_SIZE
Fix up fairly trivial conflict in arch/x86/include/asm/percpu.h as per Tejun
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next-2.6: (28 commits)
sparc32: fix build, fix missing cpu_relax declaration
SCHED_TTWU_QUEUE is not longer needed since sparc32 now implements IPI
sparc32,leon: Remove unnecessary page_address calls in LEON DMA API.
sparc: convert old cpumask API into new one
sparc32, sun4d: Implemented SMP IPIs support for SUN4D machines
sparc32, sun4m: Implemented SMP IPIs support for SUN4M machines
sparc32,leon: Implemented SMP IPIs for LEON CPU
sparc32: implement SMP IPIs using the generic functions
sparc32,leon: SMP power down implementation
sparc32,leon: added some SMP comments
sparc: add {read,write}*_be routines
sparc32,leon: don't rely on bootloader to mask IRQs
sparc32,leon: operate on boot-cpu IRQ controller registers
sparc32: always define boot_cpu_id
sparc32: removed unused code, implemented by generic code
sparc32: avoid build warning at mm/percpu.c:1647
sparc32: always register a PROM based early console
sparc32: probe for cpu info only during startup
sparc: consolidate show_cpuinfo in cpu.c
sparc32,leon: implement genirq CPU affinity
...
Fix following sparc (32 bit) build error:
CC arch/sparc/kernel/asm-offsets.s
In file included from include/linux/seqlock.h:29:0,
from include/linux/time.h:8,
from include/linux/timex.h:56,
from include/linux/sched.h:57,
from arch/sparc/kernel/asm-offsets.c:13:
include/linux/spinlock.h: In function 'spin_unlock_wait':
include/linux/spinlock.h:360:2: error: implicit declaration of function 'cpu_relax'
Most likely caused by commit e66eed651f ("list: remove
prefetching from regular list iterators") due to include
changes.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1446 commits)
macvlan: fix panic if lowerdev in a bond
tg3: Add braces around 5906 workaround.
tg3: Fix NETIF_F_LOOPBACK error
macvlan: remove one synchronize_rcu() call
networking: NET_CLS_ROUTE4 depends on INET
irda: Fix error propagation in ircomm_lmp_connect_response()
irda: Kill set but unused variable 'bytes' in irlan_check_command_param()
irda: Kill set but unused variable 'clen' in ircomm_connect_indication()
rxrpc: Fix set but unused variable 'usage' in rxrpc_get_transport()
be2net: Kill set but unused variable 'req' in lancer_fw_download()
irda: Kill set but unused vars 'saddr' and 'daddr' in irlan_provider_connect_indication()
atl1c: atl1c_resume() is only used when CONFIG_PM_SLEEP is defined.
rxrpc: Fix set but unused variable 'usage' in rxrpc_get_peer().
rxrpc: Kill set but unused variable 'local' in rxrpc_UDP_error_handler()
rxrpc: Kill set but unused variable 'sp' in rxrpc_process_connection()
rxrpc: Kill set but unused variable 'sp' in rxrpc_rotate_tx_window()
pkt_sched: Kill set but unused variable 'protocol' in tc_classify()
isdn: capi: Use pr_debug() instead of ifdefs.
tg3: Update version to 3.119
tg3: Apply rx_discards fix to 5719/5720
...
Fix up trivial conflicts in arch/x86/Kconfig and net/mac80211/agg-tx.c
as per Davem.
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits)
sched: Fix and optimise calculation of the weight-inverse
sched: Avoid going ahead if ->cpus_allowed is not changed
sched, rt: Update rq clock when unthrottling of an otherwise idle CPU
sched: Remove unused parameters from sched_fork() and wake_up_new_task()
sched: Shorten the construction of the span cpu mask of sched domain
sched: Wrap the 'cfs_rq->nr_spread_over' field with CONFIG_SCHED_DEBUG
sched: Remove unused 'this_best_prio arg' from balance_tasks()
sched: Remove noop in alloc_rt_sched_group()
sched: Get rid of lock_depth
sched: Remove obsolete comment from scheduler_tick()
sched: Fix sched_domain iterations vs. RCU
sched: Next buddy hint on sleep and preempt path
sched: Make set_*_buddy() work on non-task entities
sched: Remove need_migrate_task()
sched: Move the second half of ttwu() to the remote cpu
sched: Restructure ttwu() some more
sched: Rename ttwu_post_activation() to ttwu_do_wakeup()
sched: Remove rq argument from ttwu_stat()
sched: Remove rq->lock from the first half of ttwu()
sched: Drop rq->lock from sched_exec()
...
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Fix rt_rq runtime leakage bug
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (107 commits)
perf stat: Add more cache-miss percentage printouts
perf stat: Add -d -d and -d -d -d options to show more CPU events
ftrace/kbuild: Add recordmcount files to force full build
ftrace: Add self-tests for multiple function trace users
ftrace: Modify ftrace_set_filter/notrace to take ops
ftrace: Allow dynamically allocated function tracers
ftrace: Implement separate user function filtering
ftrace: Free hash with call_rcu_sched()
ftrace: Have global_ops store the functions that are to be traced
ftrace: Add ops parameter to ftrace_startup/shutdown functions
ftrace: Add enabled_functions file
ftrace: Use counters to enable functions to trace
ftrace: Separate hash allocation and assignment
ftrace: Create a global_ops to hold the filter and notrace hashes
ftrace: Use hash instead for FTRACE_FL_FILTER
ftrace: Replace FTRACE_FL_NOTRACE flag with a hash of ignored functions
perf bench, x86: Add alternatives-asm.h wrapper
x86, 64-bit: Fix copy_[to/from]_user() checks for the userspace address limit
x86, mem: memset_64.S: Optimize memset by enhanced REP MOVSB/STOSB
x86, mem: memmove_64.S: Optimize memmove by enhanced REP MOVSB/STOSB
...
Commit b826291c, "drivercore/dt: add a match table pointer to struct
device" added an of_match pointer to struct device to cache the
of_match_table entry discovered at driver match time. This was unsafe
because matching is not an atomic operation with probing a driver. If
two or more drivers are attempted to be matched to a driver at the
same time, then the cached matching entry pointer could get
overwritten.
This patch reverts the of_match cache pointer and reworks all users to
call of_match_device() directly instead.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
The function mmu_inval_dma_area takes a virtual address as a parameter
which is problematic in case the buffer is located in highmem and the
mapping currently is unavailable.
Since the function was only implemented for LEON this patch removes
calls to it in non LEON code paths and renames it to dma_make_coherent
which instead takes a physical address (which for now is unused since we
flush the whole cache). This way it is possible to remove several unnecessary
calls to page_address which will fail if the virtual mapping is unavailable.
Signed-off-by: Kristoffer Glembo <kristoffer@gaisler.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adapt new API. Almost change is trivial, most important change are to
remove following like =operator.
cpumask_t cpu_mask = *mm_cpumask(mm);
cpus_allowed = current->cpus_allowed;
Because cpumask_var_t is =operator unsafe. These usage might prevent
kernel core improvement.
No functional change.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sun4d does not seem to have a distingstion between soft and hard
IRQs. When generating IPIs the generated IRQ looks like a hard IRQ,
this patch adds a "IPI check" in the sun4d irq trap handler at a
predefined IRQ number (SUN4D_IPI_IRQ). Before generating an IPI
a per-cpu memory structure is modified for the "IPI check" to
successfully detect a IPI request to a specific processor, the check
clears the IPI work requested.
All three IPIs (resched, single and cpu-mask) use the same IRQ
number.
The IPI IRQ should preferrably be on a separate IRQ and definitly
not shared with IRQ handlers requesting IRQ with IRQF_SHARED.
Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement the three IPIs (resched, single and cpu-mask) generation
and interrupt handler catch. The sun4m has 15 soft-IRQs and three
of them is used with this patch, the three IPIs was previously
implemented with the cross-call IRQ15 which does not work with
locking routines such as spinlocks because IRQ15 is NMI, it may
cause deadlock.
The IRQ trap handler code assumes (in the same spritit as the old
it seems) that hard interrupts will be generated until handled
(level), when a IRQ happens the IRQ pending register is checked
for pending soft-IRQs. When both hard and soft IRQ happens at the
same time only soft-IRQs are handled.
The old code implemented a soft-IRQ traphandler at IRQ14 which
called smp_reschedule_irq which in turn called set_need_resched.
It seems to be an old relic and is replaced with the interrupt
traphander exit code RESTORE_ALL, it calls schedule() when
appropriate.
Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch implements SMP IPIs on LEON using software generated
IRQs to signal between CPUs.
The IPI IRQ number is set by using the ipi_num property in the
device tree, or defaults to 13. LEON SMP systems should reserve
IRQ 13 (and IRQ 15) to Linux in order for the defaults to work.
Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current sparc32 SMP IPI generation is implemented the
cross call function. The cross call function uses IRQ15 the
NMI, this is has the effect that IPIs will interrupt IRQ
critical areas and hang the system. Typically on/after
spin_lock_irqsave calls can be aborted.
The cross call functionality must still exist to flush
cache/TLBS.
This patch provides CPU models a custom way to implement
generation of IPIs on the generic code's request. The
typical approach is to generate an IRQ for each IPI case.
After this patch each sparc32 SMP CPU model needs to
implement IPIs in order to function properly.
Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds {read,write}*_be big endian memory access
routines to the io.h header used on SPARC32 and SPARC64.
Tested on SPARC32 (LEON)
Signed-off-by: Jan Andersson <jan@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we are in the label cc_dword_align, registers %o0 and %o1 have the same last 2 bits,
but it's not guaranteed one of them is zero. So we can get unaligned memory access
in label ccte. Example of parameters which lead to this:
%o0=0x7ff183e9, %o1=0x8e709e7d, %g1=3
With the parameters I had a memory corruption, when the additional 5 bytes were rewritten.
This patch corrects the error.
One comment to the patch. We don't care about the third bit in %o1, because cc_end_cruft
stores word or less.
Signed-off-by: Tkhai Kirill <tkhai@yandex.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a multiple message send syscall and is the send
version of the existing recvmmsg syscall. This is heavily
based on the patch by Arnaldo that added recvmmsg.
I wrote a microbenchmark to test the performance gains of using
this new syscall:
http://ozlabs.org/~anton/junkcode/sendmmsg_test.c
The test was run on a ppc64 box with a 10 Gbit network card. The
benchmark can send both UDP and RAW ethernet packets.
64B UDP
batch pkts/sec
1 804570
2 872800 (+ 8 %)
4 916556 (+14 %)
8 939712 (+17 %)
16 952688 (+18 %)
32 956448 (+19 %)
64 964800 (+20 %)
64B raw socket
batch pkts/sec
1 1201449
2 1350028 (+12 %)
4 1461416 (+22 %)
8 1513080 (+26 %)
16 1541216 (+28 %)
32 1553440 (+29 %)
64 1557888 (+30 %)
We see a 20% improvement in throughput on UDP send and 30%
on raw socket send.
[ Add sparc syscall entries. -DaveM ]
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
include/linux/perf_event.h
Merge reason: pick up the latest jump-label enhancements, they are cooked ready.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* proper initialization of boot_cpu_id (no hardcoding to 0)
* use boot_cpu_id index to address into the IRQ controller where
appropriate
Each CPU has a separate set of IRQ controller registers, this
patch makes sure that the boot-cpu registers are used instead
of CPU0's.
Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Define boot_cpu_id in single-processor kernels as well. This is
to support architectures which can boot on other than CPU0.
Sam Ravnborg has written the cleanup parts by extracting
boot_cpu_id from smp_32.c into setup_32.c and cleaned up
sun4d_irq.c.
boot_cpu_id was initialized before BSS was cleared in
sun4c_continue_boot, instead boot_cpu_id is set to 0xff to
avoid BSS. If boot_cpu_id is untouched (0xff) by bootup code
it will be overwritten to 0. boot_cpu_id4 is automatically
calculated in common code.
Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sparcstation 5 I have available has no MID property for the CPU.
This resulted in a panic when booting a SMP kernel on this box.
The assigned field in cpu_data is never used, so if we fail
to read the MID property then inform user and continue booting.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix following warning:
mm/percpu.c: In function 'pcpu_embed_first_chunk':
mm/percpu.c:1647:3: warning: format '%lx' expects type 'long unsigned int', but argument 3 has type 'unsigned int'
Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
[sam: added warning message to changelog, use _AC()]
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do not require user to add "-p" to boot arguments to see
early info printed to prom console.
This is similar to the sparc64 functionality - which was added with:
3c62a2d347 ("[SPARC64]: Always register
a PROM based early console.")
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We did a cpu_probe() call each time a CPU got online - which
only effect was to save latest CPU/FPU info for use by show_cpuinfo().
Use same setup as for sparc64 where we probe for this info during startup,
and only once.
This allowed us to annotate a few functions __init which again
fixed the following section mismatch warnings:
WARNING: vmlinux.o(.text+0x65f0): Section mismatch in reference from the function set_cpu_and_fpu() to the (unknown reference) .init.rodata:(unknown)
WARNING: vmlinux.o(.text+0x65f8): Section mismatch in reference from the function set_cpu_and_fpu() to the (unknown reference) .init.rodata:(unknown)
WARNING: vmlinux.o(.text+0x664c): Section mismatch in reference from the function set_cpu_and_fpu() to the variable .init.rodata:manufacturer_info
WARNING: vmlinux.o(.text+0x6650): Section mismatch in reference from the function set_cpu_and_fpu() to the variable .init.rodata:manufacturer_info
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We have all the cpu related info in cpu.c - so move
the remaining functions to support /proc/cpuinfo to this file.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In all cases there were a struct of_device_id variable defined __initdata.
But it was referenced from struct platform_driver.of_match_table
which is not guaranteed to be used during init only.
So drop the __initdata annotation.
This fixes following warnings:
WARNING: arch/sparc/kernel/built-in.o(.data+0x810): Section mismatch in reference from the variable clock_driver to the variable .init.data:clock_match
The variable clock_driver references
the variable __initdata clock_match
If the reference is valid then annotate the
variable with __init* or __refdata (see linux/init.h) or name the variable:
*_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console
WARNING: arch/sparc/kernel/built-in.o(.data+0xcec): Section mismatch in reference from the variable apc_driver to the variable .init.data:apc_match
The variable apc_driver references
the variable __initdata apc_match
If the reference is valid then annotate the
variable with __init* or __refdata (see linux/init.h) or name the variable:
*_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console
WARNING: arch/sparc/kernel/built-in.o(.data+0xd60): Section mismatch in reference from the variable pmc_driver to the variable .init.data:pmc_match
The variable pmc_driver references
the variable __initdata pmc_match
If the reference is valid then annotate the
variable with __init* or __refdata (see linux/init.h) or name the variable:
*_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
A simple implementation of CPU affinity, the first CPU in
the affinity CPU mask always takes the IRQ.
Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cleaned up leon_init_timers() by removing unnecessary double checking
and one indentation level. Changed LEON_IMASK to LEON_IMASK(cpu).
Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The extended IRQ controller gives the LEON 16 more IRQs.
The patch installs a custom handler for the exetended controller
IRQ, where a register is read and the "real" IRQ causing IRQ is
determined.
Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The LEON interrupt controller has one single mask register for all
IRQs per CPU, even though the genirq layer protects us from accessing
the same IRQ at the same time other IRQs share the same mask register
and may thus interfere. Some other IRQ controllers has a mask register
or similar per IRQ instead which makes spinlocks unncessary.
Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The conversion of sparc32 to genirq is based on original work done
by David S. Miller.
Daniel Hellstrom has helped in the conversion and implemented
the shutdowm functionality.
Marcel van Nies <morcles@gmail.com> has tested this on Sparc Station 20
Test status:
sun4c - not tested
sun4m,pci - not tested
sun4m,sbus - tested (Sparc Classic, Sparc Station 5, Sparc Station 20)
sun4d - not tested
leon - tested on various combinations of leon boards,
including SMP variants
generic
Introduce use of GENERIC_HARDIRQS and GENERIC_IRQ_SHOW
Allocate 64 IRQs - which is enough even for SS2000
Use a table of irq_bucket to maintain uses IRQs
irq_bucket is also used to chain several irq's that
must be called when the same intrrupt is asserted
Use irq_link to link a interrupt source to the irq
All plafforms must now supply their own build_device_irq method
handler_irq rewriten to use generic irq support
floppy
Read FLOPPY_IRQ from platform device
Use generic request_irq to register the floppy interrupt
Rewrote sparc_floppy_irq to use the generic irq support
pcic:
Introduce irq_chip
Store mask in chip_data for use in mask/unmask functions
Add build_device_irq for pcic
Use pcic_build_device_irq in pci_time_init
allocate virtual irqs in pcic_fill_irq
sun4c:
Introduce irq_chip
Store mask in chip_data for use in mask/unmask functions
Add build_device_irq for sun4c
Use sun4c_build_device_irq in sun4c_init_timers
sun4m:
Introduce irq_chip
Introduce dedicated mask/unmask methods
Introduce sun4m_handler_data that allow easy access to necessary
data in the mask/unmask functions
Add a helper method to enable profile_timer (used from smp)
Added sun4m_build_device_irq
Use sun4m_build_device_irq in sun4m_init_timers
TODO:
There is no replacement for smp_rotate that always scheduled
next CPU as interrupt target upon an interrupt
sun4d:
Introduce irq_chip
Introduce dedicated mask/unmask methods
Introduce sun4d_handler_data that allow easy access to
necessary data in mask/unmask fuctions
Rewrote sun4d_handler_irq to use generic irq support
TODO:
The original implmentation of enable/disable had:
if (irq < NR_IRQS)
return;
The new implmentation does not distingush between SBUS and cpu
interrupts.
I am no sure what is right here. I assume we need to do
something for the cpu interrupts.
I have not succeeded booting my sun4d box (with or without this patch)
and my understanding of this platfrom is limited.
So I would be a bit suprised if this works.
leon:
Introduce irq_chip
Store mask in chip_data for use in mask/unmask functions
Add build_device_irq for leon
Use leon_build_device_irq in leon_init_timers
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Daniel Hellstrom <daniel@gaisler.com>
Tested-by: Daniel Hellstrom <daniel@gaisler.com>
Tested-by: Marcel van Nies <morcles@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the ifdeffery to a header file to make the logic more
obvious where we decide between PCI or SBUS init
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The new name reflects the actual usage much better.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
For future rework of try_to_wake_up() we'd like to push part of that
function onto the CPU the task is actually going to run on.
In order to do so we need a generic callback from the existing scheduler IPI.
This patch introduces such a generic callback: scheduler_ipi() and
implements it as a NOP.
BenH notes: PowerPC might use this IPI on offline CPUs under rare conditions!
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110405152728.744338123@chello.nl
This compile error triggers on Sparc64:
kernel/sched.c:7140: error: 'cpu_coregroup_mask' undeclared here (not in a function)
Because after the recent scheduler domain cleanups the scheduler
uses this arch method as a function pointer in a scheduler
topology data structure - which is not possible with a macro.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20110412140040.3020ef55.sfr@canb.auug.org.au
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Introduce:
static __always_inline bool static_branch(struct jump_label_key *key);
instead of the old JUMP_LABEL(key, label) macro.
In this way, jump labels become really easy to use:
Define:
struct jump_label_key jump_key;
Can be used as:
if (static_branch(&jump_key))
do unlikely code
enable/disale via:
jump_label_inc(&jump_key);
jump_label_dec(&jump_key);
that's it!
For the jump labels disabled case, the static_branch() becomes an
atomic_read(), and jump_label_inc()/dec() are simply atomic_inc(),
atomic_dec() operations. We show testing results for this change below.
Thanks to H. Peter Anvin for suggesting the 'static_branch()' construct.
Since we now require a 'struct jump_label_key *key', we can store a pointer into
the jump table addresses. In this way, we can enable/disable jump labels, in
basically constant time. This change allows us to completely remove the previous
hashtable scheme. Thanks to Peter Zijlstra for this re-write.
Testing:
I ran a series of 'tbench 20' runs 5 times (with reboots) for 3
configurations, where tracepoints were disabled.
jump label configured in
avg: 815.6
jump label *not* configured in (using atomic reads)
avg: 800.1
jump label *not* configured in (regular reads)
avg: 803.4
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20110316212947.GA8792@redhat.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Suggested-by: H. Peter Anvin <hpa@linux.intel.com>
Tested-by: David Daney <ddaney@caviumnetworks.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
sparc32: Pass task_struct to schedule_tail() in ret_from_fork
apbuart: Depend upon sparc.
sparc64: Fix section mis-match errors.
sparc32,leon: Fixed APBUART frequency detection
sparc32, leon: APBUART driver must use archdata to get IRQ number
sparc: Hook up syncfs system call.
We have to pass task_struct of previous process to function
schedule_tail(). Currently in ret_from_fork previous thread_info
is passed:
switch_to: mov %g6, %g3 /* previous thread_info in g6 */
ret_from_fork: call schedule_tail
mov %g3, %o0 /* previous thread_info is passed */
void schedule_tail(struct task_struct *prev);
Signed-off-by: Tkhai Kirill <tkhai@yandex.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix all of the problems spotted by CONFIG_DEBUG_SECTION_MISMATCH under
arch/sparc during a 64-bit defconfig build.
They fall into two categorites:
1) of_device_id is marked as __initdata, and we can never do this
since these objects sit in the device core data structures way
past boot. So even if a driver will never be reloaded, we have
to keep the device ID table around.
Mark such cases const instead.
2) The bootmem alloc/free handling code in mdesc.c was not fully
marked __init as it should be, thus generating a reference
to free_bootmem_late() (which is __init) from non-__init code.
Signed-off-by: David S. Miller <davem@davemloft.net>
Make use of the new features in genirq:
1) Set the chip flag IRCHIP_EOI_IF_HANDLED, which ensures in the
core code that irq_eoi() is only called when the interrupt was
handled. That removes the extra status check in the callback.
2) Use the preflow handler, which is called from the fasteoi core code
before the device handler. That avoids another status check and the
open coded handler redirection.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: sparclinux@vger.kernel.org
Commit ddd588b5dd ("oom: suppress nodes that are not allowed from
meminfo on oom kill") moved lib/show_mem.o out of lib/lib.a, which
resulted in build warnings on all architectures that implement their own
versions of show_mem():
lib/lib.a(show_mem.o): In function `show_mem':
show_mem.c:(.text+0x1f4): multiple definition of `show_mem'
arch/sparc/mm/built-in.o:(.text+0xd70): first defined here
The fix is to remove __show_mem() and add its argument to show_mem() in
all implementations to prevent this breakage.
Architectures that implement their own show_mem() actually don't do
anything with the argument yet, but they could be made to filter nodes
that aren't allowed in the current context in the future just like the
generic implementation.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: James Bottomley <James.Bottomley@hansenpartnership.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
During the preparation for testing the recent changes made to the SUN4D
specific code in the kernel by Sam Ravnborg the following was discovered:
Since the removal of of_platform_bus_type (commit: eca3930163 )
multiboard SUN4Ds have not been able to boot. The kernel crashes due to a
zero-pointer error encountered when registering multiple M48T59 RTCs
(There is one on each board).
A patch for the was previously submitted, but the problem was not a
serious at that time, as it would only generate warnings. Now the kernel
will crash and stop executing before the serial console has been started.
(Crash output can be viewed by using the -p boot flag)
Signed-off-by: Kjetil Oftedal <oftedal@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Percpu allocator honors alignment request upto PAGE_SIZE and both the
percpu addresses in the percpu address space and the translated kernel
addresses should be aligned accordingly. The calculation of the
former depends on the alignment of percpu output section in the kernel
image.
The linker script macros PERCPU_VADDR() and PERCPU() are used to
define this output section and the latter takes @align parameter.
Several architectures are using @align smaller than PAGE_SIZE breaking
percpu memory alignment.
This patch removes @align parameter from PERCPU(), renames it to
PERCPU_SECTION() and makes it always align to PAGE_SIZE. While at it,
add PCPU_SETUP_BUG_ON() checks such that alignment problems are
reliably detected and remove percpu alignment comment recently added
in workqueue.c as the condition would trigger BUG way before reaching
there.
For um, this patch raises the alignment of percpu area. As the area
is in .init, there shouldn't be any noticeable difference.
This problem was discovered by David Howells while debugging boot
failure on mn10300.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Cc: uclinux-dist-devel@blackfin.uclinux.org
Cc: David Howells <dhowells@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: user-mode-linux-devel@lists.sourceforge.net
There is no user now.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: David Miller <davem@davemloft.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
minix bit operations are only used by minix filesystem and useless by
other modules. Because byte order of inode and block bitmaps is different
on each architecture like below:
m68k:
big-endian 16bit indexed bitmaps
h8300, microblaze, s390, sparc, m68knommu:
big-endian 32 or 64bit indexed bitmaps
m32r, mips, sh, xtensa:
big-endian 32 or 64bit indexed bitmaps for big-endian mode
little-endian bitmaps for little-endian mode
Others:
little-endian bitmaps
In order to move minix bit operations from asm/bitops.h to architecture
independent code in minix filesystem, this provides two config options.
CONFIG_MINIX_FS_BIG_ENDIAN_16BIT_INDEXED is only selected by m68k.
CONFIG_MINIX_FS_NATIVE_ENDIAN is selected by the architectures which use
native byte order bitmaps (h8300, microblaze, s390, sparc, m68knommu,
m32r, mips, sh, xtensa). The architectures which always use little-endian
bitmaps do not select these options.
Finally, we can remove minix bit operations from asm/bitops.h for all
architectures.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Michal Simek <monstr@monstr.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As the result of conversions, there are no users of ext2 non-atomic bit
operations except for ext2 filesystem itself. Now we can put them into
architecture independent code in ext2 filesystem, and remove from
asm/bitops.h for all architectures.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce little-endian bit operations to the big-endian architectures
which do not have native little-endian bit operations and the
little-endian architectures. (alpha, avr32, blackfin, cris, frv, h8300,
ia64, m32r, mips, mn10300, parisc, sh, sparc, tile, x86, xtensa)
These architectures can just include generic implementation
(asm-generic/bitops/le.h).
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Matthew Wilcox <willy@debian.org>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This introduces CONFIG_GENERIC_FIND_BIT_LE to tell whether to use generic
implementation of find_*_bit_le() in lib/find_next_bit.c or not.
For now we select CONFIG_GENERIC_FIND_BIT_LE for all architectures which
enable CONFIG_GENERIC_FIND_NEXT_BIT.
But m68knommu wants to define own faster find_next_zero_bit_le() and
continues using generic find_next_{,zero_}bit().
(CONFIG_GENERIC_FIND_NEXT_BIT and !CONFIG_GENERIC_FIND_BIT_LE)
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Greg Ungerer <gerg@uclinux.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All architectures can use the common dma_addr_t typedef now. We can
remove the arch specific dma_addr_t.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a node parameter to alloc_thread_info(), and change its name to
alloc_thread_info_node()
This change is needed to allow NUMA aware kthread_create_on_cpu()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: David Howells <dhowells@redhat.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Similarly to irq_of_parse_and_map(), find the platform_device
object and return the pre-computed resource.
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'kvm-updates/2.6.39' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (55 commits)
KVM: unbreak userspace that does not sets tss address
KVM: MMU: cleanup pte write path
KVM: MMU: introduce a common function to get no-dirty-logged slot
KVM: fix rcu usage in init_rmode_* functions
KVM: fix kvmclock regression due to missing clock update
KVM: emulator: Fix permission checking in io permission bitmap
KVM: emulator: Fix io permission checking for 64bit guest
KVM: SVM: Load %gs earlier if CONFIG_X86_32_LAZY_GS=n
KVM: x86: Remove useless regs_page pointer from kvm_lapic
KVM: improve comment on rcu use in irqfd_deassign
KVM: MMU: remove unused macros
KVM: MMU: cleanup page alloc and free
KVM: MMU: do not record gfn in kvm_mmu_pte_write
KVM: MMU: move mmu pages calculated out of mmu lock
KVM: MMU: set spte accessed bit properly
KVM: MMU: fix kvm_mmu_slot_remove_write_access dropping intermediate W bits
KVM: Start lock documentation
KVM: better readability of efer_reserved_bits
KVM: Clear async page fault hash after switching to real mode
KVM: VMX: Initialize vm86 TSS only once.
...
Make __get_user_pages return -EHWPOISON for HWPOISON page only if
FOLL_HWPOISON is specified. With this patch, the interested callers
can distinguish HWPOISON pages from general FAULT pages, while other
callers will still get -EFAULT for all these pages, so the user space
interface need not to be changed.
This feature is needed by KVM, where UCR MCE should be relayed to
guest for HWPOISON page, while instruction emulation and MMIO will be
tried for general FAULT page.
The idea comes from Andrew Morton.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
When we try to handle vmalloc faults, we can take a code
path which uses "code" before we actually set it.
Amusingly gcc-3.3 notices this yet gcc-4.x does not.
Reported-by: Bob Breuer <breuerr@mc.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
gas used to accept (and ignore?) .size directives which referred to
undefined symbols, as this does. In binutils 2.21 these are treated
as errors.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>