ST.as only takes S9 (255) for offset. This was going out of range when
accessing a task_struct field with 4k NR_CPUS (due to 128b of coumaks
itself in there).
Workaround by using an intermediate register to do the address scaling.
There is some duplication of fix for ctx_sw.c and ctx_sw_asm.S however
given that C version will go away soon I'm not bothering to factor out
the common code.
Reported-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
- Add mm_cpumask setting (aggregating only, unlike some other arches)
used to restrict the TLB flush cross-calling
- cross-calling versions of TLB flush routines (thanks to Noam)
Signed-off-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Need export symbol for it, or can not pass compiling, the related error
with allmodconfig:
MODPOST 2994 modules
ERROR: "pm_power_off" [drivers/mfd/retu-mfd.ko] undefined!
ERROR: "pm_power_off" [drivers/char/ipmi/ipmi_poweroff.ko] undefined!
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Need export its symbol just like other architectures done, or can not
pass compiling with allmodconfig, the related error:
MODPOST 2994 modules
ERROR: "save_stack_trace" [kernel/backtracetest.ko] undefined!
ERROR: "save_stack_trace" [drivers/md/persistent-data/dm-persistent-data.ko] undefined!
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
get_hw_config_num_irq() may be called by normal iss_model_init_smp()
which is a function pointer for 'init_smp' which may be called by
first_lines_of_secondary() which also need be normal too.
The related warning (with allmodconfig):
MODPOST vmlinux.o
WARNING: vmlinux.o(.text+0x5814): Section mismatch in reference from the function iss_model_init_smp() to the function .init.text:get_hw_config_num_irq()
The function iss_model_init_smp() references
the function __init get_hw_config_num_irq().
This is often because iss_model_init_smp lacks a __init
annotation or the annotation of get_hw_config_num_irq is wrong.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
first_lines_of_secondary() is a '__init' function, but it may be called
by __cpu_up() by _cpu_up() by cpu_up() which is a normal export symbol
function. So recommend to remove '__init'.
The related warning (with allmodconfig):
MODPOST vmlinux.o
WARNING: vmlinux.o(.text+0x315c): Section mismatch in reference from the function __cpu_up() to the function .init.text:first_lines_of_secondary()
The function __cpu_up() references
the function __init first_lines_of_secondary().
This is often because __cpu_up lacks a __init
annotation or the annotation of first_lines_of_secondary is wrong.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
They haven't '__init' in definition, but has '__init' in declaration.
And normal function start_kernel_secondary() may call setup_processor()
which will call arc_init_IRQ().
So need remove '__init' for both of them. The related warning (with
allmodconfig):
MODPOST vmlinux.o
WARNING: vmlinux.o(.text+0x3084): Section mismatch in reference from the function start_kernel_secondary() to the function .init.text:setup_processor()
The function start_kernel_secondary() references
the function __init setup_processor().
This is often because start_kernel_secondary lacks a __init
annotation or the annotation of setup_processor is wrong.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
arc supports kgdb, but need update -- add function kgdb_roundup_cpus(),
or can not pass compiling. At present, add the simple generic one just
like other architectures(e.g. tile, mips ...).
The related error (with allmodconfig):
kernel/built-in.o: In function `kgdb_cpu_enter':
kernel/debug/debug_core.c:580: undefined reference to `kgdb_roundup_cpus'
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
------------------>8----------------------
arch/arc/mm/tlb.c: In function ‘do_tlb_overlap_fault’:
arch/arc/mm/tlb.c:688:13: warning: array subscript is above array bounds
[-Warray-bounds]
(pd0[n] & PAGE_MASK)) {
^
------------------>8----------------------
While at it, remove the usless last iteration of outer loop when reading
a TLB SET for duplicate entries.
Suggested-by: Mischa Jonker <mjonker@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Lockdep required a small fix to stacktrace API which was incorrectly
unwindign out of __switch_to for the current call frame.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
In case bootloader has changed the priority of one/more IRQ lines
Reported-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
switch the args (address, pt_regs) to match with all the other "C"
exception handlers.
This removes the awkwardness in EV_ProtV for page fault vs. unaligned
access.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Line op needs vaddr (indexing) and paddr (tag match). For page sized
flushes (V-P const), each line op will need a different index, but the
tag bits wil remain constant, hence paddr can be setup once outside the
loop.
This improves select LMBench numbers for Aliasing dcache where we have
more "preventive" cache flushing.
Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host OS Mhz null null open slct sig sig fork exec sh
call I/O stat clos TCP inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
3.11-rc7- Linux 3.11.0- 80 4.66 8.88 69.7 112. 268. 8.60 28.0 3489 13.K 27.K # Non alias ARC700
3.11-rc7- Linux 3.11.0- 80 4.64 8.51 68.6 98.5 271. 8.58 28.1 4160 15.K 32.K # Aliasing
3.11-rc7- Linux 3.11.0- 80 4.64 8.51 69.8 99.4 270. 8.73 27.5 3880 15.K 31.K # PTAG loop Inv
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
With Line length being constant now, we can fold the 2 helpers into 1.
This allows applying any optimizations (forthcoming) to single place.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ARC dcache supports 3 ops - Inv, Flush, Flush-n-Inv.
The programming model however provides 2 commands FLUSH, INV.
INV will either discard or flush-n-discard (based on DT_CTRL bit)
The leaf helper __dc_line_loop() used to take the AUX register
(corresponding to the 2 commands). Now we push that to within the
helper, paving way for code consolidations to follow.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
__get_cpu_var() is used for multiple purposes in the kernel source. One of them is
address calculation via the form &__get_cpu_var(x). This calculates the address for
the instance of the percpu variable of the current processor based on an offset.
Other use cases are for storing and retrieving data from the current processors percpu area.
__get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment.
__get_cpu_var() is defined as :
#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
__get_cpu_var() always only does an address determination. However, store and retrieve operations
could use a segment prefix (or global register on other platforms) to avoid the address calculation.
this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use
optimized assembly code to read and write per cpu variables.
This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr()
or into a use of this_cpu operations that use the offset. Thereby address calcualtions are avoided
and less registers are used when code is generated.
At the end of the patchset all uses of __get_cpu_var have been removed so the macro is removed too.
The patchset includes passes over all arches as well. Once these operations are used throughout then
specialized macros can be defined in non -x86 arches as well in order to optimize per cpu access by
f.e. using a global register that may be set to the per cpu base.
Transformations done to __get_cpu_var()
1. Determine the address of the percpu instance of the current processor.
DEFINE_PER_CPU(int, y);
int *x = &__get_cpu_var(y);
Converts to
int *x = this_cpu_ptr(&y);
2. Same as #1 but this time an array structure is involved.
DEFINE_PER_CPU(int, y[20]);
int *x = __get_cpu_var(y);
Converts to
int *x = this_cpu_ptr(y);
3. Retrieve the content of the current processors instance of a per cpu variable.
DEFINE_PER_CPU(int, u);
int x = __get_cpu_var(y)
Converts to
int x = __this_cpu_read(y);
4. Retrieve the content of a percpu struct
DEFINE_PER_CPU(struct mystruct, y);
struct mystruct x = __get_cpu_var(y);
Converts to
memcpy(this_cpu_ptr(&x), y, sizeof(x));
5. Assignment to a per cpu variable
DEFINE_PER_CPU(int, y)
__get_cpu_var(y) = x;
Converts to
this_cpu_write(y, x);
6. Increment/Decrement etc of a per cpu variable
DEFINE_PER_CPU(int, y);
__get_cpu_var(y)++
Converts to
this_cpu_inc(y)
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
A vmalloc fault needs to sync up PGD/PTE entry from init_mm to current
task's "active_mm". ARC vmalloc fault handler however was using mm.
A vmalloc fault for non user task context (actually pre-userland, from
init thread's open for /dev/console) caused the handler to deref NULL mm
(for mm->pgd)
The reasons it worked so far is amazing:
1. By default (!SMP), vmalloc fault handler uses a cached value of PGD.
In SMP that MMU register is repurposed hence need for mm pointer deref.
2. In pre-3.12 SMP kernel, the problem triggering vmalloc didn't exist in
pre-userland code path - it was introduced with commit 20bafb3d23
"n_tty: Move buffers into n_tty_data"
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Gilad Ben-Yossef <gilad@benyossef.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: stable@vger.kernel.org #3.10 and 3.11
Cc: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Resolve cherry-picking conflicts:
Conflicts:
mm/huge_memory.c
mm/memory.c
mm/mprotect.c
See this upstream merge commit for more details:
52469b4fcd Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Signed-off-by: Ingo Molnar <mingo@kernel.org>
ARCompact TRAP_S insn used for breakpoints, commits before exception is
taken (updating architectural PC). So ptregs->ret contains next-PC and
not the breakpoint PC itself. This is different from other restartable
exceptions such as TLB Miss where ptregs->ret has exact faulting PC.
gdb needs to know exact-PC hence ARC ptrace GETREGSET provides for
@stop_pc which returns ptregs->ret vs. EFA depending on the
situation.
However, writing stop_pc (SETREGSET request), which updates ptregs->ret
doesn't makes sense stop_pc doesn't always correspond to that reg as
described above.
This was not an issue so far since user_regs->ret / user_regs->stop_pc
had same value and both writing to ptregs->ret was OK, needless, but NOT
broken, hence not observed.
With gdb "jump", they diverge, and user_regs->ret updating ptregs is
overwritten immediately with stop_pc, which this patch fixes.
Reported-by: Anton Kolesov <akolesov@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Now that prom.h is optional, all the empty prom.h headers can be removed.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
HAVE_ARCH_DEVTREE_FIXUPS appears to always be needed except for sparc,
but it is only used for /proc/device-teee and sparc does not enable
/proc/device-tree. So this option is redundant. Remove the option and
always enable it. This has the side effect of fixing /proc/device-tree
on arches such as arm64 which failed to define this option.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Remove unnecessary prom.h includes in preparation to remove prom.h.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
Convert arc to use the common of_flat_dt_match_machine function.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
All arches do essentially the same thing now for
early_init_dt_setup_initrd_arch, so it can now be removed.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
Use the common unflatten_and_copy_device_tree to copy the built-in FDT
out of init section.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQEcBAABAgAGBQJSUc9zAAoJEHm+PkMAQRiG9DMH/AtpuAF6LlMRPjrCeuJQ1pyh
T0IUO+CsLKO6qtM5IyweP8V6zaasNjIuW1+B6IwVIl8aOrM+M7CwRiKvpey26ldM
I8G2ron7hqSOSQqSQs20jN2yGAqQGpYIbTmpdGLAjQ350NNNvEKthbP5SZR5PAmE
UuIx5OGEkaOyZXvCZJXU9AZkCxbihlMSt2zFVxybq2pwnGezRUYgCigE81aeyE0I
QLwzzMVdkCxtZEpkdJMpLILAz22jN4RoVDbXRa2XC7dA9I2PEEXI9CcLzqCsx2Ii
8eYS+no2K5N2rrpER7JFUB2B/2X8FaVDE+aJBCkfbtwaYTV9UYLq3a/sKVpo1Cs=
=xSFJ
-----END PGP SIGNATURE-----
Merge tag 'v3.12-rc4' into sched/core
Merge Linux v3.12-rc4 to fix a conflict and also to refresh the tree
before applying more scheduler patches.
Conflicts:
arch/avr32/include/asm/Kbuild
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Previously, when a signal was registered with SA_SIGINFO, parameters 2
and 3 of the signal handler were written to registers r1 and r2 before
the register set was saved. This led to corruption of these two
registers after returning from the signal handler (the wrong values were
restored).
With this patch, registers are now saved before any parameters are
passed, thus maintaining the processor state from before signal entry.
Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
clockevents_config_and_register is more clever and correct than doing it
by hand; so use it.
[vgupta: fixed build failure due to missing ; in patch]
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Some ARC SMP systems lack native atomic R-M-W (LLOCK/SCOND) insns and
can only use atomic EX insn (reg with mem) to build higher level R-M-W
primitives. This includes a SystemC based SMP simulation model.
So rwlocks need to use a protecting spinlock for atomic cmp-n-exchange
operation to update reader(s)/writer count.
The spinlock operation itself looks as follows:
mov reg, 1 ; 1=locked, 0=unlocked
retry:
EX reg, [lock] ; load existing, store 1, atomically
BREQ reg, 1, rety ; if already locked, retry
In single-threaded simulation, SystemC alternates between the 2 cores
with "N" insn each based scheduling. Additionally for insn with global
side effect, such as EX writing to shared mem, a core switch is
enforced too.
Given that, 2 cores doing a repeated EX on same location, Linux often
got into a livelock e.g. when both cores were fiddling with tasklist
lock (gdbserver / hackbench) for read/write respectively as the
sequence diagram below shows:
core1 core2
-------- --------
1. spin lock [EX r=0, w=1] - LOCKED
2. rwlock(Read) - LOCKED
3. spin unlock [ST 0] - UNLOCKED
spin lock [EX r=0,w=1] - LOCKED
-- resched core 1----
5. spin lock [EX r=1] - ALREADY-LOCKED
-- resched core 2----
6. rwlock(Write) - READER-LOCKED
7. spin unlock [ST 0]
8. rwlock failed, retry again
9. spin lock [EX r=0, w=1]
-- resched core 1----
10 spinlock locked in #9, retry #5
11. spin lock [EX gets 1]
-- resched core 2----
...
...
The fix was to unlock using the EX insn too (step 7), to trigger another
SystemC scheduling pass which would let core1 proceed, eliding the
livelock.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Anton reported
| LTP tests syscalls/process_vm_readv01 and process_vm_writev01 fail
| similarly in one testcase test_iov_invalid -> lvec->iov_base.
| Testcase expects errno EFAULT and return code -1,
| but it gets return code 1 and ERRNO is 0 what means success.
Essentially test case was passing a pointer of -1 which access_ok()
was not catching. It was doing [@addr + @sz <= TASK_SIZE] which would
pass for @addr == -1
Fixed that by rewriting as [@addr <= TASK_SIZE - @sz]
Reported-by: Anton Kolesov <Anton.Kolesov@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
If a load or store is the last instruction in a zero-overhead-loop, and
it's misaligned, the loop would execute only once.
This fixes that problem.
Signed-off-by: Mischa Jonker <mjonker@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
In order to prepare to per-arch implementations of preempt_count move
the required bits into an asm-generic header and use this for all
archs.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-h5j0c1r3e3fk015m30h8f1zx@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
After the last architecture switched to generic hard irqs the config
options HAVE_GENERIC_HARDIRQS & GENERIC_HARDIRQS and the related code
for !CONFIG_GENERIC_HARDIRQS can be removed.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Merge more patches from Andrew Morton:
"The rest of MM. Plus one misc cleanup"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (35 commits)
mm/Kconfig: add MMU dependency for MIGRATION.
kernel: replace strict_strto*() with kstrto*()
mm, thp: count thp_fault_fallback anytime thp fault fails
thp: consolidate code between handle_mm_fault() and do_huge_pmd_anonymous_page()
thp: do_huge_pmd_anonymous_page() cleanup
thp: move maybe_pmd_mkwrite() out of mk_huge_pmd()
mm: cleanup add_to_page_cache_locked()
thp: account anon transparent huge pages into NR_ANON_PAGES
truncate: drop 'oldsize' truncate_pagecache() parameter
mm: make lru_add_drain_all() selective
memcg: document cgroup dirty/writeback memory statistics
memcg: add per cgroup writeback pages accounting
memcg: check for proper lock held in mem_cgroup_update_page_stat
memcg: remove MEMCG_NR_FILE_MAPPED
memcg: reduce function dereference
memcg: avoid overflow caused by PAGE_ALIGN
memcg: rename RESOURCE_MAX to RES_COUNTER_MAX
memcg: correct RESOURCE_MAX to ULLONG_MAX
mm: memcg: do not trap chargers with full callstack on OOM
mm: memcg: rework and document OOM waiting and wakeup
...
Unlike global OOM handling, memory cgroup code will invoke the OOM killer
in any OOM situation because it has no way of telling faults occuring in
kernel context - which could be handled more gracefully - from
user-triggered faults.
Pass a flag that identifies faults originating in user space from the
architecture-specific fault handlers to generic code so that memcg OOM
handling can be improved.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The memcg code can trap tasks in the context of the failing allocation
until an OOM situation is resolved. They can hold all kinds of locks
(fs, mm) at this point, which makes it prone to deadlocking.
This series converts memcg OOM handling into a two step process that is
started in the charge context, but any waiting is done after the fault
stack is fully unwound.
Patches 1-4 prepare architecture handlers to support the new memcg
requirements, but in doing so they also remove old cruft and unify
out-of-memory behavior across architectures.
Patch 5 disables the memcg OOM handling for syscalls, readahead, kernel
faults, because they can gracefully unwind the stack with -ENOMEM. OOM
handling is restricted to user triggered faults that have no other
option.
Patch 6 reworks memcg's hierarchical OOM locking to make it a little
more obvious wth is going on in there: reduce locked regions, rename
locking functions, reorder and document.
Patch 7 implements the two-part OOM handling such that tasks are never
trapped with the full charge stack in an OOM situation.
This patch:
Back before smart OOM killing, when faulting tasks were killed directly on
allocation failures, the arch-specific fault handlers needed special
protection for the init process.
Now that all fault handlers call into the generic OOM killer (see commit
609838cfed97: "mm: invoke oom-killer from remaining unconverted page
fault handlers"), which already provides init protection, the
arch-specific leftovers can be removed.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arch/arc bits]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 05b016ecf5 "ARC: Setup Vector Table Base in early boot" moved
the Interrupt vector Table setup out of arc_init_IRQ() which is called
for all CPUs, to entry point of boot cpu only, breaking booting of others.
Fix by adding the same to entry point of non-boot CPUs too.
read_arc_build_cfg_regs() printing IVT Base Register didn't help the
casue since it prints a synthetic value if zero which is totally bogus,
so fix that to print the exact Register.
[vgupta: Remove the now stale comment from header of arc_init_IRQ and
also added the commentary for halt-on-reset]
Cc: Gilad Ben-Yossef <gilad@benyossef.com>
Cc: Cc: <stable@vger.kernel.org> #3.11
Signed-off-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Generally minor changes. A bunch of bug fixes, particularly for
initialization and some refactoring. Most notable change if feeding the
entire flattened tree into the random pool at boot. May not be
significant, but shouldn't hurt either.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJSL12LAAoJEEFnBt12D9kB64gP/RBipnYbo3RPanHg+lE/J1V7
KSVFNGKWJHxTg47VVC1YJGIG21jqxAilpdS2MQL5FP7iyd+IzvtHpQiJgp+2G+pq
di06yrdyrYErxRgZgGQi8IpR538ZzOEVLCKJGdb09YelkRzPT5au7CC1MAsX3qco
yba7PHk0/Nc4hZE4aGbgR1DlRmn86ob7mM0KFE/LORaSN2BueMgWcwKhQXYNGyoh
assX4yNhAbUG6Bgw7paBLDGqHh8c5Ei5AppU8yPb+N094jgYHBJryUoDlzzUHD23
qqiEqHhUKT0TpgHNs8KH0WZFugcmjKvYEbzdzadBxqfXnJN4fKSEcdfF3iz4T14j
U6EZks89GoHwA523OghUZkKNOqlsUdWfdKz+8/grQqKisYwDcf3fCxEYk/4weDCQ
b6fFlOv6+AI3btjXp6F511ZKxyT4ZZzkHjp/ZSrhBygyamNZfax0ma0j+ZS9AZql
kPxQS0nOve6NKaP7vXxMmW5sGMnL19ER/Hm31wthGcWI43GVebUdklnzfGaEeSjs
pmP8oiCNemceqVpiPKxcOxiguf/eyIjP1SFXbguASygUmQeTDbbJ8n1FYznCitue
xJgWttKWsEf/aMR3eJtQ3aBmHR3rijAV4E28Wlq8XMkocwvpQm2zMocS2Z5BJ80S
hi1kQVy8+RxNX96tOSp1
=GSWl
-----END PGP SIGNATURE-----
Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux
Pull device tree core updates from Grant Likely:
"Generally minor changes. A bunch of bug fixes, particularly for
initialization and some refactoring. Most notable change if feeding
the entire flattened tree into the random pool at boot. May not be
significant, but shouldn't hurt either"
Tim Bird questions whether the boot time cost of the random feeding may
be noticeable. And "add_device_randomness()" is definitely not some
speed deamon of a function.
* tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux:
of/platform: add error reporting to of_amba_device_create()
irq/of: Fix comment typo for irq_of_parse_and_map
of: Feed entire flattened device tree into the random pool
of/fdt: Clean up casting in unflattening path
of/fdt: Remove duplicate memory clearing on FDT unflattening
gpio: implement gpio-ranges binding document fix
of: call __of_parse_phandle_with_args from of_parse_phandle
of: introduce of_parse_phandle_with_fixed_args
of: move of_parse_phandle()
of: move documentation of of_parse_phandle_with_args
of: Fix missing memory initialization on FDT unflattening
of: consolidate definition of early_init_dt_alloc_memory_arch()
of: Make of_get_phy_mode() return int i.s.o. const int
include: dt-binding: input: create a DT header defining key codes.
of/platform: Staticize of_platform_device_create_pdata()
of: Specify initrd location using 64-bit
dt: Typo fix
OF: make of_property_for_each_{u32|string}() use parameters if OF is not enabled
--------------->8--------------------
WARNING: vmlinux.o(.text+0x708): Section mismatch in reference from the
function read_arc_build_cfg_regs() to the function
.init.text:read_decode_cache_bcr()
WARNING: vmlinux.o(.text+0x702): Section mismatch in reference from the
function read_arc_build_cfg_regs() to the function
.init.text:read_decode_mmu_bcr()
--------------->8--------------------
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cast usecs to u64, to ensure that the (usecs * 4295 * HZ)
multiplication is 64 bit.
Initially, the (usecs * 4295 * HZ) part was done as a 32 bit
multiplication, with the result casted to 64 bit. This led to some bits
falling off, causing a "DMA initialization error" in the stmmac Ethernet
driver, due to a premature timeout.
Signed-off-by: Mischa Jonker <mjonker@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
It prevents kernel parameters such as 'loglevel' from doing their job.
Signed-off-by: Mischa Jonker <mjonker@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Some drivers require these, and ARC didn't had them yet.
Signed-off-by: Mischa Jonker <mjonker@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This helps remove asid-to-mm reverse map
While mm->context.id contains the ASID assigned to a process, our ASID
allocator also used asid_mm_map[] reverse map. In a new allocation
cycle (mm->ASID >= @asid_cache), the Round Robin ASID allocator used this
to check if new @asid_cache belonged to some mm2 (from prev cycle).
If so, it could locate that mm using the ASID reverse map, and mark that
mm as unallocated ASID, to force it to refresh at the time of switch_mm()
However, for SMP, the reverse map has to be maintained per CPU, so
becomes 2 dimensional, hence got rid of it.
With reverse map gone, it is NOT possible to reach out to current
assignee. So we track the ASID allocation generation/cycle and
on every switch_mm(), check if the current generation of CPU ASID is
same as mm's ASID; If not it is refreshed.
(Based loosely on arch/sh implementation)
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ASID allocation changes/2
Use the fact that switch_mm() and activate_mm() are exactly same code
now while acknowledging the semantical difference in comment
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ASID allocation changes/1
This patch does 2 things:
(1) get_new_mmu_context() NOW moves mm->ASID to a new value ONLY if it
was from a prev allocation cycle/generation OR if mm had no ASID
allocated (vs. before would unconditionally moving to a new ASID)
Callers desiring unconditional update of ASID, e.g.local_flush_tlb_mm()
(for parent's address space invalidation at fork) need to first force
the parent to an unallocated ASID.
(2) get_new_mmu_context() always sets the MMU PID reg with unchanged/new
ASID value.
The gains are:
- consolidation of all asid alloc logic into get_new_mmu_context()
- avoiding code duplication in switch_mm() for PID reg setting
- Enables future change to fold activate_mm() into switch_mm()
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-Asm code already has values of SW and HW ASID values, so they can be
passed to the printing routine.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This reorganizes the current TLB operations into psuedo-ops to better
pair with MMUv4's native Insert/Delete operations
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
With previous commit freeing up PTE bits, reassign them so as to:
- Match the bit to H/w counterpart where possible
(e.g. MMUv2 GLOBAL/PRESENT, this avoids a shift in create_tlb())
- Avoid holes in _PAGE_xxx definitions
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The current ARC VM code has 13 flags in Page Table entry: some software
(accesed/dirty/non-linear-maps) and rest hardware specific. With 8k MMU
page, we need 19 bits for addressing page frame so remaining 13 bits is
just about enough to accomodate the current flags.
In MMUv4 there are 2 additional flags, SZ (normal or super page) and WT
(cache access mode write-thru) - and additionally PFN is 20 bits (vs. 19
before for 8k). Thus these can't be held in current PTE w/o making each
entry 64bit wide.
It seems there is some scope of compressing the current PTE flags (and
freeing up a few bits). Currently PTE contains fully orthogonal distinct
access permissions for kernel and user mode (Kr, Kw, Kx; Ur, Uw, Ux)
which can be folded into one set (R, W, X). The translation of 3 PTE
bits into 6 TLB bits (when programming the MMU) can be done based on
following pre-requites/assumptions:
1. For kernel-mode-only translations (vmalloc: 0x7000_0000 to
0x7FFF_FFFF), PTE additionally has PAGE_GLOBAL flag set (and user
space entries can never be global). Thus such a PTE can translate
to Kr, Kw, Kx (as appropriate) and zero for User mode counterparts.
2. For non global entries, the PTE flags can be used to create mirrored
K and U TLB bits. This is true after commit a950549c67
"ARC: copy_(to|from)_user() to honor usermode-access permissions"
which ensured that user-space translations _MUST_ have same access
permissions for both U/K mode accesses so that copy_{to,from}_user()
play fair with fault based CoW break and such...
There is no such thing as free lunch - the cost is slightly infalted
TLB-Miss Handlers.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
* reduce editor lines taken by pt_regs
* ARCompact ISA specific part of TLB Miss handlers clubbed together
* cleanup some comments
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Most architectures use the same implementation. Collapse the common ones
into a single weak function that can be overridden.
Signed-off-by: Grant Likely <grant.likely@linaro.org>
In the exception return path, for both U/K cases, intr are already
disabled (for various existing reasons). So when we drop down to
@restore_regs, we need not redo that.
There was subtle issue - when intr were NOT being disabled for
ret-to-kernel-but-no-preemption case - now fixed by moving the
IRQ_DISABLE further up in @resume_kernel_mode.
So what do we gain:
* Shaves off a few insn in return path.
* Eliminates the need for IRQ_DISABLE_SAVE assembler macro for ARCv2
hence allows for entry code sharing.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
After the recent cleanups, all the exception handlers now have same
boilerplate prologue code. Move that into common macro.
This reduces readability but helps greatly with sharing / duplicating
entry code with ARCv2 ISA where the handlers are pretty much the same,
just the entry prologue is different (due to hardware assist).
Also while at it, add the missing FAKE_RET_FROM_EXCPN calls in couple of
places to drop down to pure kernel mode (from exception mode) before
jumping off into "C" code.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
On some PAE architectures, the entire range of physical memory could reside
outside the 32-bit limit. These systems need the ability to specify the
initrd location using 64-bit numbers.
This patch globally modifies the early_init_dt_setup_initrd_arch() function to
use 64-bit numbers instead of the current unsigned long.
There has been quite a bit of debate about whether to use u64 or phys_addr_t.
It was concluded to stick to u64 to be consistent with rest of the device
tree code. As summarized by Geert, "The address to load the initrd is decided
by the bootloader/user and set at that point later in time. The dtb should not
be tied to the kernel you are booting"
More details on the discussion can be found here:
https://lkml.org/lkml/2013/6/20/690https://lkml.org/lkml/2012/9/13/544
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
corresponding drivers (net/ethernet/arc/*, irqctl/irq-tb10x.c) have now
been merged into your tree.
Ideally these shd have been part of same submissions, oh well...
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJR3OKpAAoJEGnX8d3iisJeB5EP/i4HlYQXrZWpT/6Bm7gdyzqA
hvA0oObRvGfgVbxG7R/9sKHU/FDx479pwrZbXPU4I+WM34QRnueXtAodBzrK7jQp
FdcBVN+YCcKUJh4Gxs+a5jtKj8v6u8+NaJc3L/S2+4vJlznoAY6CxnlQX3SK0SAC
d7irz9ZlP58q8eFHBJXa96HMIf7A2EACmwG3Uo6oDj15LVx+XW5i2o6rABvfUNDx
eKnSekoav1CRJyW4RJYI/hJdUM3vcbbRcz/2IyqSquWn8EqyZWY9iijdIwoUha3c
6geZ2YXi4rikU5kA/3cVuEQa68gj7gTaceEE7RT8y5H0DQjpgB07Tkyr9kFLpnRI
b+SGpQVA6FsLtDFNiRjsu2Ft/411UnRfhgv5d7SsuCqeVVPkCEd13Ttj4g49EJjw
Z3FWXTY0f/zYOSzU/6UQ6KZ/ZEvnnasLSCzDPrEK7WVu4lFuMz9BJglzSAbMMUQI
XZaWsm8ForzuYYUDFXQ79ORGsGKjhq3Pl1kF4CQ4TYjL75DwWgEAi9TI6ENuLH8+
+Ox4l385V51/YgcasawgpmGG1APgLtqw5ZFu6GUu6mytQs2huenaoT9+ov2FSu2N
B0o04nmm6h1rgSdiq7ZKDHv5lS1RXYYgoIDYYLqp5Lxk2JcMxvZYfMhbb/4KV+7/
y6w3ui2SNA5ttV7v41zW
=QgUU
-----END PGP SIGNATURE-----
Merge tag 'arc-v3.11-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull second set of ARC architecture updates from Vineet Gupta:
"Couple of Platform updates (Device Tree files primarily) given that
the corresponding drivers (net/ethernet/arc/*, irqctl/irq-tb10x.c)
have now been merged into your tree.
Ideally these shd have been part of same submissions, oh well..."
* tag 'arc-v3.11-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: [TB10x] Updates for irqchip driver
ARC: [plat-arcfpga] Enable arc_emac for ARCAngle4 Board
A few remaining architectures directly kill the page faulting task in an
out of memory situation. This is usually not a good idea since that
task might not even use a significant amount of memory and so may not be
the optimal victim to resolve the situation.
Since 2.6.29's 1c0fe6e ("mm: invoke oom-killer from page fault") there
is a hook that architecture page fault handlers are supposed to call to
invoke the OOM killer and let it pick the right task to kill. Convert
the remaining architectures over to this hook.
To have the previous behavior of simply taking out the faulting task the
vm.oom_kill_allocating_task sysctl can be set to 1.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arch/arc bits]
Cc: James Hogan <james.hogan@imgtec.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull i2c updates from Wolfram Sang:
- new drivers: Kontron PLD, Wondermedia VT
- mv64xxx driver gained sun4i support and a bigger cleanup
- duplicate driver 'intel-mid' removed
- added generic device tree binding for sda holding time (and
designware driver already uses it)
- we tried to allow driver probing with only device tree and no i2c
ids, but I had to revert it because of side effects. Needs some
rethinking.
- driver bugfixes, cleanups...
* 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (34 commits)
i2c-designware: use div_u64 to fix link
i2c: Kontron PLD i2c bus driver
i2c: iop3xxx: fix build failure after waitqueue changes
i2c-designware: make SDA hold time configurable
i2c: mv64xxx: Set bus frequency to 100kHz if clock-frequency is not provided
i2c: imx: allow autoloading on dt ids
i2c: mv64xxx: Fix transfer error code
i2c: i801: SMBus patch for Intel Coleto Creek DeviceIDs
i2c: omap: correct usage of the interrupt enable register
i2c-pxa: prepare clock before use
Revert "i2c: core: make it possible to match a pure device tree driver"
i2c: nomadik: allocate adapter number dynamically
i2c: nomadik: support elder Nomadiks
i2c: mv64xxx: Add Allwinner sun4i compatible
i2c: mv64xxx: make the registers offset configurable
i2c: mv64xxx: Add macros to access parts of registers
i2c: vt8500: Add support for I2C bus on Wondermedia SoCs
i2c: designware: fix race between subsequent xfers
i2c: bfin-twi: Read and write the FIFO in loop
i2c: core: make it possible to match a pure device tree driver
...
Merge Kconfig menu diet patches from Dave Hansen:
"I think the "Kernel Hacking" menu has gotten a bit out of hand. It is
over 120 lines long on my system with everything enabled and options
are scattered around it haphazardly.
http://sr71.net/~dave/linux/kconfig-horror.png
Let's try to introduce some sanity. This set takes that 120 lines
down to 55 and makes it vastly easier to find some things. It's a
start.
This set stands on its own, but there is plenty of room for follow-up
patches. The arch-specific debug options still end up getting stuck
in the top-level "kernel hacking" menu. OPTIMIZE_INLINING, for
instance, could obviously go in to the "compiler options" menu, but
the fact that it is defined in arch/ in a separate Kconfig file keeps
it on its own for the moment.
The Signed-off-by's in here look funky. I changed employers while
working on this set, so I have signoffs from both email addresses"
* emailed patches from Dave Hansen <dave@sr71.net>:
hang and lockup detection menu
kconfig: consolidate printk options
group locking debugging options
consolidate compilation option configs
consolidate runtime testing configs
order memory debugging Kconfig options
consolidate per-arch stack overflow debugging options
Original posting:
http://lkml.kernel.org/r/20121214184202.F54094D9@kernel.stglabs.ibm.com
Several architectures have similar stack debugging config options.
They all pretty much do the same thing, some with slightly
differing help text.
This patch changes the architectures to instead enable a Kconfig
boolean, and then use that boolean in the generic Kconfig.debug
to present the actual menu option. This removes a bunch of
duplication and adds consistency across arches.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com> [for tile]
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge first patch-bomb from Andrew Morton:
- various misc bits
- I'm been patchmonkeying ocfs2 for a while, as Joel and Mark have been
distracted. There has been quite a bit of activity.
- About half the MM queue
- Some backlight bits
- Various lib/ updates
- checkpatch updates
- zillions more little rtc patches
- ptrace
- signals
- exec
- procfs
- rapidio
- nbd
- aoe
- pps
- memstick
- tools/testing/selftests updates
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (445 commits)
tools/testing/selftests: don't assume the x bit is set on scripts
selftests: add .gitignore for kcmp
selftests: fix clean target in kcmp Makefile
selftests: add .gitignore for vm
selftests: add hugetlbfstest
self-test: fix make clean
selftests: exit 1 on failure
kernel/resource.c: remove the unneeded assignment in function __find_resource
aio: fix wrong comment in aio_complete()
drivers/w1/slaves/w1_ds2408.c: add magic sequence to disable P0 test mode
drivers/memstick/host/r592.c: convert to module_pci_driver
drivers/memstick/host/jmb38x_ms: convert to module_pci_driver
pps-gpio: add device-tree binding and support
drivers/pps/clients/pps-gpio.c: convert to module_platform_driver
drivers/pps/clients/pps-gpio.c: convert to devm_* helpers
drivers/parport/share.c: use kzalloc
Documentation/accounting/getdelays.c: avoid strncpy in accounting tool
aoe: update internal version number to v83
aoe: update copyright date
aoe: perform I/O completions in parallel
...
Prepare for removing num_physpages and simplify mem_init().
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com> # for arch/arc
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Concentrate code to modify totalram_pages into the mm core, so the arch
memory initialized code doesn't need to take care of it. With these
changes applied, only following functions from mm core modify global
variable totalram_pages: free_bootmem_late(), free_all_bootmem(),
free_all_bootmem_node(), adjust_managed_page_count().
With this patch applied, it will be much more easier for us to keep
totalram_pages and zone->managed_pages in consistence.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <sworddragon2@aol.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Address more review comments from last round of code review.
1) Enhance free_reserved_area() to support poisoning freed memory with
pattern '0'. This could be used to get rid of poison_init_mem()
on ARM64.
2) A previous patch has disabled memory poison for initmem on s390
by mistake, so restore to the original behavior.
3) Remove redundant PAGE_ALIGN() when calling free_reserved_area().
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <sworddragon2@aol.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change signature of free_reserved_area() according to Russell King's
suggestion to fix following build warnings:
arch/arm/mm/init.c: In function 'mem_init':
arch/arm/mm/init.c:603:2: warning: passing argument 1 of 'free_reserved_area' makes integer from pointer without a cast [enabled by default]
free_reserved_area(__va(PHYS_PFN_OFFSET), swapper_pg_dir, 0, NULL);
^
In file included from include/linux/mman.h:4:0,
from arch/arm/mm/init.c:15:
include/linux/mm.h:1301:22: note: expected 'long unsigned int' but argument is of type 'void *'
extern unsigned long free_reserved_area(unsigned long start, unsigned long end,
mm/page_alloc.c: In function 'free_reserved_area':
>> mm/page_alloc.c:5134:3: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast [enabled by default]
In file included from arch/mips/include/asm/page.h:49:0,
from include/linux/mmzone.h:20,
from include/linux/gfp.h:4,
from include/linux/mm.h:8,
from mm/page_alloc.c:18:
arch/mips/include/asm/io.h:119:29: note: expected 'const volatile void *' but argument is of type 'long unsigned int'
mm/page_alloc.c: In function 'free_area_init_nodes':
mm/page_alloc.c:5030:34: warning: array subscript is below array bounds [-Warray-bounds]
Also address some minor code review comments.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <sworddragon2@aol.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Highlights of changes:
-Continuation of ARC MM changes from 3.10 including
zero page optimization;
Setting pagecache pages dirty by default;
Non executable stack by default;
Reducing dcache flushes for aliasing VIPT config
-Long overdue rework of pt_regs machinery - removing the unused word gutters
and adding ECR register to baseline (helps cleanup lot of low level code)
-Support for ARC gcc 4.8
-Few other preventive fixes, cosmetics, usage of Kconfig helper..
The diffstat is larger than normal primarily because of arcregs.h header split
as well as beautification of macros in entry.h
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJR0mEbAAoJEGnX8d3iisJe0lEP+gPtakLaOBprpxrVttK3DCLw
E28LMVdR+33jfqugqk8xMVQh9gc5mvSGGJkZFGPoQX9gumdDjJP4vRxqlZxLgpNH
siNr+LvyzUOUa3bQrB6q639jlqlvPcKO19Yjzx1vqKRkMBsV8+YG4+8tdBNvh2K4
dr6nB6gqmZiCWmuUpTQ+D2J7vjlNwi5PWHBmlP0uFH433Z78Wk2tT2LWUHihl6Do
+Uhp/Ldu7EDFTVLDlqc5pv0EUNnGzuR9dpN3XuxfZLknKMtJKpwqLq0/6bnfWqck
g+Cmh8jvj5SAutbzZiG/xggZMtSHOXsqY06+MALquQ7tcoXrm7U0Ubbplt1B55y/
OW8yWD/b1S+2pVVCmbIWV2EinNmSMpdX18nVUmWSggQYBjKQnMq611Y7q1EGBmen
cTCY9u7jv7tD+Qem4H/rze7b+EKlcSFDhzL/Df/iHHnvAmEjb+trmD3qJ6PtA93w
24b+fVaBi5j5dHZx4NEggDKTkmUVcqkD2qfg4Ckg4mqq7cBHfJponuoDJX4KM7t1
qozSpKLMMSzRd3L7DQhePqNJfrlOD2rhD4ELRCJA131dUANr4Wpi2yvIASro135A
4gmpYX5WtjohWjL73aRCwsRcLTnyNmqjPfibT35ZaAygW7wXII7zNVzKY09oEPN3
keol0YTXDokCzGN432NI
=65Fq
-----END PGP SIGNATURE-----
Merge tag 'arc-v3.11-rc1-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull first batch of ARC changes from Vineet Gupta:
"There's a second bunch to follow next week - which depends on commits
on other trees (irq/net). I'd have preferred the accompanying ARC
change via respective trees, but it didn't workout somehow.
Highlights of changes:
- Continuation of ARC MM changes from 3.10 including
zero page optimization
Setting pagecache pages dirty by default
Non executable stack by default
Reducing dcache flushes for aliasing VIPT config
- Long overdue rework of pt_regs machinery - removing the unused word
gutters and adding ECR register to baseline (helps cleanup lot of
low level code)
- Support for ARC gcc 4.8
- Few other preventive fixes, cosmetics, usage of Kconfig helper..
The diffstat is larger than normal primarily because of arcregs.h
header split as well as beautification of macros in entry.h"
* tag 'arc-v3.11-rc1-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (32 commits)
ARC: warn on improper stack unwind FDE entries
arc: delete __cpuinit usage from all arc files
ARC: [tlb-miss] Fix bug with CONFIG_ARC_DBG_TLB_MISS_COUNT
ARC: [tlb-miss] Extraneous PTE bit testing/setting
ARC: Adjustments for gcc 4.8
ARC: Setup Vector Table Base in early boot
ARC: Remove explicit passing around of ECR
ARC: pt_regs update #5: Use real ECR for pt_regs->event vs. synth values
ARC: stop using pt_regs->orig_r8
ARC: pt_regs update #4: r25 saved/restored unconditionally
ARC: K/U SP saved from one location in stack switching macro
ARC: Entry Handler tweaks: Simplify branch for in-kernel preemption
ARC: Entry Handler tweaks: Avoid hardcoded LIMMS for ECR values
ARC: Increase readability of entry handlers
ARC: pt_regs update #3: Remove unused gutter at start of callee_regs
ARC: pt_regs update #2: Remove unused gutter at start of pt_regs
ARC: pt_regs update #1: Align pt_regs end with end of kernel stack page
ARC: pt_regs update #0: remove kernel stack canary
ARC: [mm] Remove @write argument to do_page_fault()
ARC: [mm] Make stack/heap Non-executable by default
...
Device tree and Kconfig updates for irqchip driver.
Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications. For example, the fix in
commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.
After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out. Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.
Note that some harmless section mismatch warnings may result, since
notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
are flagged as __cpuinit -- so if we remove the __cpuinit from
arch specific callers, we will also get section mismatch warnings.
As an intermediate step, we intend to turn the linux/init.h cpuinit
content into no-ops as early as possible, since that will get rid
of these warnings. In any case, they are temporary and harmless.
This removes all the arch/arc uses of the __cpuinit macros from
all C files. Currently arc does not have any __CPUINIT used in
assembly files.
[1] https://lkml.org/lkml/2013/5/20/589
Cc: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
LOAD_FAULT_PTE macro is expected to set r2 with faulting vaddr.
However in case of CONFIG_ARC_DBG_TLB_MISS_COUNT, it was getting
clobbered with statistics collection code.
Fix latter by using a different register.
Note that only I-TLB Miss handler was potentially affected.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
* No need to check for READ access in I-TLB Miss handler
* Redundant PAGE_PRESENT update in PTE
Post TLB entry installation, in updating PTE for software accessed/dity
bits, no need to update PAGE_PRESENT since it will already be set.
Infact the entry won't have installed if !PAGE_PRESENT.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
* DWARF unwinder related
+ Force DWARF2 compliant .debug_frame (gcc 4.8 defaults to DWARF4
which kernel unwinder can't grok).
+ Discard the additional .eh_frame generated
+ Discard the dwarf4 debug info generated by -gdwarf-2 for normal
no debug case
* 4.8 already uses arc600 multilibs for -mno-mpy
* switch to using uclibc compiler (to get -mmedium-calls and -mno-sdata)
and also since buildroot can only use 1 toolchain
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This patch makes the SDA hold time configurable through device tree.
Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Signed-off-by: Pierrick Hascoet <pierrick.hascoet@abilis.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com> for arch/arc bits
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Otherwise early boot exceptions such as instructions errors due to
configuration mismatch between kernel and hardware go off to la-la land,
as opposed to hitting the handler and panic()'ing properly.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
With ECR now part of pt_regs
* No need to propagate from lowest asm handlers as arg
* No need to save it in tsk->thread.cause_code
* Avoid bit chopping to access the bit-fields
More code consolidation, cleanup
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
pt_regs->event was set with artificial values to identify the low level
system event (syscall trap / breakpoint trap / exceptions / interrupts)
With r8 saving out of the way, the full word can be used to save real
ECR (Exception Cause Register) which helps idenify the event naturally,
including additional info such as cause code, param.
Only for Interrupts, where ECR is not applicable, do we resort to
synthetic non ECR values.
SAVE_ALL_TRAP/EXCEPTIONS can now be merged as they both use ECR with
different runtime values.
The ptrace helpers now use the sub-fields of ECR to distinguish the
events (e.g. vector 0x25 is trap, param 0 is syscall...)
The following benefits will follow:
(1) This centralizes the location of where ECR is saved and will allow
the cleanup of task->thread.cause_code ECR placeholder which is set
in non-uniform way. Then ARC VM code can safely rely on it being
there for purpose of finer grained VM_EXEC dcache flush (based on
exec fault: I-TLB Miss)
(2) Further, ECR being passed around from low level handlers as arg can
be eliminated as it is part of standard reg-file in pt_regs
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Historically, pt_regs have had orig_r8, an overloaded container for
(1) backup copy of r8 (syscall number Trap Exceptions)
(2) additional system state: (syscall/Exception/Interrupt)
There is no point in keeping (1) since syscall number is never clobbered
in-place, in pt_regs, unlike r0 which duals as first syscall arg as well
as syscall return value and in case of syscall restart, the orig arg0
needs restoring (from orig_r0) after having been updated in-place with
syscall ret value.
This further paves way to convert (2) to contain ECR itself (rather than
current madeup values)
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
(This is a VERY IMP change for low level interrupt/exception handling)
-----------------------------------------------------------------------
WHAT
-----------------------------------------------------------------------
* User 25 now saved in pt_regs->user_r25 (vs. tsk->thread_info.user_r25)
* This allows Low level interrupt code to unconditionally save r25
(vs. the prev version which would only do it for U->K transition).
Ofcourse for nested interrupts, only the pt_regs->user_r25 of
bottom-most frame is useful.
* simplifies the interrupt prologue/epilogue
* Needed for ARCv2 ISA code and done here to keep design similar with
ARCompact event handling
-----------------------------------------------------------------------
WHY
-------------------------------------------------------------------------
With CONFIG_ARC_CURR_IN_REG, r25 is used to cache "current" task pointer
in kernel mode. So when entering kernel mode from User Mode
- user r25 is specially safe-kept (it being a callee reg is NOT part of
pt_regs which are saved by default on each interrupt/trap/exception)
- r25 loaded with current task pointer.
Further, if interrupt was taken in kernel mode, this is skipped since we
know that r25 already has valid "current" pointer.
With 2 level of interrupts in ARCompact ISA, detecting this is difficult
but still possible, since we could be in kernel mode but r25 not already saved
(in fact the stack itself might not have been switched).
A. User mode
B. L1 IRQ taken
C. L2 IRQ taken (while on 1st line of L1 ISR)
So in #C, although in kernel mode, r25 not saved (infact SP not
switched at all)
Given that ARcompact has manual stack switching, we could use a bit of
trickey - The low level code would make sure that SP is only set to kernel
mode value at the very end (after saving r25). So a non kernel mode SP,
even if in kernel mode, meant r25 was NOT saved.
The same paradigm won't work in ARCv2 ISA since SP is auto-switched so
it's setting can't be delayed/constrained.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This paves way for further simplifications.
There's an overhead of 1 insn for the non-common case of interrupt taken
from kernel mode.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
* use artificial PUSH/POP contructs for CORE Reg save/restore to stack
* use artificial PUSHAX/POPAX contructs for Auxiliary Space regs
* macro'ize multiple copies of callee-reg-save/restore (SAVE_R13_TO_R24)
* use BIC insn for inverse-and operation
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This is trickier than prev two:
* context switching code saves kernel mode callee regs in the format of
struct callee_regs thus needs adjustment. This also reduces the height
of topmost kernel stack frame by 1 word.
* Since kernel stack unwinder is sensitive to height of topmost kernel
stack frame, that needs a word of adjustment too.
ptrace needs a bit of updating since pt_regs now diverges from
user_regs_struct.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Historically, pt_regs would end at offset of 1 word from end of stack
page.
----------------- -> START of page (task->stack)
| |
| thread_info |
-----------------
| |
^ ~ ~
| ~ ~
| | |
| | | <---- pt_regs used to END here
-----------------
| 1 word GUTTER |
----------------- -> End of page (START of kernel stack)
This required special "one-off" considerations in low level code.
The root cause is very likely assumption of "empty" SP by the original
ARC kernel hackers, despite ARC700 always been "full" SP.
So finally RIP one word gutter !
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This can be ascertained within do_page_fault() since it gets the full
ECR (Exception Cause Register).
Further, for both the callers of do_page_fault(): Prot-V / D-TLB-Miss,
the cause sub-fields in ECR are same for same type of access, making the
code much more simpler.
D-TLB-Miss [LD] 0x00_21_01_00
Prot-V [LD] 0x00_23_01_00
^^
D-TLB-Miss [ST] 0x00_21_02_00
Prot-V [ST] 0x00_23_02_00
^^
D-TLB-Miss [EX] 0x00_21_03_00
Prot-V [EX] 0x00_23_03_00
^^
This helps code consolidation, which is even better when moving code from
assembler to "C".
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
1. For VM_EXEC based delayed dcache/icache flush, reduces the number of
flushes.
2. Makes this security feature ON by default rather than OFF before.
3. Applications can use mprotect() to selectively override this.
4. ELF binaries have a GNU_STACK segment which can easily override the
kernel default permissions.
For nested-functions/trampolines, gcc already auto-enables executable
stack in elf. Others needing this can use -Wl,-z,execstack option.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Non-congruent SRC page in copy_user_page() is dcache clean in the end -
so record that fact, to avoid a subsequent extraneous flush.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
* Number of (i|d)cache ways can be retrieved from BCRs and hence no need
to cross check with with built-in constants
* Use of IS_ENABLED() to check for a Kconfig option
* is_not_cache_aligned() not used anymore
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
* Move the various sub-system defines/types into relevant files/functions
(reduces compilation time)
* move CPU specific stuff out of asm/tlb.h into asm/mmu.h
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This fixes the following:
- CONFIG_ARC_SERIAL_BAUD is only defined when CONFIG_SERIAL_ARC is defined.
Make sure that it isn't referenced otherwise.
- There is no use for initializing arc_uart_info[] when CONFIG_SERIAL_ARC is
not defined.
[vgupta: tweaked changelog title, used IS_ENABLED() kconfig helper]
Signed-off-by: Mischa Jonker <mjonker@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
gdbserver inserting a breakpoint ends up calling copy_user_page() for a
code page. The generic version of which (non-aliasing config) didn't set
the PG_arch_1 bit hence update_mmu_cache() didn't sync dcache/icache for
corresponding dynamic loader code page - causing garbade to be executed.
So now aliasing versions of copy_user_highpage()/clear_page() are made
default. There is no significant overhead since all of special alias
handling code is compiled out for non-aliasing build
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The current code uses 2 bits for determining page's dcache color, thus
sorting pages into 4 bins, whereas the aliasing dcache really has 2 bins
(8k page, 64k dcache - 4 way-set-assoc).
This can cause extraneous flushes - e.g. color 0 and 2.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The VM_EXEC check in update_mmu_cache() was getting optimized away
because of a stupid error in definition of macro addr_not_cache_congruent()
The intention was to have the equivalent of following:
if (a || (1 ? b : 0))
but we ended up with following:
if (a || 1 ? b : 0)
And because precedence of '||' is more that that of '?', gcc was optimizing
away evaluation of <a>
Nasty Repercussions:
1. For non-aliasing configs it would mean some extraneous dcache flushes
for non-code pages if U/K mappings were not congruent.
2. For aliasing config, some needed dcache flush for code pages might
be missed if U/K mappings were congruent.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This manifested as grep failing psuedo-randomly:
-------------->8---------------------
[ARCLinux]$ ip address show lo | grep inet
[ARCLinux]$ ip address show lo | grep inet
[ARCLinux]$ ip address show lo | grep inet
[ARCLinux]$
[ARCLinux]$ ip address show lo | grep inet
inet 127.0.0.1/8 scope host lo
-------------->8---------------------
ARC700 MMU provides fully orthogonal permission bits per page:
Ur, Uw, Ux, Kr, Kw, Kx
The user mode page permission templates used to have all Kernel mode
access bits enabled.
This caused a tricky race condition observed with uClibc buffered file
read and UNIX pipes.
1. Read access to an anon mapped page in libc .bss: write-protected
zero_page mapped: TLB Entry installed with Ur + K[rwx]
2. grep calls libc:getc() -> buffered read layer calls read(2) with the
internal read buffer in same .bss page.
The read() call is on STDIN which has been redirected to a pipe.
read(2) => sys_read() => pipe_read() => copy_to_user()
3. Since page has Kernel-write permission (despite being user-mode
write-protected), copy_to_user() suceeds w/o taking a MMU TLB-Miss
Exception (page-fault for ARC). core-MM is unaware that kernel
erroneously wrote to the reserved read-only zero-page (BUG #1)
4. Control returns to userspace which now does a write to same .bss page
Since Linux MM is not aware that page has been modified by kernel, it
simply reassigns a new writable zero-init page to mapping, loosing the
prior write by kernel - effectively zero'ing out the libc read buffer
under the hood - hence grep doesn't see right data (BUG #2)
The fix is to make all kernel-mode access permissions mirror the
user-mode ones. Note that the kernel still has full access to pages,
when accessed directly (w/o MMU) - this fix ensures that kernel-mode
access in copy_to_from() path uses the same faulting access model as for
pure user accesses to keep MM fully aware of page state.
The issue is peudo-random because it only shows up if the TLB entry
installed in #1 is present at the time of #3. If it is evicted out, due
to TLB pressure or some-such, then copy_to_user() does take a TLB Miss
Exception, with a routine write-to-anon COW processing installing a
fresh page for kernel writes and also usable as it is in userspace.
Further the issue was dormant for so long as it depends on where the
libc internal read buffer (in .bss) is mapped at runtime.
If it happens to reside in file-backed data mapping of libc (in the
page-aligned slack space trailing the file backed data), loader zero
padding the slack space, does the early cow page replacement, setting
things up at the very beginning itself.
With gcc 4.8 based builds, the libc buffer got pushed out to a real
anon mapping which triggers the issue.
Reported-by: Anton Kolesov <akolesov@synopsys.com>
Cc: <stable@vger.kernel.org> # 3.9
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Flush and INVALIDATE the dcache page.
This helper is only used for writeback of CODE pages to memory. So
there's no value in keeping the dcache lines around. Infact it is risky
as a writeback on natural eviction under pressure can cause un-needed
writeback with weird issues on aliasing dcache configurations.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The TB10x platform port includes a custom mechanism using to set up
default pin controller configurations using abilis,simple-default
pin configurations of nodes compatible with abilis,simple-pinctrl. This
mechanism is redundant with the Linux standard "default" pin
configuration, see commit ab78029ecc
"drivers/pinctrl: grab default handles from device core".
This patch removes the TB10x custom mechanism in favour of the Linux
standard.
Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Reviewed-by: Stephen Warren <swarren@nvidia.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
I'm satisified with testing, specially with fuse which has historically given
grief to VIPT arches (ARM/PARISC...)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJRjPVlAAoJEGnX8d3iisJeF60P/36JELOrdTOJB8GDn/i22yu4
zxFrroH4EpXCobe4JzrmhXU0vdMKmyhaRsdmQt2pV474LX1ajQeoRi6n7xuiYymr
p0H8/jde7ZDHGxJRP9OIuIIU57N+sVwQvrXvfnz/dc4c92MD9EWEwvoytnCVuKSx
k/YIQ5oDj6hFH13V68eXXs+KBrXyPiYO9UIWYmwRZv/0Bm6P1EpXQSMWzeCP0X/y
FHVEc4i92bIkqpOadUnhCBHGZqjkrhwFL2FCI35/12TgSwjPU1Q6IJCtH7Lg9ygI
oFqut5wN+dhkxlfiyvgTXErmnvhDOHLbMQJYC9svHin8TtfznQhJDAWYeSH/6Mde
/hE/bXV55ucdJ48ZT6JRvxHeJQgTAvbXDIIQYe/wAJEvXidWEo4Y/qRTIXP03Ixm
Ie69d9WSmUlx4FmYxBQodDQFK9slnc+0UfsWcCG+OrT0wwrKBRr3To2WYEFowQer
fpIk6/68+LiQK+IzFAhPtBppSgcQoEBsXAD/MDKsQcRk84W1nKPt6SiT/wCAzOZ7
ezTL1eSlfzWXNbtohdm8xN8frt/4fZLZTaZkqtnMPBi0iIC5DAsJy27YlBQDaRd/
Hzd0y6bJQDcsjjWmZuUkFZVbCyYlE0CIW6sbHC91i51egKReYHy7T6Ef/n+qn+ho
jxohq5/JvG+0Vha0Q2h6
=r4lw
-----END PGP SIGNATURE-----
Merge tag 'arc-v3.10-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull second set of arc arch updates from Vineet Gupta:
"Aliasing VIPT dcache support for ARC
I'm satisified with testing, specially with fuse which has
historically given grief to VIPT arches (ARM/PARISC...)"
* tag 'arc-v3.10-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: [TB10x] Remove GENERIC_GPIO
ARC: [mm] Aliasing VIPT dcache support 4/4
ARC: [mm] Aliasing VIPT dcache support 3/4
ARC: [mm] Aliasing VIPT dcache support 2/4
ARC: [mm] Aliasing VIPT dcache support 1/4
ARC: [mm] refactor the core (i|d)cache line ops loops
ARC: [mm] serious bug in vaddr based icache flush
* Support for two new platforms based on ARC700
- Abilis TB10x SoC [Chritisian/Pierrick]
- Simulator only System-C Model [Mischa]
* ARC specific MM improvements
- Avoid full TLB flush (ASID increment) on munmap (even single page)
- VIPT Cache Flushing improvements
+ Delayed dcache flush for non-aliasing dcache (big performance boost)
+ icache flush aliasing agnostic (no need to kill all possible aliases)
* Others
- Avoid needless rebuild of DTB files for every kernel build
- Remove builtin cmdline as that is already provided by DeviceTree/bootargs
- Fixing unaligned access emulation corner case
- checkpatch fixes [Sachin]
- Various fixlets [Noam]
- Minor build failures/cleanups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJRiydEAAoJEGnX8d3iisJewBkQAJ/cvIjrIuMMdeDo0bokzZN2
bfsG8U+V7S0CKjqLyUD1bMRXo7rTgus8hp/klVORRXoAwSKiWhkj0p6dqJjCGUGc
LqtEPlxaLR1X+pe5wKtpB3j6kTpRwictVhFUqkFACUxGZx1GbYFxdeL8+3oDsepW
lwidBYgoya+Q4puRfmY/sCTtVhJlwTUi6g0lmpaEPWc3T2s83/u1GnlVBYd9B7yA
fCYkUceC2ZnuRMfH00sRsjQ3USyyYptGpY8U1nv9STFJ3fC+EestBPppAAwAzcm0
iiRgo3s314EzfeNRgzjjHrbULnVUJWkrdFXPisWepyPyVjsgnVr84YkO3kiEN6l7
JM1cUCvJUbKZaOb2iUwrlPOqISNjCyY9QZWwqW6PCG3TBpfQsJM2mnXKnPLvV2v2
5Ttba4+ETZ3rn4pPIuTeC6REr0aV/Zl1LKeWuk8PXz4aljWrlOrifBN6QkXWr4HU
4z6X3j2nw4T2LzqwbNYF+xLaDaZZpV8UdvQwOkliCxZ04myx135ImvgOim7Hh9j+
Ow1Jp9mRZha/44qM3jXdZ6Cv55pEOR4oQJsl6OBNUXgb5bnDcFHz1UpZtzoZ2V+s
RPozsfnleNXxDIJlCGK96+PG5qVTRQvklowsz6aTP8r+EEbQnrRlnrhi82cQWqnM
sMxzN320zrt/MWXxec89
=WeDp
-----END PGP SIGNATURE-----
Merge tag 'arc-v3.10-rc1-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC port updates from Vineet Gupta:
"Support for two new platforms based on ARC700:
- Abilis TB10x SoC [Chritisian/Pierrick]
- Simulator only System-C Model [Mischa]
ARC specific MM improvements:
- Avoid full TLB flush (ASID increment) on munmap (even single page)
- VIPT Cache Flushing improvements
+ Delayed dcache flush for non-aliasing dcache (big performance boost)
+ icache flush aliasing agnostic (no need to kill all possible aliases)
Others:
- Avoid needless rebuild of DTB files for every kernel build
- Remove builtin cmdline as that is already provided by DeviceTree/bootargs
- Fixing unaligned access emulation corner case
- checkpatch fixes [Sachin]
- Various fixlets [Noam]
- Minor build failures/cleanups"
* tag 'arc-v3.10-rc1-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (35 commits)
ARC: [mm] Lazy D-cache flush (non aliasing VIPT)
ARC: [mm] micro-optimize page size icache invalidate
ARC: [mm] remove the pessimistic all-alias-invalidate icache helpers
ARC: [mm] consolidate icache/dcache sync code
ARC: [mm] optimise icache flush for kernel mappings
ARC: [mm] optimise icache flush for user mappings
ARC: [mm] optimize needless full mm TLB flush on munmap
ARC: Add support for nSIM OSCI System C model
ARC: [TB10x] Adapt device tree to new compatible string
ARC: [TB10x] Add support for TB10x platform
ARC: [TB10x] Device tree of TB100 and TB101 Development Kits
ARC: Prepare interrupt code for external controllers
ARC: Allow embedded arc-intc to be properly placed in DT intc hierarchy
ARC: [cmdline] Don't overwrite u-boot provided bootargs
ARC: [cmdline] Remove CONFIG_CMDLINE
ARC: [plat-arcfpga] defconfig update
ARC: unaligned access emulation broken if callee-reg dest of LD/ST
ARC: unaligned access emulation error handling consolidation
ARC: Debug/crash-printing Improvements
ARC: fix typo with clock speed
...
This is the meat of the series which prevents any dcache alias creation
by always keeping the U and K mapping of a page congruent.
If a mapping already exists, and other tries to access the page, prev
one is flushed to physical page (wback+inv)
Essentially flush_dcache_page()/copy_user_highpage() create K-mapping
of a page, but try to defer flushing, unless U-mapping exist.
When page is actually mapped to userspace, update_mmu_cache() flushes
the K-mapping (in certain cases this can be optimised out)
Additonally flush_cache_mm(), flush_cache_range(), flush_cache_page()
handle the puring of stale userspace mappings on exit/munmap...
flush_anon_page() handles the existing U-mapping for anon page before
kernel reads it via the GUP path.
Note that while not complete, this is enough to boot a simple
dynamically linked Busybox based rootfs
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This preps the low level dcache flush helpers to take vaddr argument in
addition to the existing paddr to properly flush the VIPT dcache
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Nothing semantical
* simplify the alignement code by using & operation only
* rename variables clearly as paddr
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
vaddr used to index the cache was clipped from the wrong end, and thus
would potentially fail to flush the correct lines.
The problem was dorment for so long because up until the recent
optimizations it was only used for ptrace break-point only flushes.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
flush_dcache_page( ) is MM hook to ensure that a page has consistent
views between kernel and userspace. Thus it is called when
* kernel writes to a page which at some later point could get mapped to
userspace (so kernel mapping needs to be flushed-n-inv)
* kernel is about to read from a page with possible userspace mappings
(so userspace mappings needs to be made coherent with kernel ones)
However for Non aliasing VIPT dcache, any userspace mapping will always
be congruent to kernel mapping. Thus d-cache need need not be flushed at
all (or delayed indefinitely).
The only reason it does need to be flushed is when mapping code pages.
Since icache doesn't snoop dcache, those dirty dcache lines need to be
written back to memory and icache line invalidated so that icache lines
fetch will get the right data.
Decent gains on LMBench fork/exec/sh and File I/O micro-benchmarks.
(1) FPGA @ 80 MHZ
Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host OS Mhz null null open slct sig sig fork exec sh
call I/O stat clos TCP inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
3.9-rc6-a Linux 3.9.0-r 80 4.79 8.72 66.7 116. 239. 8.39 30.4 4798 14.K 34.K
3.9-rc6-b Linux 3.9.0-r 80 4.79 8.62 65.4 111. 239. 8.35 29.0 3995 12.K 30.K
3.9-rc7-c Linux 3.9.0-r 80 4.79 9.00 66.1 106. 239. 8.61 30.4 2858 10.K 24.K
^^^^ ^^^^ ^^^
File & VM system latencies in microseconds - smaller is better
-------------------------------------------------------------------------------
Host OS 0K File 10K File Mmap Prot Page 100fd
Create Delete Create Delete Latency Fault Fault selct
--------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
3.9-rc6-a Linux 3.9.0-r 317.8 204.2 1122.3 375.1 3522.0 4.288 20.7 126.8
3.9-rc6-b Linux 3.9.0-r 298.7 223.0 1141.6 367.8 3531.0 4.866 20.9 126.4
3.9-rc7-c Linux 3.9.0-r 278.4 179.2 862.1 339.3 3705.0 3.223 20.3 126.6
^^^^^ ^^^^^ ^^^^^ ^^^^
(2) Customer Silicon @ 500 MHz (166 MHz mem)
------------------------------------------------------------------------------
Host OS Mhz null null open slct sig sig fork exec sh
call I/O stat clos TCP inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
abilis-ba Linux 3.9.0-r 497 0.71 1.38 4.58 12.0 35.5 1.40 3.89 2070 5525 13.K
abilis-ca Linux 3.9.0-r 497 0.71 1.40 4.61 11.8 35.6 1.37 3.92 1411 4317 10.K
^^^^ ^^^^ ^^^
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
start address is already page aligned and size is const PAGE_SIZE,
thus fixups for alignment not needed in generated code.
bloat-o-meter vmlinux-mm5 vmlinux
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-32 (-32)
function old new delta
__inv_icache_page 82 50 -32
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Now that we have same helper used for all icache invalidates (i.e.
vaddr+paddr based exact line invalidate), consolidate the open coded
calls into one place.
Also rename flush_icache_range_vaddr => __sync_icache_dcache
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This change continues the theme from prev commit - this time icache
handling for kernel's own code modification (vmalloc: loadable modules,
breakpoints for kprobes/kgdb...)
flush_icache_range() calls the CDU icache helper with vaddr to enable
exact line invalidate.
For a true kernel-virtual mapping, the vaddr is actually virtual hence
valid as index into cache. For kprobes breakpoint however, the vaddr arg
is actually paddr - since that's how normal kernel is mapped in ARC
memory map. This implies that CDU will use the same addr for
indexing as for tag match - which is fine since kernel code would only
have that "implicit" mapping and none other.
This should speed up module loading significantly - specially on default
ARC700 icache configurations (32k) which alias.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ARC icache doesn't snoop dcache thus executable pages need to be made
coherent before mapping into userspace in flush_icache_page().
However ARC700 CDU (hardware cache flush module) requires both vaddr
(index in cache) as well as paddr (tag match) to correctly identify a
line in the VIPT cache. A typical ARC700 SoC has aliasing icache, thus
the paddr only based flush_icache_page() API couldn't be implemented
efficiently. It had to loop thru all possible alias indexes and perform
the invalidate operation (ofcourse the cache op would only succeed at
the index(es) where tag matches - typically only 1, but the cost of
visiting all the cache-bins needs to paid nevertheless).
Turns out however that the vaddr (along with paddr) is available in
update_mmu_cache() hence better suits ARC icache flush semantics.
With both vaddr+paddr, exactly one flush operation per line is done.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
munmap ends up calling tlb_flush() which for ARC was flushing the entire
TLB unconditionally (by moving the MMU to a new ASID)
do_munmap
unmap_region
unmap_vmas
unmap_single_vma
unmap_page_range
tlb_start_vma
zap_pud_range
tlb_end_vma()
tlb_finish_mmu
tlb_flush() ---> unconditional flush_tlb_mm()
So even a single page munmap, a frequent operation when uClibc dynamic
linker (ldso) is loading the dependent shared libraries, would move the
the ASID multiple times - needlessly invalidating the pre-faulted TLB
entries (and increasing the rate of ASID wraparound + full TLB flush).
This is now optimised to only be called if tlb->full_mm (which means
for exit/execve) cases only. And for those cases, flush_tlb_mm() is
already optimised to be a no-op for mm->mm_users == 0.
So essentially there are no mmore full mm flushes - except for fork which
anyhow needs it for properly COW'ing parent address space.
munmap now needs to do TLB range flush, which is implemented with
tlb_end_vma()
Results
-------
1. ASID now consistenly moves by 4 during a simple ls (as opposed to 5 or
7 before).
2. LMBench microbenchmark also shows improvements
Basic system parameters
------------------------------------------------------------------------------
Host OS Description Mhz tlb cache mem scal
pages line par load
bytes
--------- ------------- ----------------------- ---- ----- ----- ------ ----
3.9-rc5-0 Linux 3.9.0-r 3.9-rc5-0404-gcc-4.4-ba 80 8 64 1.1000 1
3.9-rc5-0 Linux 3.9.0-r 3.9-rc5-0405-avoid-full 80 8 64 1.1200 1
Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host OS Mhz null null open slct sig sig fork exec sh
call I/O stat clos TCP inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
3.9-rc5-0 Linux 3.9.0-r 80 4.81 8.69 68.6 118. 239. 8.53 31.6 4839 13.K 34.K
3.9-rc5-0 Linux 3.9.0-r 80 4.46 8.36 53.8 91.3 223. 8.12 24.2 4725 13.K 33.K
File & VM system latencies in microseconds - smaller is better
-------------------------------------------------------------------------------
Host OS 0K File 10K File Mmap Prot Page 100fd
Create Delete Create Delete Latency Fault Fault selct
--------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
3.9-rc5-0 Linux 3.9.0-r 314.7 223.2 1054.9 390.2 3615.0 1.590 20.1 126.6
3.9-rc5-0 Linux 3.9.0-r 265.8 183.8 1014.2 314.1 3193.0 6.910 18.8 110.4
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This adds support for an ARC Virtual Platform. This platform is based on the
System C standard promoted by the OSCI (Open System C Initiative) and uses
nSIM to simulate the ARC CPU core itself.
Users can build a virtual SoC by combining System C models of peripherals
and CPU cores.
Signed-off-by: Mischa Jonker <mjonker@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The original device tree was written using a slightly different
implementation of the fixed-factor-clock device tree binding. The
compatible string must be modified in order to be compatible with the
new implementation.
Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Infrastructure required to make the Linux kernel compile and boot on the
Abilis Systems TB10x series of SOCs based on ARC700 CPUs:
- Kmake related files (Kconfig, Makefile, tb10x_defconfig)
- TB10x platform initialisation
Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Signed-off-by: Pierrick Hascoet <pierrick.hascoet@abilis.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
These are the device tree files for the Abilis Systems TB100 and TB101 ICs and
their respective development kit PCBs. These files are committed in preparation
of the following patch set which adds support for these chips to the ARC
platform.
Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Signed-off-by: Pierrick Hascoet <pierrick.hascoet@abilis.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This patch adds some room for CPU-external interrupt controllers in the
Linux interrupt space. Until now, only the 32 CPU internal interrupt lines
were supported which does not allow for external interrupt controllers such
as GPIO modules etc.
Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Signed-off-by: Pierrick Hascoet <pierrick.hascoet@abilis.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
arc-intc is initialized in arc common code as it is applicable to all
platforms. However platforms with their own external intc still need to
refer to it for correct DT interrupt tree hierarchy setup,
e.g.
static struct of_device_id __initdata tb10x_irq_ids[] = {
{ .compatible = "snps,arc700-intc", .data = dummy_init_irq },
{ .compatible = "abilis,tb10x_ictl", .data = tb10x_init_irq },
{},
};
The fix is to use the generic irqchip framework to tie all irqchips in
a special linker section and then call irqchip_init() which calls the
DT of_irq_init() for all the intc in one go.
That way the platform code need not be aware of arc-intc at all.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The existing code was wrong on several counts:
* uboot provided bootargs were copied into @boot_command_line, only to
be over-written by setup_machine_fdt(), effectively lost
* @cmdline_p returned by setup_arch() to start_kernel() didn't include
the DT /bootargs
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Given that DeviceTree /bootargs can provide similar functionality,
no point in providing duplicate infrastructure.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The fixup code correctly updates the callee-regs on stack, but
fails to unwind it into actual register file. Thus userspace won't see
the update.
Reported-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
If CONFIG_ARC_MISALIGN_ACCESS is not enabled, or if the fixup fails,
call the same error handler: same signal/si_code to user (SIGBUS)
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
* Remove the line-break between scratch/callee-regs (sneaked in when we
converted from printk to pr_*
* Use %pS to print the symbol names of faulting PC (ret pseudo register)
and BLINK (call return register)
* Don't print user-vma for a kernel crash (only do it for
print-fatal-signals based regfile dump)
* Verbose print the Interrupt/Exception Enable/Active state
* for main executable link address is 0x10000 based (vs. 0) thus offset
of faulting PC needs to be adjusted
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This tracks mainline commit ae903caae2 "Bury the conditionals from
kernel_thread/kernel_execve series" which we missed out as ARC port was
not yet mainline.
[vgupta: commit log modified]
Signed-off-by: Alexander Shiyan <shc_work@mail.ru>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
show_regs() is inherently arch-dependent but it does make sense to print
generic debug information and some archs already do albeit in slightly
different forms. This patch introduces a generic function to print debug
information from show_regs() so that different archs print out the same
information and it's much easier to modify what's printed.
show_regs_print_info() prints out the same debug info as dump_stack()
does plus task and thread_info pointers.
* Archs which didn't print debug info now do.
alpha, arc, blackfin, c6x, cris, frv, h8300, hexagon, ia64, m32r,
metag, microblaze, mn10300, openrisc, parisc, score, sh64, sparc,
um, xtensa
* Already prints debug info. Replaced with show_regs_print_info().
The printed information is superset of what used to be there.
arm, arm64, avr32, mips, powerpc, sh32, tile, unicore32, x86
* s390 is special in that it used to print arch-specific information
along with generic debug info. Heiko and Martin think that the
arch-specific extra isn't worth keeping s390 specfic implementation.
Converted to use the generic version.
Note that now all archs print the debug info before actual register
dumps.
An example BUG() dump follows.
kernel BUG at /work/os/work/kernel/workqueue.c:4841!
invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #7
Hardware name: empty empty/S3992, BIOS 080011 10/26/2007
task: ffff88007c85e040 ti: ffff88007c860000 task.ti: ffff88007c860000
RIP: 0010:[<ffffffff8234a07e>] [<ffffffff8234a07e>] init_workqueues+0x4/0x6
RSP: 0000:ffff88007c861ec8 EFLAGS: 00010246
RAX: ffff88007c861fd8 RBX: ffffffff824466a8 RCX: 0000000000000001
RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffffffff8234a07a
RBP: ffff88007c861ec8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff8234a07a
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffff88015f7ff000 CR3: 00000000021f1000 CR4: 00000000000007f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
ffff88007c861ef8 ffffffff81000312 ffffffff824466a8 ffff88007c85e650
0000000000000003 0000000000000000 ffff88007c861f38 ffffffff82335e5d
ffff88007c862080 ffffffff8223d8c0 ffff88007c862080 ffffffff81c47760
Call Trace:
[<ffffffff81000312>] do_one_initcall+0x122/0x170
[<ffffffff82335e5d>] kernel_init_freeable+0x9b/0x1c8
[<ffffffff81c47760>] ? rest_init+0x140/0x140
[<ffffffff81c4776e>] kernel_init+0xe/0xf0
[<ffffffff81c6be9c>] ret_from_fork+0x7c/0xb0
[<ffffffff81c47760>] ? rest_init+0x140/0x140
...
v2: Typo fix in x86-32.
v3: CPU number dropped from show_regs_print_info() as
dump_stack_print_info() has been updated to print it. s390
specific implementation dropped as requested by s390 maintainers.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Chris Metcalf <cmetcalf@tilera.com> [tile bits]
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon bits]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Both dump_stack() and show_stack() are currently implemented by each
architecture. show_stack(NULL, NULL) dumps the backtrace for the
current task as does dump_stack(). On some archs, dump_stack() prints
extra information - pid, utsname and so on - in addition to the
backtrace while the two are identical on other archs.
The usages in arch-independent code of the two functions indicate
show_stack(NULL, NULL) should print out bare backtrace while
dump_stack() is used for debugging purposes when something went wrong,
so it does make sense to print additional information on the task which
triggered dump_stack().
There's no reason to require archs to implement two separate but mostly
identical functions. It leads to unnecessary subtle information.
This patch expands the dummy fallback dump_stack() implementation in
lib/dump_stack.c such that it prints out debug information (taken from
x86) and invokes show_stack(NULL, NULL) and drops arch-specific
dump_stack() implementations in all archs except blackfin. Blackfin's
dump_stack() does something wonky that I don't understand.
Debug information can be printed separately by calling
dump_stack_print_info() so that arch-specific dump_stack()
implementation can still emit the same debug information. This is used
in blackfin.
This patch brings the following behavior changes.
* On some archs, an extra level in backtrace for show_stack() could be
printed. This is because the top frame was determined in
dump_stack() on those archs while generic dump_stack() can't do that
reliably. It can be compensated by inlining dump_stack() but not
sure whether that'd be necessary.
* Most archs didn't use to print debug info on dump_stack(). They do
now.
An example WARN dump follows.
WARNING: at kernel/workqueue.c:4841 init_workqueues+0x35/0x505()
Hardware name: empty
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #9
0000000000000009 ffff88007c861e08 ffffffff81c614dc ffff88007c861e48
ffffffff8108f50f ffffffff82228240 0000000000000040 ffffffff8234a03c
0000000000000000 0000000000000000 0000000000000000 ffff88007c861e58
Call Trace:
[<ffffffff81c614dc>] dump_stack+0x19/0x1b
[<ffffffff8108f50f>] warn_slowpath_common+0x7f/0xc0
[<ffffffff8108f56a>] warn_slowpath_null+0x1a/0x20
[<ffffffff8234a071>] init_workqueues+0x35/0x505
...
v2: CPU number added to the generic debug info as requested by s390
folks and dropped the s390 specific dump_stack(). This loses %ksp
from the debug message which the maintainers think isn't important
enough to keep the s390-specific dump_stack() implementation.
dump_stack_print_info() is moved to kernel/printk.c from
lib/dump_stack.c. Because linkage is per objecct file,
dump_stack_print_info() living in the same lib file as generic
dump_stack() means that archs which implement custom dump_stack()
- at this point, only blackfin - can't use dump_stack_print_info()
as that will bring in the generic version of dump_stack() too. v1
The v1 patch broke build on blackfin due to this issue. The build
breakage was reported by Fengguang Wu.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [s390 bits]
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon bits]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull SMP/hotplug changes from Ingo Molnar:
"This is a pretty large, multi-arch series unifying and generalizing
the various disjunct pieces of idle routines that architectures have
historically copied from each other and have grown in random, wildly
inconsistent and sometimes buggy directions:
101 files changed, 455 insertions(+), 1328 deletions(-)
this went through a number of review and test iterations before it was
committed, it was tested on various architectures, was exposed to
linux-next for quite some time - nevertheless it might cause problems
on architectures that don't read the mailing lists and don't regularly
test linux-next.
This cat herding excercise was motivated by the -rt kernel, and was
brought to you by Thomas "the Whip" Gleixner."
* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
idle: Remove GENERIC_IDLE_LOOP config switch
um: Use generic idle loop
ia64: Make sure interrupts enabled when we "safe_halt()"
sparc: Use generic idle loop
idle: Remove unused ARCH_HAS_DEFAULT_IDLE
bfin: Fix typo in arch_cpu_idle()
xtensa: Use generic idle loop
x86: Use generic idle loop
unicore: Use generic idle loop
tile: Use generic idle loop
tile: Enter idle with preemption disabled
sh: Use generic idle loop
score: Use generic idle loop
s390: Use generic idle loop
powerpc: Use generic idle loop
parisc: Use generic idle loop
openrisc: Use generic idle loop
mn10300: Use generic idle loop
mips: Use generic idle loop
microblaze: Use generic idle loop
...
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, for every ARC kernel build I see the following:
--------------->8-----------------
DTB arch/arc/boot/dts/angel4.dtb.S
AS arch/arc/boot/dts/angel4.dtb.o
LD arch/arc/boot/dts/built-in.o
rm arch/arc/boot/dts/angel4.dtb.S <-- forces rebuild next iter
CHK kernel/config_data.h
--------------->8-----------------
This is because *.dts.S is intermediate file in dtb generation and is by
default deleted by make which needs a ".SECONDARY" hint to NOT do so.
This could have ideally been done in scripts/Makefile.lib - for benefit
of all, however .SECONDARY doesn't seem to work with wildcards.
Thanks to Stephen for suggesting .SECONDARY (vs .PRECIOUS) and making
that work using a non wildcard version in arch makefile.
Thanks to James Hogan for pointing out that *.dtb.S now needs to be
added to clean-files
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
commit "fs: make binfmt support for #! scripts modular and removable"
made support for #!scripts optional - thus need to include the Kconfig
file to get all relevant BINFMT_*
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Currently ARC_HAS_LLSC can be influenced by platform for SMP only using
ARC_HAS_COH_LLSC. For !SMP it defaults to "y".
It turns out that some customers can't support it all, even in UP.
So we change the semantics, and use a negative dependency ARC_CANT_LLSC.
Any platform (independent of SMP or !SMP) can select it to disable LLSC.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
arch/arc/kernel/traps.c no longer compiles without explicitly including
asm/kprobes.h
Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The existing uImage target always generates gzip compressed image which
drags bootup for some very slow FPGA customer boards.
So introduce seperate make targets:uImage.{bin,gz} with uncompressed
being default. Also tie gz generation to CONFIG_KERNEL_GZIP, which a
platform can select in it's Kconfig if it wishes gz to be default.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
We do cross compiles for ARC Linux.
With gcc 4.7, a make defconfig spews out the following:
------------------->8--------------------------
make ARCH=arc defconfig
gcc: error: unrecognized command line option '-marc600'
gcc: error: unrecognized command line option '-mA7'
gcc: error: unrecognized command line option '-mno-sdata'
gcc: error: unrecognized command line option '-mno-mpy'
*** Default configuration is based on 'fpga_defconfig'
------------------->8--------------------------
This apparently is coming from LIBGCC line - which is strange to be
invoked for defconfig generation.
Reported-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
There's no (Kconfig) macro CONFIG_BLOCK_DEV_RAM. (CONFIG_BLK_DEV_RAM
does exist though.) But linux/blk.h got killed in 2005 anyway (in a
patch titled "kill blk.h"), so these three lines can be removed.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Some header files were included twice in the same file.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Fixes the following coding style issues as detected by checkpatch:
ERROR: space required before the open parenthesis '('
ERROR: "foo * bar" should be "foo *bar"
WARNING: space prohibited between function name and open parenthesis '('
WARNING: please, no spaces at the start of a line
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Silences the following checkpatch warnings:
WARNING: Use #include <linux/ptrace.h> instead of <asm/ptrace.h>
WARNING: Use #include <linux/kprobes.h> instead of <asm/kprobes.h>
WARNING: Use #include <linux/kgdb.h> instead of <asm/kgdb.h>
WARNING: Use #include <linux/uaccess.h> instead of <asm/uaccess.h>
WARNING: Use #include <linux/cache.h> instead of <asm/cache.h>
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
version.h header file inclusion is not necessary as detected by
versioncheck script.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ARC irqsave/restore macros were missing the compiler barrier, causing a
stale load in irq-enabled region be used in irq-safe region, despite
being changed, because the register holding the value was still live.
The problem manifested as random crashes in timer code when stress
testing ARCLinux (3.9-rc3) on a !SMP && !PREEMPT_COUNT
Here's the exact sequence which caused this:
(0). tv1[x] <----> t1 <---> t2
(1). mod_timer(t1) interrupted after it calls timer_pending()
(2). mod_timer(t2) completes
(3). mod_timer(t1) resumes but messes up the list
(4). __runt_timers( ) uses bogus timer_list entry / crashes in
timer->function
Essentially mod_timer() was racing against itself and while the spinlock
serialized the tv1[] timer link list, timer_pending() called outside the
spinlock, cached timer link list element in a register.
With low register pressure (and a deep register file), lack of barrier
in raw_local_irqsave() as well as preempt_disable (!PREEMPT_COUNT
version), there was nothing to force gcc to reload across the spinlock,
causing a stale value in reg be used for link list manipulation - ensuing
a corruption.
ARcompact disassembly which shows the culprit generated code:
mod_timer:
push_s blink
mov_s r13,r0 # timer, timer
..
###### timer_pending( )
ld_s r3,[r13] # <------ <variable>.entry.next LOADED
brne r3, 0, @.L163
.L163:
..
###### spin_lock_irq( )
lr r5, [status32] # flags
bic r4, r5, 6 # temp, flags,
and.f 0, r5, 6 # flags,
flag.nz r4
###### detach_if_pending( ) begins
tst_s r3,r3 <--------------
# timer_pending( ) checks timer->entry.next
# r3 is NOT reloaded by gcc, using stale value
beq.d @.L169
mov.eq r0,0
##### detach_timer( ): __list_del( )
ld r4,[r13,4] # <variable>.entry.prev, D.31439
st r4,[r3,4] # <variable>.prev, D.31439
st r3,[r4] # <variable>.next, D.30246
We initially tried to fix this by adding barrier() to preempt_* macros
for !PREEMPT_COUNT but Linus clarified that it was anything but wrong.
http://www.spinics.net/lists/kernel/msg1512709.html
[vgupta: updated commitlog]
Reported-by/Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Cc: Christian Ruppert <christian.ruppert@abilis.com>
Cc: Pierrick Hascoet <pierrick.hascoet@abilis.com>
Debugged-by/Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The generic idle loop implements all functionality. Aside of that it
allows arc to implement the tsk_is_polling() functionality correctly,
despite the patently (now gone) comment in the original arc cpu_idle()
function:
/* Since we SLEEP in idle loop, TIF_POLLING_NRFLAG can't be set */
See kernel/cpu/idle.c
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Magnus Damm <magnus.damm@gmail.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Tested-by: Vineet Gupta <vgupta@synopsys.com>
Link: http://lkml.kernel.org/r/20130321215233.711253792@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
orig_r8_IS_EXCPN and orig_r8_IS_BRKPT were same values due to a
copy/paste error. Although it looks bad and is wrong, it really doesn't
affect gdb working.
orig_r8_IS_BRKPT is the one relevant to debugging (breakpoints), since
it is used to provide EFA vs. ERET to a ptrace "stop_pc" request.
So when gdb has inserted a breakpoint, orig_r8_IS_BRKPT is already set,
and anything else (i.e. orig_r8_IS_EXCPN) becoming same as it, really
doesn't hurt gdb. The corollary case, could be nasty but nobody uses the
ptrace "stop_pc" request in that case
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
CC drivers/mmc/host/mmc_spi.o
drivers/mmc/host/mmc_spi.c:118: error: redefinition of 'struct scratch'
make[3]: *** [drivers/mmc/host/mmc_spi.o] Error 1
make[2]: *** [drivers/mmc/host] Error 2
make[1]: *** [drivers/mmc] Error 2
make: *** [drivers] Error 2
CC arch/arc/kernel/kgdb.o
In file included from include/linux/kgdb.h:20,
from arch/arc/kernel/kgdb.c:11:
/home/vineetg/arc/k.org/arc-port/arch/arc/include/asm/kgdb.h:34:
warning: 'struct pt_regs' declared inside parameter list
/home/vineetg/arc/k.org/arc-port/arch/arc/include/asm/kgdb.h:34:
warning: its scope is only this definition or declaration, which is
probably not what you want
arch/arc/kernel/kgdb.c:172: error: conflicting types for 'kgdb_trap'
CC arch/arc/kernel/kgdb.o
arch/arc/kernel/kgdb.c: In function 'pt_regs_to_gdb_regs':
arch/arc/kernel/kgdb.c:62: error: dereferencing pointer to incomplete
type
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The upstream kernel ABI (v3) is different from current out-of-tree (v2):
* no-legacy-syscalls
* user_regs_struct layout has changed
So we rev up the ABI version
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ptrace regset interface relies on ELF_NGREG for ceiling the size of user
request. So any larger request (even if legit) would be clipped.
The existing def of ELF_NGREG didn't use user_regs_struct and was
technically one placeholder short (stop_pc) - although the current code
would still work because pt_regs includes a bunch of extra fields,
making
ELF_NGREG >= sizeof(struct user_regs_struct)/sizeof(long)
But we need to remove this ambiguity, specially since pt_regs should NOT
be directly associated with with anything userspace-ish.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The flat DT (currently embedded in vmlinux) is in .init section.
The unflattened/binary tree doesn't copy strings through and references
them from orig flat DT - which could cause catestrohpy if of_* APIs are
called post init, say from a driver which is a loadable module.
Reported-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Syscall restarting fixes made pt_regs->orig_r8 a short word, which was
not reflected in the assembler code - thus could potentially break gdb
debugging.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Commit 0bbacca "hlist: drop the node parameter from iterators" changed
the iterator across the board - but ARC port being out-of-tree missed
it.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The 64bit RTSC is not reliable, causing spurious "jumps" in higher word,
making Linux timekeeping go bonkers. So as of now just use the lower
32bit timestamp.
A cleaner approach would have been removing RTSC support altogether as the
32bit RTSC is equivalent to old TIMER1 based solution, but some customers
can use the 32bit RTSC in SMP syn fashion (vs. TIMER1 which being incore
can't be done easily).
A fallout of this is sched_clock()'s hardware assisted version needs to
go away since it can't use 32bit wrapping counter - instead we use the
generic "weak" jiffies based version.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
!CONFIG_ARC_HAS_(I|D)CACHE makes Linux disable caches (assuming they
exist in hardware) - mostly for debugging issues with new peripherals.
However, independent of CONFIG_ARC_HAS_(I|D)CACHE, Linux also needs to
handle, non-existant caches, using the information in Cache BCRs (Build
Configuration Reg)
Reported-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Platforms export their SMP callbacks by populating arc_smp_ops.
The population itself needs to be done pretty early, from init_early
callback.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Arnd Bergmann <arnd@arndb.de>
This again is for switch from singleton platform SMP API to
multi-platform paradigm
Platform code is not yet setup to populate the callbacks, that happens
in next commit
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
All the current platforms can work with 0x8000_0000 based dma_addr_t
since the Bus Bridges typically ignore the top bit (the only excpetion
was Angel4 PCI-AHB bridge which we no longer care for).
That way we don't need plat-specific cpu-addr to bus-addr conversion.
Hooks still provided - just in case a platform has an obscure device
which say needs 0 based bus address.
That way <asm/dma_mapping.h> no longer needs to unconditinally include
<plat/dma_addr.h>
Also verfied that on Angel4 board, other peripherals (IDE-disk / EMAC)
work fine with 0x8000_0000 based dma addresses.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
For now this will suffice for all platforms, later exotic ones needs to
get this from DeviceTree
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Arnd Bergmann <arnd@arndb.de>
-platform API is retired and instead callbacks are used
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
The orig platform code orgnaization was singleton design pattern - only
one platform (and board thereof) would build at a time.
Thus any platform/board specific code (e.g. irq init, early init ...)
expected by ARC common code was exported as well defined set of APIs,
with only ONE instance building ever.
Now with multiple-platform build requirement, that design of code no
longer holds - multiple board specific calls need to build at the same
time - so ARC common code can't use the API approach, it needs a
callback based design where each board registers it's specific set of
functions, and at runtime, depending on board detection, the callbacks
are used from the registry.
This commit adds all the infrastructure, where board specific callbacks
are specified as a "maThine description".
All the hooks are placed in right spots, no board callbacks registered
yet (with MACHINE_STARt/END constructs) so the hooks will not run.
Next commit will actually convert the platform to this infrastructure.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
This is more natural and is now doable since the choice constructs are
gone.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
This mini patchseries addresses the lack of multi-platform-image support
in ARC port.
Older build system only supported one platform(soc) to build at a time
and further only one board of that platform could be built. There was no
technical reason for that - we just didn't have the need.
So the first step towards multi-platform (and multi-board) builds it to
allow build system to do that.
So as applicable, <choice .. endchoice> => <menu .. endmenu>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Implement ioremap_prot() to allow mapping IO memory with variable
protection
via TLB.
Implementing this allows the /dev/mem driver to use its generic access()
VMA callback, which in turn allows ptrace to examine data in memory
mapped regions mapped via /dev/mem, such as Arc DCCM.
The end result is that it is possible to examine values of variables
placed into DCCM in user space programs via GDB.
CC: Alexey Brodkin <Alexey.Brodkin@synopsys.com>
CC: Noam Camus <noamc@ezchip.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
1. ./genfilelist.pl arch/arc/include/asm/
2. Create arch/arc/include/uapi/asm/Kbuild as follows
+# UAPI Header export list
+include include/uapi/asm-generic/Kbuild.asm
3. ./disintegrate-one.pl arch/arc/include/{,uapi/}asm/<above-list>
4. Edit arch/arc/include/asm/Kbuild to remove ref to
asm-generic/Kbuild.asm
- To work around empty uapi/asm/setup.h added a placholder comment.
- Also a manual #ifdef __ASSEMBLY__ for a late ptrace change
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: David Howells <dhowells@redhat.com>
This allows ARC Target to do I/O to host in absence of any peripherals
whatsoever, assisted by Metaware Hostlink facility.
Further we have a FUSE based filesystem which makes us mount/access host
filesystem on target and do fops.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
* Includes mapping of CCMs in address space
* Annotations to move arbitrary code/data into CCM
* Moving some of the critical code/data into CCM
* Runtime detection/reporting
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ARC700 doesn't natively support unaligned access, but can be emulated
-Unaligned Access Exception
-Disassembly at the Fault address to find the exact insn (long/short)
Also per Arnd's comment, we runtime control it using 2 sysctl knobs:
* SYSCTL_ARCH_UNALIGN_ALLOW: Runtime enable/disble
* SYSCTL_ARCH_UNALIGN_NO_WARN: Warn on each emulation attempt
Originally contributed by Tim Yao <tim.yao@amlogic.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Tim Yao <tim.yao@amlogic.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
-Originally written by Rajeshwar Ranga
-Derived off of generic unwinder in 2.6.19 and adapted to ARC
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Rajeshwar Ranga <rajeshwar.ranga@gmail.com>
ARC common code to enable a SMP system + ISS provided SMP extensions.
ARC700 natively lacks SMP support, hence some of the core features are
are only enabled if SoCs have the necessary h/w pixie-dust. This
includes:
-Inter Processor Interrupts (IPI)
-Cache coherency
-load-locked/store-conditional
...
The low level exception handling would be completely broken in SMP
because we don't have hardware assisted stack switching. Thus a fair bit
of this code is repurposing the MMU_SCRATCH reg for event handler
prologues to keep them re-entrant.
Many thanks to Rajeshwar Ranga for his initial "major" contributions to
SMP Port (back in 2008), and to Noam Camus and Gilad Ben-Yossef for help
with resurrecting that in 3.2 kernel (2012).
Note that this platform code is again singleton design pattern - so
multiple SMP platforms won't build at the moment - this deficiency is
addressed in subsequent patches within this series.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rajeshwar Ranga <rajeshwar.ranga@gmail.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Gilad Ben-Yossef <gilad@benyossef.com>
There is a bit of hack/kludge right now where we disable preemption if a
L2 (High prio) IRQ is taken while L1 (Low prio) is active.
Need to revisit this
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This was part of port buildup strategy from Arnd to have a minimal kernel
at first and then add optional features (stacktracing, ptrace, smp,
kprobes, oprofile....)
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
* arc-uart platform device now populated dynamically, using
of_platform_populate() - applies to any other device whatsoever.
* uart in turn requires incore arc-intc to be also present in DT
* A irq-domain needs to be instantiated for IRQ requests by DT probed
device (e.g. arc-uart)
TODO: switch over to linear irq domain once all devs have been
transitioned to DT
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Arnd Bergmann <arnd@arndb.de>
This is minimal infrastructure needed for devicetree work.
It uses an a sample "skeleton" devicetree - embedded in kernel image -
to print the board, manufacturer by parsing the top-level "compatible"
string.
As of now we don't need any additional "board" specific "machine_desc".
TODO: support interpreting the command line as boot-loader passed dtb
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: devicetree-discuss@lists.ozlabs.org
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Rob Herring <rob.herring@calxeda.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
N.B. This is old style of hardcoding platform device specific info
in code and it's instantiation thererof using platform_add_devices().
Subsequent patches replace this with DeviceTree based runtime probe.
This patch has been retained just as an example of "don't-do-this" for
newer kernel ports.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Arnd Bergmann <arnd@arndb.de>
This includes recent changes to make handler "retry" and/or "killable"
The killable (early exit) logic is loosely based on how SH implements it
return if SIGKILL + either of VM_FAULT_OOM or VM_FAULT_RETRY
which is different from Hexagon implementation which would NOT early
exit for
SIGKILL + VM_FAULT_OOM + !VM_FAULT_RETRY
credits: Non executable stack support from Simon Spooner
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ARC700 MMU provides for tagging TLB entries with a 8-bit ASID to avoid
having to flush the TLB every task switch.
It also allows for a quick way to invalidate all the TLB entries for
task useful for:
* COW sementics during fork()
* task exit()ing
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
* ARC700 has VIPT L1 Caches
* Caches don't snoop and are not coherent
* Given the PAGE_SIZE and Cache associativity, we don't support aliasing
D$ configurations (yet), but do allow aliasing I$ configs
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Per Al Viro's "signals for dummies" https://lkml.org/lkml/2012/12/6/366
there are 3 golden rules for (not) restarting syscalls:
" What we need to guarantee is
* restarts do not happen on signals caught in interrupts or exceptions
* restarts do not happen on signals caught in sigreturn()
* restart should happen only once, even if we get through do_signal()
many times."
ARC Port already handled #1, this patch fixes#2 and #3.
We use the additional state in pt_regs->orig_r8 to ckh if restarting
has already been done once.
Thanks to Al Viro for spotting this.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
To avoid multiple syscall restarts (multiple signals) or no restart at
all (sigreturn), we need just an extra bit of state "literally 1 bit" in
struct pt_regs. orig_r8 is the best place to do this, however given the
way it is encoded currently, we can't add anything simplistically.
Current orig_r8:
* syscalls -> 1 to NR_SYSCALLS
* Exceptions -> NR_SYSCALLS + 1
* Break-point-> NR_SYSCALLS + 2
In new scheme it is a bit-field
* lower short word contains the exact event type (and a new bit to represent
restart semantics : if syscall was already / can't be restarted)
* upper short word optionally containing the syscall num - needed by
likes of tracehooks etc
This patch only changes how orig_r8 is organised and nothing should
change behaviourily.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Includes following fixes courtesy review by Al-Viro
* Tracer poke to Callee-regs were lost
Before going off into do_signal( ) we save the user-mode callee regs
(as they are not saved by default as part of pt_regs). This is to make
sure that that a Tracer (if tracing related signal) is able to do likes
of PEEKUSR(callee-reg).
However in return path we were simply discarding the user-mode callee
regs, which would break a POKEUSR(callee-reg) from a tracer.
* Issue related to multiple syscall restarts are addressed in next patch
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Acked-by: Jonas Bonn <jonas@southpole.se>
ARC700 includes 2 in-core 32bit timers TIMER0 and TIMER1.
Both have exactly same capabilies.
* programmable to count from TIMER<n>_CNT to TIMER<n>_LIMIT
* for count 0 and LIMIT ~1, provides a free-running counter by
auto-wrapping when limit is reached.
* optionally interrupt when LIMIT is reached (oneshot event semantics)
* rearming the interrupt provides periodic semantics
* run at CPU clk
ARC Linux uses TIMER0 for clockevent (periodic/oneshot) and TIMER1 for
clocksource (free-running clock).
Newer cores provide RTSC insn which gives a 64bit cpu clk snapshot hence
is more apt for clocksource when available.
SMP poses a bit of challenge for global timekeeping clocksource /
sched_clock() backend:
-TIMER1 based local clocks are out-of-sync hence can't be used
(thus we default to jiffies based cs as well as sched_clock() one/both
of which platform can override with it's specific hardware assist)
-RTSC is only allowed in SMP if it's cross-core-sync (Kconfig glue
ensures that) and thus usable for both requirements.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
This includes support for generic clone/for/vfork/execve
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Hand optimised asm code for ARC700 pipeline.
Originally written/optimized by Joern Rennecke
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Joern Rennecke <joern.rennecke@embecosm.com>
* L1_CACHE_SHIFT
* PAGE_SIZE, PAGE_OFFSET
* struct pt_regs, struct user_regs_struct
* struct thread_struct, cpu_relax(), task_pt_regs(), start_thread(), ...
* struct thread_info, THREAD_SIZE, INIT_THREAD_INFO(), TIF_*, ...
* BUG()
* ELF_*
* Elf_*
To disallow user-space visibility into some of the core kernel data-types
such as struct pt_regs, #ifdef __KERNEL__ which also makes the UAPI header
spit (further patch in the series) to NOT export it to asm/uapi/ptrace.h
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Jonas Bonn <jonas.bonn@gmail.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Override asm-generic implementations. We basically gain on 2 fronts
* checks for alignment no longer needed as we are only doing "unit"
sized copies.
(Careful observer could argue that While the kernel buffers are aligned,
the user buffer in theory might not be - however in that case the
user space is already broken when it tries to deref a hword/word
straddling word boundary - so we are not making it any worse).
* __copy_{to,from}_user( ) returns bytes that couldn't be copied,
whereas get_user() returns 0 for success or -EFAULT (not size). Thus
the code to do leftover bytes calculation can be avoided as well.
The savings were significant: ~17k of code.
bloat-o-meter vmlinux_uaccess_pre vmlinux_uaccess_post
add/remove: 0/4 grow/shrink: 8/118 up/down: 1262/-18758 (-17496)
^^^^^^^^^
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
This covers the UP / SMP (with no hardware assist for atomic r-m-w) as
well as ARC700 LLOCK/SCOND insns based.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ARC700 has an in-core intc which provides 2 priorities (a.k.a.) "levels"
of interrupts (per IRQ) hencforth referred to as L1/L2 interrupts.
CPU flags register STATUS32 has Interrupt Enable bits per level (E1/E2)
to globally enable (or disable) all IRQs at a level. Hence the
implementation of arch_local_irq_{save,restore,enable,disable}( )
The STATUS32 reg can be r/w only using the AUX Interface of ARC, hence
the use of LR/SR instructions. Further, E1/E2 bits in there can only be
updated using the FLAG insn.
The intc supports 32 interrupts - and per IRQ enabling is controlled by
a bit in the AUX_IENABLE register, hence the implmentation of
arch_{,un}mask_irq( ) routines.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Arnd in his review pointed out that arch Kconfig organisation has several
deficiencies:
* Build time entries for things which can be runtime extracted from DT
(e.g. SDRAM size, core clk frequency..)
* Not multi-platform-image-build friendly (choice .. endchoice constructs)
* cpu variants support (750/770) is exclusive.
The first 2 have been fixed in subsequent patches.
Due to the nature of the 750 and 770, it is not possible to build for
both together, w/o special runtime glue code which would hurt
performance.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>