getname() is intended to copy pathname strings from userspace into a
kernel buffer. The result is just a string in kernel space. It would
however be quite helpful to be able to attach some ancillary info to
the string.
For instance, we could attach some audit-related info to reduce the
amount of audit-related processing needed. When auditing is enabled,
we could also call getname() on the string more than once and not
need to recopy it from userspace.
This patchset converts the getname()/putname() interfaces to return
a struct instead of a string. For now, the struct just tracks the
string in kernel space and the original userland pointer for it.
Later, we'll add other information to the struct as it becomes
convenient.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pull fpu state cleanups from Ingo Molnar:
"This tree streamlines further aspects of FPU handling by eliminating
the prepare_to_copy() complication and moving that logic to
arch_dup_task_struct().
It also fixes the FPU dumps in threaded core dumps, removes and old
(and now invalid) assumption plus micro-optimizes the exit path by
avoiding an FPU save for dead tasks."
Fixed up trivial add-add conflict in arch/sh/kernel/process.c that came
in because we now do the FPU handling in arch_dup_task_struct() rather
than the legacy (and now gone) prepare_to_copy().
* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, fpu: drop the fpu state during thread exit
x86, xsave: remove thread_has_fpu() bug check in __sanitize_i387_state()
coredump: ensure the fpu state is flushed for proper multi-threaded core dump
fork: move the real prepare_to_copy() users to arch_dup_task_struct()
Historical prepare_to_copy() is mostly a no-op, duplicated for majority of
the architectures and the rest following the x86 model of flushing the extended
register state like fpu there.
Remove it and use the arch_dup_task_struct() instead.
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1336692811-30576-1-git-send-email-suresh.b.siddha@intel.com
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Chris Zankel <chris@zankel.net>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
This implements basic -fstack-protector support, based on the early ARM
version in c743f38013. The SMP case is
limited to the initial canary value, while the UP case handles per-task
granularity (limited to 32-bit sh until a new enough sh64 compiler
manifests itself).
Signed-off-by: Filippo Arcidiacono <filippo.arcidiacono@st.com>
Reviewed-by: Carmelo Amoroso <carmelo.amoroso@st.com>
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The address limit is already set in flush_old_exec() so those calls to
set_fs(USER_DS) are redundant.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Trivial build fix for certain configurations that don't grab
linux/prefetch.h via alternate means (specifically SH-2 and SH-3 parts).
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Make do_execve() take a const filename pointer so that kernel_execve() compiles
correctly on ARM:
arch/arm/kernel/sys_arm.c:88: warning: passing argument 1 of 'do_execve' discards qualifiers from pointer target type
This also requires the argv and envp arguments to be consted twice, once for
the pointer array and once for the strings the array points to. This is
because do_execve() passes a pointer to the filename (now const) to
copy_strings_kernel(). A simpler alternative would be to cast the filename
pointer in do_execve() when it's passed to copy_strings_kernel().
do_execve() may not change any of the strings it is passed as part of the argv
or envp lists as they are some of them in .rodata, so marking these strings as
const should be fine.
Further kernel_execve() and sys_execve() need to be changed to match.
This has been test built on x86_64, frv, arm and mips.
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
The old ctrl in/out routines are non-portable and unsuitable for
cross-platform use. While drivers/sh has already been sanitized, there
is still quite a lot of code that is not. This converts the arch/sh/ bits
over, which permits us to flag the routines as deprecated whilst still
building with -Werror for the architecture code, and to ensure that
future users are not added.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This provides a machine_ops-based reboot interface loosely cloned from
x86, and converts the native sh32 and sh64 cases over to it.
Necessary both for tying in SMP support and also enabling platforms like
SDK7786 to add support for their microcontroller-based power managers.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This follows the x86 xstate changes and implements a task_xstate slab
cache that is dynamically sized to match one of hard FP/soft FP/FPU-less.
This also tidies up and consolidates some of the SH-2A/SH-4 FPU
fragmentation. Now fpu state restorers are commonly defined, with the
init_fpu()/fpu_init() mess reworked to follow the x86 convention.
The fpu_init() register initialization has been replaced by xstate setup
followed by writing out to hardware via the standard restore path.
As init_fpu() now performs a slab allocation a secondary lighterweight
restorer is also introduced for the context switch.
In the future the DSP state will be rolled in here, too.
More work remains for math emulation and the SH-5 FPU, which presently
uses its own special (UP-only) interfaces.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
start_thread() will become a bit heavier with the xstate freeing to be
added in, so move it out-of-line in preparation.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Conflict between FPU thread flag migration and debug
thread flag addition.
Conflicts:
arch/sh/include/asm/thread_info.h
arch/sh/include/asm/ubc.h
arch/sh/kernel/process_32.c
This adds preliminary support for the SH-4A UBC to the hw-breakpoints API.
Presently only a single channel is implemented, and the ptrace interface
still needs to be converted. This is the first step to cleaning up the
long-standing UBC mess, making the UBC more generally accessible, and
finally making it SMP safe.
An additional abstraction will be layered on top of this as with the perf
events code to permit the various CPU families to wire up support for
their own specific UBCs, as many variations exist.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
A number of small optimisations to FPU handling, in particular:
- move the task USEDFPU flag from the thread_info flags field (which
is accessed asynchronously to the thread) to a new status field,
which is only accessed by the thread itself. This allows locking to
be removed in most cases, or can be reduced to a preempt_lock().
This mimics the i386 behaviour.
- move the modification of regs->sr and thread_info->status flags out
of save_fpu() to __unlazy_fpu(). This gives the compiler a better
chance to optimise things, as well as making save_fpu() symmetrical
with restore_fpu() and init_fpu().
- implement prepare_to_copy(), so that when creating a thread, we can
unlazy the FPU prior to copying the thread data structures.
Also make sure that the FPU is disabled while in the kernel, in
particular while booting, and for newly created kernel threads,
In a very artificial benchmark, the execution time for 2500000
context switches was reduced from 50 to 45 seconds.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
sh port of the sLeAZY-fpu feature currently implemented for some architectures
such us i386.
Right now the SH kernel has a 100% lazy fpu behaviour.
This is of course great for applications that have very sporadic or no FPU use.
However for very frequent FPU users... you take an extra trap every context
switch.
The patch below adds a simple heuristic to this code: after 5 consecutive
context switches of FPU use, the lazy behavior is disabled and the context
gets restored every context switch.
After 256 switches, this is reset and the 100% lazy behavior is returned.
Tests with LMbench showed no regression.
I saw a little improvement due to the prefetching (~2%).
The tests below also show that, with this sLeazy patch, indeed,
the number of FPU exceptions is reduced.
To test this. I hacked the lat_ctx LMBench to use the FPU a little more.
sLeasy implementation
===========================================
switch_to calls | 79326
sleasy calls | 42577
do_fpu_state_restore calls| 59232
restore_fpu calls | 59032
Exceptions: 0x800 (FPU disabled ): 16604
100% Leazy (default implementation)
===========================================
switch_to calls | 79690
do_fpu_state_restore calls | 53299
restore_fpu calls | 53101
Exceptions: 0x800 (FPU disabled ): 53273
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Many of these symbols went away completely, or we just never cared about
them in the first place. Trim the exports down to the essential set.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patches will trigger a reboot using the watchdog
timer instead of double fault. Unlike the previous
method, this one actually works in 32 bit mode.
Reset should also be cleaner.
Signed-off-by: Jon Frosdick <jon.frosdick@st.com>
Signed-off-by: Carl Shaw <carl.shaw@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Annotate __switch_to() so that the function graph tracer does not try to
trace it. Use __notrace_funcgraph, as opposed to notrace, so that other
tracers can continue to trace __switch_to().
The reason that we don't want to trace __switch_to() with the function
graph tracer is because of how the return address stack in task_struct
is implemented. When we enter __switch_to we store the real return
address on prev's ret_stack. When we return from __switch_to() we've
patched the return address on the kernel stack to be
return_to_handler. Calling return_to_handler we do,
-> ftrace_return_to_handler()
-> ftrace_pop_return_ftrace()
Which tries to pop the real return address from current->ret_stack. The
problem being that we stored the return address on prev->ret_stack, but
current now points to next, and next->ret_stack doesn't contain the
correct return address (and is possibly even empty).
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (56 commits)
sh: Fix declaration of __kernel_sigreturn and __kernel_rt_sigreturn
sh: Enable soc-camera in ap325rxa/migor/se7724 defconfigs.
sh: remove stray markers.
sh: defconfig updates.
sh: pci: Initial PCI-Express support for SH7786 Urquell board.
sh: Generic HAVE_PERF_COUNTER support.
SH: convert migor to soc-camera as platform-device
SH: convert ap325rxa to soc-camera as platform-device
soc-camera: unify i2c camera device platform data
sh: add platform data for r8a66597-hcd in setup-sh7723
sh: add platform data for r8a66597-hcd in setup-sh7366
sh: x3proto: add platform data for r8a66597-hcd
sh: highlander: add platform data for r8a66597-hcd
sh: sh7785lcr: add platform data for r8a66597-hcd
sh: turn off irqs when disabling CMT/TMU timers
sh: use kzalloc() for cpg clocks
sh: unbreak WARN_ON()
sh: Use generic atomic64_t implementation.
sh: Revised clock function in highlander
sh: Update r7780mp defconfig
...
avr32, mn10300, parisc, s390, sh, xtensa:
They never set PT_DTRACE, but clear it after do_execve().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Chris Zankel <chris@zankel.net>
Acked-by: Roland McGrath <roland@redhat.com>
Acked-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
arch/sh has a couple of stray markers without any users introduced
in commit 3d58695edb. Remove them in
preparation of removing the markers in favour of the TRACE_EVENT
macro (and also because we don't keep dead code around).
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (23 commits)
sh: sh7785lcr: Map whole PCI address space.
sh: Fix up DSP context save/restore.
sh: Fix up number of on-chip DMA channels on SH7091.
sh: update defconfigs.
sh: Kill off broken direct-mapped cache mode.
sh: Wire up ARCH_HAS_DEFAULT_IDLE for cpuidle.
sh: Add a command line option for disabling I/O trapping.
sh: Select ARCH_HIBERNATION_POSSIBLE.
sh: migor: Fix up CEU use flags.
input: migor_ts: add wakeup support
rtc: rtc-sh: use set_irq_wake()
input: sh_keysc: use enable/disable_irq_wake()
sh: intc: set_irq_wake() support
sh: intc: install enable, disable and shutdown callbacks
clocksource: sh_cmt: use remove_irq() and remove clockevent workaround
sh: ap325 and Migo-R use new sh_mobile_ceu_info flags
sh: Fix up -Wformat-security whining.
sh: ap325rxa: Add ov772x support, again.
sh: Sanitize asm/mmu.h for assembly use.
sh: Tidy up sh7786 pinmux table.
...
There were a number of issues with the DSP context save/restore code,
mostly left-over relics from when it was introduced on SH3-DSP with
little follow-up testing, resulting in things like task_pt_dspregs()
referencing incorrect state on the stack.
This follows the MIPS convention of tracking the DSP state in the
thread_struct and handling the state save/restore in switch_to() and
finish_arch_switch() respectively. The regset interface is also updated,
which allows us to finally be rid of task_pt_dspregs() and the special
cased task_pt_regs().
Signed-off-by: Michael Trimarchi <michael@evidence.eu.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This can use the same implementation as sh64, the generated assembly is
the same between the new and old version, so there is not much point in
leaving it open coded in inline assembly.
This is preparatory work for future consolidation of the _32/_64
variants.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Description snipped from Steven Rostedt's PPC patch:
When idle is called, interrupts are blocked, but the idle
function will still wake up on an interrupt. The problem is
that the interrupt disabled latency tracer will take this call
to idle as a latency.
This patch disables the latency tracing when going into idle.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This implements a simple show_code() that is in turn plugged in to
show_regs() to provide minimal code dumping at the end of the trace.
Built on top of a simple instruction disassembler derived from the
binutils opcode table.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This implements a few trace points across events that are deemed
interesting. This implements a number of trace points:
- The page fault handler / TLB miss
- IPC calls
- Kernel thread creation
The original LTTng patch had the slow-path instrumented, which
fails to account for the vast majority of events. In general
placing this in the fast-path is not a huge performance hit, as
we don't take page faults for kernel addresses.
The other bits of interest are some of the other trap handlers, as
well as the syscall entry/exit (which is better off being handled
through the tracehook API). Most of the other trap handlers are corner
cases where alternate means of notification exist, so there is little
value in placing extra trace points in these locations.
Based on top of the points provided both by the LTTng instrumentation
patch as well as the patch shipping in the ST-Linux tree, albeit in a
stripped down form.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch contains the following cleanups:
- make the following needlessly global code static:
- cf-enabler.c: cf_init()
- cpu/clock.c: __clk_enable()
- cpu/clock.c: __clk_disable()
- process_32.c: default_idle()
- time_32.c: struct clocksource_sh
- timers/timer-tmu.c: struct tmu_timer_ops
- remove the following unused functions (no CONFIG_BLK_DEV_FD on sh):
- process_{32,64}.c: disable_hlt()
- process_{32,64}.c: enable_hlt()
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Jack Ren and Eric Miao tracked down the following long standing
problem in the NOHZ code:
scheduler switch to idle task
enable interrupts
Window starts here
----> interrupt happens (does not set NEED_RESCHED)
irq_exit() stops the tick
----> interrupt happens (does set NEED_RESCHED)
return from schedule()
cpu_idle(): preempt_disable();
Window ends here
The interrupts can happen at any point inside the race window. The
first interrupt stops the tick, the second one causes the scheduler to
rerun and switch away from idle again and we end up with the tick
disabled.
The fact that it needs two interrupts where the first one does not set
NEED_RESCHED and the second one does made the bug obscure and extremly
hard to reproduce and analyse. Kudos to Jack and Eric.
Solution: Limit the NOHZ functionality to the idle loop to make sure
that we can not run into such a situation ever again.
cpu_idle()
{
preempt_disable();
while(1) {
tick_nohz_stop_sched_tick(1); <- tell NOHZ code that we
are in the idle loop
while (!need_resched())
halt();
tick_nohz_restart_sched_tick(); <- disables NOHZ mode
preempt_enable_no_resched();
schedule();
preempt_disable();
}
}
In hindsight we should have done this forever, but ...
/me grabs a large brown paperbag.
Debugged-by: Jack Ren <jack.ren@marvell.com>,
Debugged-by: eric miao <eric.y.miao@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Presently with preempt enabled there's the possibility to be preempted
after the TIF_USEDFPU test and the register save, leading to bogus
state post-__switch_to(). Use an explicit preempt_disable()/enable()
pair around unlazy_fpu()/clear_fpu() to avoid this. Follows the x86
change.
Reported-by: Takuo Koguchi <takuo.koguchi.sw@hitachi.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This implements kernel-level atomic rollback built on top of gUSA,
as an alternative non-IRQ based atomicity method. This is generally
a faster method for platforms that are lacking the LL/SC pairs that
SH-4A and later use, and is only supportable on legacy cores.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>