Bye-bye Performance Counters, welcome Performance Events!
In the past few months the perfcounters subsystem has grown out its
initial role of counting hardware events, and has become (and is
becoming) a much broader generic event enumeration, reporting, logging,
monitoring, analysis facility.
Naming its core object 'perf_counter' and naming the subsystem
'perfcounters' has become more and more of a misnomer. With pending
code like hw-breakpoints support the 'counter' name is less and
less appropriate.
All in one, we've decided to rename the subsystem to 'performance
events' and to propagate this rename through all fields, variables
and API names. (in an ABI compatible fashion)
The word 'event' is also a bit shorter than 'counter' - which makes
it slightly more convenient to write/handle as well.
Thanks goes to Stephane Eranian who first observed this misnomer and
suggested a rename.
User-space tooling and ABI compatibility is not affected - this patch
should be function-invariant. (Also, defconfigs were not touched to
keep the size down.)
This patch has been generated via the following script:
FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
sed -i \
-e 's/PERF_EVENT_/PERF_RECORD_/g' \
-e 's/PERF_COUNTER/PERF_EVENT/g' \
-e 's/perf_counter/perf_event/g' \
-e 's/nb_counters/nb_events/g' \
-e 's/swcounter/swevent/g' \
-e 's/tpcounter_event/tp_event/g' \
$FILES
for N in $(find . -name perf_counter.[ch]); do
M=$(echo $N | sed 's/perf_counter/perf_event/g')
mv $N $M
done
FILES=$(find . -name perf_event.*)
sed -i \
-e 's/COUNTER_MASK/REG_MASK/g' \
-e 's/COUNTER/EVENT/g' \
-e 's/\<event\>/event_id/g' \
-e 's/counter/event/g' \
-e 's/Counter/Event/g' \
$FILES
... to keep it as correct as possible. This script can also be
used by anyone who has pending perfcounters patches - it converts
a Linux kernel tree over to the new naming. We tried to time this
change to the point in time where the amount of pending patches
is the smallest: the end of the merge window.
Namespace clashes were fixed up in a preparatory patch - and some
stylistic fallout will be fixed up in a subsequent patch.
( NOTE: 'counters' are still the proper terminology when we deal
with hardware registers - and these sed scripts are a bit
over-eager in renaming them. I've undone some of that, but
in case there's something left where 'counter' would be
better than 'event' we can undo that on an individual basis
instead of touching an otherwise nicely automated patch. )
Suggested-by: Stephane Eranian <eranian@google.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <linux-arch@vger.kernel.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (262 commits)
sh: mach-ecovec24: Add user debug switch support
sh: Kill off unused se_skipped in alignment trap notification code.
sh: Wire up HAVE_SYSCALL_TRACEPOINTS.
video: sh_mobile_lcdcfb: use both register sets for display panning
video: sh_mobile_lcdcfb: implement display panning
sh: Fix up sh7705 flush_dcache_page() build.
sh: kfr2r09: document the PLL/FLL <-> RF relationship.
sh: mach-ecovec24: need asm/clock.h.
sh: mach-ecovec24: deassert usb irq on boot.
sh: Add KEYSC support for EcoVec24
sh: add kycr2_delay for sh_keysc
sh: cpufreq: Include CPU id in info messages.
sh: multi-evt support for SH-X3 proto CPU.
sh: clkfwk: remove bogus set_bus_parent() from SH7709.
sh: Fix the indication point of the liquid crystal of AP-325RXA(AP3300)
sh: Add EcoVec24 romImage defconfig
sh: USB disable process is needed if romImage boot for EcoVec24
sh: EcoVec24: add HIZA setting for LED
sh: EcoVec24: write MAC address in boot
sh: Add romImage support for EcoVec24
...
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (37 commits)
sched: Fix SD_POWERSAVING_BALANCE|SD_PREFER_LOCAL vs SD_WAKE_AFFINE
sched: Stop buddies from hogging the system
sched: Add new wakeup preemption mode: WAKEUP_RUNNING
sched: Fix TASK_WAKING & loadaverage breakage
sched: Disable wakeup balancing
sched: Rename flags to wake_flags
sched: Clean up the load_idx selection in select_task_rq_fair
sched: Optimize cgroup vs wakeup a bit
sched: x86: Name old_perf in a unique way
sched: Implement a gentler fair-sleepers feature
sched: Add SD_PREFER_LOCAL
sched: Add a few SYNC hint knobs to play with
sched: Fix sync wakeups again
sched: Add WF_FORK
sched: Rename sync arguments
sched: Rename select_task_rq() argument
sched: Feature to disable APERF/MPERF cpu_power
x86: sched: Provide arch implementations using aperf/mperf
x86: Add generic aperf/mperf code
x86: Move APERF/MPERF into a X86_FEATURE
...
Fix up trivial conflict in arch/x86/include/asm/processor.h due to
nearby addition of amd_get_nb_id() declaration from the EDAC merge.
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (75 commits)
PCI hotplug: clean up acpi_run_hpp()
PCI hotplug: acpiphp: use generic pci_configure_slot()
PCI hotplug: shpchp: use generic pci_configure_slot()
PCI hotplug: pciehp: use generic pci_configure_slot()
PCI hotplug: add pci_configure_slot()
PCI hotplug: clean up acpi_get_hp_params_from_firmware() interface
PCI hotplug: acpiphp: don't cache hotplug_params in acpiphp_bridge
PCI hotplug: acpiphp: remove superfluous _HPP/_HPX evaluation
PCI: Clear saved_state after the state has been restored
PCI PM: Return error codes from pci_pm_resume()
PCI: use dev_printk in quirk messages
PCI / PCIe portdrv: Fix pcie_portdrv_slot_reset()
PCI Hotplug: convert acpi_pci_detect_ejectable() to take an acpi_handle
PCI Hotplug: acpiphp: find bridges the easy way
PCI: pcie portdrv: remove unused variable
PCI / ACPI PM: Propagate wake-up enable for devices w/o ACPI support
ACPI PM: Replace wakeup.prepared with reference counter
PCI PM: Introduce device flag wakeup_prepared
PCI / ACPI PM: Rework some debug messages
PCI PM: Simplify PCI wake-up code
...
Fixed up conflict in arch/powerpc/kernel/pci_64.c due to OF device tree
scanning having been moved and merged for the 32- and 64-bit cases. The
'needs_freset' initialization added in 6e19314cc ("PCI/powerpc: support
PCIe fundamental reset") is now in arch/powerpc/kernel/pci_of_scan.c.
Sysbench thinks SD_BALANCE_WAKE is too agressive and kbuild doesn't
really mind too much, SD_BALANCE_NEWIDLE picks up most of the
slack.
On a dual socket, quad core, dual thread nehalem system:
sysbench (--num_threads=16):
SD_BALANCE_WAKE-: 13982 tx/s
SD_BALANCE_WAKE+: 15688 tx/s
kbuild (-j16):
SD_BALANCE_WAKE-: 47.648295846 seconds time elapsed ( +- 0.312% )
SD_BALANCE_WAKE+: 47.608607360 seconds time elapsed ( +- 0.026% )
(same within noise)
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This is necessary to get ftrace syscall tracing working again.. a fairly
trivial and mechanical change. The one benefit is that this can also be
enabled on sh64, despite not having its own ftrace port.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
If we're looking to place a new task, we might as well find the
idlest position _now_, not 1 tick ago.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Make the idle balancer more agressive, to improve a
x264 encoding workload provided by Jason Garrett-Glaser:
NEXT_BUDDY NO_LB_BIAS
encoded 600 frames, 252.82 fps, 22096.60 kb/s
encoded 600 frames, 250.69 fps, 22096.60 kb/s
encoded 600 frames, 245.76 fps, 22096.60 kb/s
NO_NEXT_BUDDY LB_BIAS
encoded 600 frames, 344.44 fps, 22096.60 kb/s
encoded 600 frames, 346.66 fps, 22096.60 kb/s
encoded 600 frames, 352.59 fps, 22096.60 kb/s
NO_NEXT_BUDDY NO_LB_BIAS
encoded 600 frames, 425.75 fps, 22096.60 kb/s
encoded 600 frames, 425.45 fps, 22096.60 kb/s
encoded 600 frames, 422.49 fps, 22096.60 kb/s
Peter pointed out that this is better done via newidle_idx,
not via LB_BIAS, newidle balancing should look for where
there is load _now_, not where there was load 2 ticks ago.
Worst-case latencies are improved as well as no buddies
means less vruntime spread. (as per prior lkml discussions)
This change improves kbuild-peak parallelism as well.
Reported-by: Jason Garrett-Glaser <darkshikari@gmail.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1253011667.9128.16.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When merging select_task_rq_fair() and sched_balance_self() we lost
the use of wake_idx, restore that and set them to 0 to make wake
balancing more aggressive.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The problem with wake_idle() is that is doesn't respect things like
cpu_power, which means it doesn't deal well with SMT nor the recent
RT interaction.
To cure this, it needs to do what sched_balance_self() does, which
leads to the possibility of merging select_task_rq_fair() and
sched_balance_self().
Modify sched_balance_self() to:
- update_shares() when walking up the domain tree,
(it only called it for the top domain, but it should
have done this anyway), which allows us to remove
this ugly bit from try_to_wake_up().
- do wake_affine() on the smallest domain that contains
both this (the waking) and the prev (the wakee) cpu for
WAKE invocations.
Then use the top-down balance steps it had to replace wake_idle().
This leads to the dissapearance of SD_WAKE_BALANCE and
SD_WAKE_IDLE_FAR, with SD_WAKE_IDLE replaced with SD_BALANCE_WAKE.
SD_WAKE_AFFINE needs SD_BALANCE_WAKE to be effective.
Touch all topology bits to replace the old with new SD flags --
platforms might need re-tuning, enabling SD_BALANCE_WAKE
conditionally on a NUMA distance seems like a good additional
feature, magny-core and small nehalem systems would want this
enabled, systems with slow interconnects would not.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Early clock initialization sets up PLL/FLL values for optimal RF
behaviour. As this relationship is presently undocumented, we document
this in the script so the rationale is apparent.
Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
After KYCR2 is set, udelay might become necessary if there are only a
small number of keys attached. This patch introduces an optional delay
through the platform data to address this problem.
Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
when you use romImage on EcoVec24, 1st Linux will enable USB device.
But no-one disable it.
So re-started Linux will get interrupt before USB driver is attached.
This patch disable USB device at first
Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
romimage macros which are used in kfr2r09 is very useful for other board.
This patch divides kfr2r09's romimage.h into
romimage-macros and partner-jet-setup.
Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This was #define'd as 0 on all platforms, so let's get rid of it.
This change makes pci_scan_slot() slightly easier to read.
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Tony Luck <tony.luck@intel.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Acked-by: Russell King <linux@arm.linux.org.uk>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In the SMP VIPT case the page copy/clear ops still perform colouring,
care needs to be taken that CPUs don't end up stepping on each other,
so we give them a bit of room to work with.
At the same time, we reduce the worst-case colouring given that these
pages are always consumed.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This supported all DMA channels, and it was tested in SH7722,
SH7780, SH7785 and SH7763.
This can not use with SH DMA API.
Signed-off-by: Nobuhiro Iwamatsu <iwamatsu.nobuhiro@renesas.com>
Reviewed-by: Matt Fleming <matt@console-pimps.org>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This fixes up the kmap_coherent/kunmap_coherent() interface for recent
changes both in the page fault path and the shared cache flushers, as
well as adding in some optimizations.
One of the key things to note here is that the TLB flush itself is
deferred until the unmap, and the call in to update_mmu_cache() itself
goes away, relying on the regular page fault path to handle the lazy
dcache writeback if necessary.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The kgdb stub has traditionally tied in to the NMI slot, and manually
handled debounce. Now that we have a generic way to do this instead, all
of the stub-specific debounce silliness can be killed off.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This implements support for NMI debugging that was shamelessly copied
from the avr32 port. A bit of special magic is needed in the interrupt
exception path given that the NMI exception handler is stubbed in to the
regular exception handling table despite being reported in INTEVT. So we
mangle the lookup and kick off an EXPEVT-style exception dispatch from
the INTEVT path for exceptions that do_IRQ() has no chance of handling.
As a result, we also drop the evt2irq() conversion from the do_IRQ() path
and just do it in assembly.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Some unaligned accesses are completely expected. For example, the
trapped_io code uses the unaligned access fixup code path so there's no
need to warn about having to fixup the unaligned access.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The method of ETHER_LINK pin is board dependence.
This patch adding paramters are:
- no_ether_link : If set to 1, do not use ETHER_LINK
- ether_link_active_low : If set to 1, ETHER_LINK is active low.
Signed-off-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is supposed to be the equivalent of __NR_syscalls, not
__NR_syscalls -1. The x86 code this was based on had simply fallen
out of sync at the time this was implemented. Fix it up now.
As a result, tracing of __NR_perf_counter_open works as advertised.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
sys_cacheflush should return with EINVAL if the cache parameter is not
one of ICACHE, DCACHE or BCACHE.
So, we need to include 0 in the first check.
It also adds the three definitions above as wrapper of the existent macros.
PS: ltp cacheflush01 test now passes.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Adds a system call to allow user code to flush code from the cache.
You can use instructions for the data side, but the iside can
only be done by a flush ROM which really only works with a direct
mapped cache. The later SH4's have 2 way Iside, so this call allows
a portable way to flush the cache.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Reading from the ROM is not a good idea as it could disturb some
flash operation that it is in progress.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The SH instruction set has several instructions which accept an 8 bit
immediate operand. For logical instructions this operand is zero extended,
for arithmetic instructions the operand is sign extended. After adding an
option to the assembler to check this, it was found that several pieces
of assembly code were assuming this behaviour, and in one case
getting it wrong.
So this patch explicitly sign extends any immediate operands, which makes
it obvious what is happening, and fixes the one case which got it wrong.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The synopsys PCI cell used in the later STMicro chips requires code to
be run in order to do IO cycles, rather than just memory mapping the IO
space. Rather than extending the existing SH infrastructure to allow
this, use the GENERIC_IOMAP implmentation to save re-inventing the
wheel.
This set of changes allows the SH to be built with GENERIC_IOMAP
enabled, it just ifdef's out the functions provided by the GENERIC_IOMAP
implementation, and provides a few required missing functions.
Signed-off-by: David McKay <david.mckay@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch is V3 of the SuperH Mobile Runtime PM platform bus
implentation matching Rafael's Runtime PM v16.
The code gets invoked from the SuperH specific Runtime PM
platform bus functions that override the weak symbols for:
- platform_pm_runtime_suspend()
- platform_pm_runtime_resume()
- platform_pm_runtime_idle()
This Runtime PM implementation performs two levels of power
management. At the time of platform bus runtime suspend the
clock to the device is stopped instantly. Later on if all
devices within the power domain has their clocks stopped
then the device driver ->runtime_suspend() callbacks are
used to save hardware register state for each device.
Device driver ->runtime_suspend() calls are scheduled from
cpuidle context using platform_pm_runtime_suspend_idle().
When all devices have been fully suspended the processor
is allowed to enter deep sleep from cpuidle.
The runtime resume operation turns on clocks and also
restores registers if needed. It is worth noting that the
devices start in a suspended state and the device driver
is responsible for calling runtime resume before accessing
the actual hardware.
In this particular platform bus implementation runtime
resume is not allowed from interrupt context. Runtime
suspend is however allowed from interrupt context as
long as the synchronous functions are avoided.
[ updated for v17 -- PFM. ]
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
sh64 does not yet support GENERIC_BUG, but still wants unwinder support.
Alias UNWINDER_BUG and UNWINDER_BUG_ON to their BUG counterparts until
the conversion to GENERIC_BUG is completed.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This simplifies the unwinder trap handling, dropping the use of the
special trapa vector and simply piggybacking on top of the BUG support. A
new BUGFLAG_UNWINDER is added for flagging the unwinder fault, before
continuing on with regular BUG dispatch.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Allow a DWARF register to have an undefined value. When applied to the
DWARF return address register this lets lets us label a function as
having no direct caller, e.g. kernel_thread_helper().
Signed-off-by: Matt Fleming <matt@console-pimps.org>
We can't assume that if we execute the unwinder code and the unwinder
was already running that it has faulted. Clearly two kernel threads can
invoke the unwinder at the same time and may be running simultaneously.
The previous approach used BUG() and BUG_ON() in the unwinder code to
detect whether the unwinder was incapable of unwinding the stack, and
that the next available unwinder should be used instead. A better
approach is to explicitly invoke a trap handler to switch unwinders when
the current unwinder cannot continue.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
The handling of DW_CFA_val_offset ops was incorrectly using the
DWARF_REG_OFFSET flag but the register's value cannot be calculated
using the DWARF_REG_OFFSET method. Create a new flag to indicate that a
different method must be used to calculate the register's value even
though there is no implementation for DWARF_VAL_OFFSET yet; it's mainly
just a place holder.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Plug a memory leak in dwarf_unwinder_dump() where we didn't free the
memory that we had previously allocated for the DWARF frames and DWARF
registers.
Now is also a opportune time to implement our own mempool and kmem
cache. It's a good idea to have a certain number of frame and register
objects in reserve at all times, so that we are guaranteed to have our
allocation satisfied even when memory is scarce. Since we have pools to
allocate from we can implement the registers for each frame as a linked
list as opposed to a sparsely populated array. Whilst it's true that the
lookup time for a linked list is larger than for arrays, there's only
usually a maximum of 8 registers per frame. So the overhead isn't that
much of a concern.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
This does a bit of rework for making the cache flushers SMP-aware. The
function pointer-based flushers are renamed to local variants with the
exported interface being commonly implemented and wrapping as necessary.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
All CPU-specific overloads are done at runtime now, so this common header
can go away and simply be folded back in to asm/ version.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Add a P1 jump to the the kfr2r09 romimage code. With this
patch applied the initial zImage assembly code will run
with instruction cache enabled.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Add instruction cache and TLB invalidation code for the
the kfr2r09 romimage target.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
When get_mmu_context() runs out of new ASIDs it flushes the TLB and
wraps around. Despite the fact the ASIDs are tracked per-CPU, a global
TLB flush was being used. Switch this over to a local one, as matches
the intent.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
As an excellent indicator of how much testing the DSP code gets, a couple
of rather glaring bugs in the DSP save/restore paths were found:
- In the DSP restore case a0 needs to be popped off before a0g,
or the value of a0g is clobbered by the MSB of a0 in the case
of sign extension.
- Beyond that, the save and restore orders were out of sync,
so this fixes that up as well. At the same time, we switch over
to using movs.l for both the save and restore of the general DSP
registers as opposed to using sts.l (which was initially put in
place to work around a bug in ancient binutils versions which
the kernel no longer supports).
Reported-by: Chee Soon Yip <yip.cheesoon@renesas.com>
Cc: Chu Lih Kwek <kwek.chulih@renesas.com>,
Cc: General Lai <general.lai@renesas.com>,
Cc: Robert Cozens <Robert.Cozens@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
While most platforms implement LED banks in sets of 8/16/32, some use
different configurations. This adds a LED mask to the heartbeat platform
data to allow platforms to constrain the bitmap, which is otherwise
derived from the register size.
Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
FLLFRQ setting is needed to use correct PLL clock for kfr2409.
Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This moves the initialization over to an early_initcall(). This fixes up
some lockdep interaction issues. At the same time, kill off some
superfluous locking in the init path.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Also, remove the "fix" to DW_CFA_def_cfa_register where we reset the
frame's cfa_offset to 0. This action is incorrect when handling
DW_CFA_def_cfa_register as the DWARF spec specifically states that the
previous contents of cfa_offset should be used with the new
register. The reason that I thought cfa_offset should be reset to 0 was
because it was being assigned a bogus value prior to executing the
DW_CFA_def_cfa_register op. It turns out that the bogus cfa_offset value
came from interpreting .cfi_escape pseudo-ops (those used by the GNU
extensions) as CFA_DW_def_cfa ops.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
The caches enabled case needs more work, but is presently broken
regardless, so this can be done incrementally.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This paves the way for allowing individual CPUs to overload the
individual flushing routines that they care about without having to
depend on weak aliases. SH-4 is converted over initially, as it wires
up pretty much everything. The majority of the other CPUs will simply use
the default no-op implementation with their own region flushers wired up.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
We use flush_cache_page() outright in copy_to_user_page(), and nothing
else needs it, so just kill it off. SH-5 still defines its own version,
but that too will go away in the same fashion once it converts over.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
All of the flush_dcache_mmap_lock()/flush_dcache_mmap_unlock()
definitions are identical across all CPUs, so just provide them
generically in asm/cacheflush.h.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
flush_dcache_all() is used internally by the SH-4 cache code, it is not
part of the exported cache API, so make it static and don't export it.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This provides a central point for CPU cache initialization routines.
This replaces the antiquated p3_cache_init() method, which the vast
majority of CPUs never cared about.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds a family member to struct sh_cpuinfo, which allows us to fall
back more on the probe routines to work out what sort of subtype we are
running on. This will be used by the CPU cache initialization code in
order to first do family-level initialization, followed by subtype-level
optimizations.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This does a bit of reorganizing for allowing nommu to use the new
and generic cache.c, no functional changes.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This builds in the newly created cache.c (renamed from pg-mmu.c) for both
MMU and NOMMU configurations. The kmap_coherent() stubs and alias
information recorded by each CPU family takes care of doing the right
thing while enabling the code to be commonly shared.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This plugs in kmap_coherent() for the non-SH4 cases to permit the
pg-mmu.c bits to be used generically across all CPUs. SH-5 is still in
the TODO state, but will move over to fixmap and the generic interface
gradually.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This plugs in some register alignment helpers for the shared flushers,
allowing them to also be used on SH-5. The main rationale here is that
in the SH-5 case we have a variable ABI, where the pointer size may not
equal the register width. This register extension is taken care of by
the SH-5 code already today, and is otherwise unused on the SH-4 code.
This combines the two and allows us to kill off the SH-5 implementation.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Trying to figure out the best value for DWARF_ARCH_UNWIND_OFFSET is
tricky at best. Various things can change the size (and offset from the
beginning of the function) of the prologue. Notably, turning on ftrace
adds calls to mcount at the beginning of functions, thereby pushing the
prologue further into the function.
So replace DWARF_ARCH_UNWIND_OFFSET with some code that continues to
execute CFA instructions until the value of return address register is
defined. This is safe to do because we know that the return address must
have been pushed onto the frame before our first function call; we just
can't figure out where at compile-time.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
In order to use DWARF unwinder info the frame register has to contain a
valid value. Whilst GCC takes care of this for C code, we have to do it
ourselves for assembly.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This is a first cut at a generic DWARF unwinder for the kernel. It's
still lacking DWARF64 support and the DWARF expression support hasn't
been tested very well but it is generating proper stacktraces on SH for
WARN_ON() and NULL dereferences.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Provide an interface for registering stack unwinders, where each
unwinder is given a rating that describes its accuracy and
complexity. The more accurate an unwinder is, the more complex it is.
If a the current stack unwinder faults, then the stack unwinder with the
next highest accuracy will be used in its place (provided one is
available). For example, this allows unwinders, such as the DWARF
unwinder, to liberally sprinkle BUG()s to catch badly formed DWARF debug
info.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Copy the stacktrace ops code from x86 and provide a central function for
use by functions that need to dump a callstack.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch moves the Migo-R specific header file from
mach-common/ into mach-migor/ and removes unused cruft.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch moves all the romImage related header files into
the mach/ directory.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch adds support for the WQVGA LCD display used by
the KFR2R09 board. The LCD module is a TX07D34VM0AAA made
by Hitachi, and this module is made up by a R61517 chip
together with a 240x400 pixel display. The screen is
attached to the SuperH Mobile LCDC using a 18-bit SYS bus.
The register settings used by the SYS panel setup code are
based on an out-of-tree driver which apart from duplicating
all LCDC driver code and writing to non-existing hardware
registers also never was posted for upstream merge.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This consolidates all of the NEFF-based sign extension for SH-5.
In the future the other SH code will need to make use of this as well,
so make it generic in preparation for more 32/64 consolidation.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This provides a __flush_anon_page() that handles both the aliasing and
non-aliasing cases. This fixes up some crashes with heavy
get_user_pages() users.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
These patches extend struct platform device data for a bunch of
SuperH Mobile processors and embedded boards. The patches simply
add hardware block ids to on-chip platform devices. Platform
devices off chip (such as external ethernet controllers or flash
chips) are left out which gives them a special case hardware block
id of zero.
Upcoming Runtime PM code will make use of the hardware block id
to group devices together. The hardware block id can also be used
to extend the SuperH Mobile clock framework implementation.
This series of patches depend on the following:
"Driver Core: Add platform device arch data V3".
This patch adds a hwblk_id member to struct pdev_archdata. This member
should be used to point out on-chip hardware block id.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch is romImage support for the kfr2r09 board V2.
The partner-jet-setup.txt file is converted into assembly code
which becomes the first code to execute from the reset vector.
The file partner-jet-setup.txt can also be used to setup
the hardware using a JTAG debugger so booting from RAM can
be done without burning the code to flash.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch contains support for the romImage build target V2.
The resulting romImage file should be burned to rom
or flash and could be used as small boot loader.
Board code should keep their setup code in the file
romimage.h located in their mach include directory.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This cleans up the irqflags tracing code quite a bit and ties it
in to various missing callsites that caused an imbalance when
CONFIG_PROVE_LOCKING was enabled.
Previously this was catching on:
987 #ifdef CONFIG_PROVE_LOCKING
988 DEBUG_LOCKS_WARN_ON(!p->hardirqs_enabled);
989 DEBUG_LOCKS_WARN_ON(!p->softirqs_enabled);
990 #endif
991 retval = -EAGAIN;
with hardirqs being doubly enabled, and subsequently bailing out
with the following call trace:
Call trace:
[<88035224>] __lock_acquire+0x616/0x6a6
[<88015a8c>] do_fork+0xf8/0x2b0
[<880331ec>] trace_hardirqs_on_caller+0xd4/0x114
[<88241074>] _spin_unlock_irq+0x20/0x64
[<88035224>] __lock_acquire+0x616/0x6a6
[<8800386c>] kernel_thread+0x48/0x70
[<88024ecc>] ____call_usermodehelper+0x0/0x110
[<88024ecc>] ____call_usermodehelper+0x0/0x110
[<88003894>] kernel_thread_helper+0x0/0x14
[<88024bac>] __call_usermodehelper+0x38/0x70
[<88025dc0>] worker_thread+0x150/0x274
[<88035b9c>] lock_release+0x0/0x198
[<88024b74>] __call_usermodehelper+0x0/0x70
[<88028cf0>] autoremove_wake_function+0x0/0x30
[<88028bf2>] kthread+0x3e/0x70
[<88025c70>] worker_thread+0x0/0x274
[<8800389c>] kernel_thread_helper+0x8/0x14
[<88028bb4>] kthread+0x0/0x70
[<88003894>] kernel_thread_helper+0x0/0x14
Reported-by: Nobuhiro Iwamatsu <iwamatsu.nobuhiro@renesas.com>
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This splits out a separate __update_cache()/__update_tlb() for
update_mmu_cache() to wrap in to. This lets us share the common
__update_cache() bits while keeping special __update_tlb() handling
broken out.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Those definitions are already provided by asm-generic
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm: Pass virtual address to [__]p{te,ud,md}_free_tlb()
Upcoming paches to support the new 64-bit "BookE" powerpc architecture
will need to have the virtual address corresponding to PTE page when
freeing it, due to the way the HW table walker works.
Basically, the TLB can be loaded with "large" pages that cover the whole
virtual space (well, sort-of, half of it actually) represented by a PTE
page, and which contain an "indirect" bit indicating that this TLB entry
RPN points to an array of PTEs from which the TLB can then create direct
entries. Thus, in order to invalidate those when PTE pages are deleted,
we need the virtual address to pass to tlbilx or tlbivax instructions.
The old trick of sticking it somewhere in the PTE page struct page sucks
too much, the address is almost readily available in all call sites and
almost everybody implemets these as macros, so we may as well add the
argument everywhere. I added it to the pmd and pud variants for consistency.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: David Howells <dhowells@redhat.com> [MN10300 & FRV]
Acked-by: Nick Piggin <npiggin@suse.de>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [s390]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that the SH-4 page clear/copy ops are generic, they can be used for
all platforms with CONFIG_MMU=y. SH-5 remains the odd one out, but it too
will gradually be converted over to using this interface.
SH-3 platforms which do not contain aliases will see no impact from this
change, while aliasing SH-3 platforms will get the same interface as
SH-4.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This wires up clear_user_highpage() on SH-4 and subsequently converts the
SH7705 32kB cache mode over to using it. Now that the SH-4 implementation
handles all of the dcache purging directly in the aliasing case, there is
no need to do this in the default clear_page() implementation.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This inverts the delayed dcache flush a bit to be more in line with other
platforms. At the same time this also gives us the ability to do some
more optimizations and cleanup. Now that the update_mmu_cache() callsite
only tests for the bit, the implementation can gradually be split out and
made generic, rather than relying on special implementations for each of
the peculiar CPU types.
SH7705 in 32kB mode and SH-4 still need slightly different handling, but
this is something that can remain isolated in the varying page copy/clear
routines. On top of that, SH-X3 is dcache coherent, so there is no need
to bother with any of these tests in the PTEAEX version of
update_mmu_cache(), so we kill that off too.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Allocate one of the unused PTE bits for _PAGE_SPECIAL directly. This is
prep work for fast gup and the zero page revival.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Extend the SuperH hwblk code to support more than one counter.
Contains ground work for the future Runtime PM implementation.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Add both dynamic and static function graph tracer support for sh.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Pull the initial preempt_count value into a single
definition site.
Maintainers for: alpha, ia64 and m68k, please have a look,
your arch code is funny.
The header magic is a bit odd, but similar to the KERNEL_DS
one, CPP waits with expanding these macros until the
INIT_THREAD_INFO macro itself is expanded, which is in
arch/*/kernel/init_task.c where we've already included
sched.h so we're good.
Cc: tony.luck@intel.com
Cc: rth@twiddle.net
Cc: geert@linux-m68k.org
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that I've added TIF_SYSCALL_FTRACE the thread flags do not fit into
a single byte any more. Code testing them now needs to be aware of the
upper and lower bytes.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
It seems that MCOUNT_INSN_OFFSET was calculating the distance between
the wrong functions. The value that should have actually been computed
is the distance between ftrace_call and ftrace_stub. I discovered this
when I added some code to ftrace_caller.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch adds cpuidle support for SuperH Mobile.
The sleep mode selected by cpuidle is compared with
the mode selected by the hwblk sleep code and the
best allowed mode is entered.
At this point "Sleep mode" and "Sleep mode + SF" are
supported. This code can easily be extended to support
"Software suspend mode", but the assembly code must
first be updated to avoid loosing interrupts.
Also, update the code to only copy the assembly snippet
into internal memory once at bootup.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch contains the sh7722 specific hwblk implementation.
Hwblk ids are added to the processor specific header file,
module stop bits and areas are kept track of as hwblks,
clocks are converted to make use of the shared hwblk code.
Code to determine allowed sleep modes is also added.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch is the hwblk base implementation, containing
structures and shared functions dealing with hardware blocks.
A each processor model should provide a list of hwblks and
describe which module stop bit that is associated with each
hwblck and how the hwblks are grouped together into areas.
The shared code keeps track of the usage count for each
hwblk and the areas. Fallback implementations for processor
specific code are also kept as weak symbols.
The clock framework, the runtime pm code and cpuidle will
all tie into this hwblk implementation.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Rework the bootmem allocator to use the lmb framework.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
When arch/sh/include/asm/syscall_32.h is included from a file that
doesn't also include linux/err.h the following error is produced,
In file included from /home/matt/src/kernels/sh-2.6/arch/sh/include/asm/syscall.h:5,
from kernel/trace/trace_syscalls.c:3:
/home/matt/src/kernels/sh-2.6/arch/sh/include/asm/syscall_32.h: In function 'syscall_get_error':
/home/matt/src/kernels/sh-2.6/arch/sh/include/asm/syscall_32.h:28: error: implicit declaration of function 'IS_ERR_VALUE'
make[2]: *** [kernel/trace/trace_syscalls.o] Error 1
make[1]: *** [kernel/trace] Error 2
make: *** [kernel] Error 2
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
commit dbe6f18691
("dma-mapping: mark dma_sync_single and dma_sync_sg as deprecated"
conveniently broke every single SH build.
In the future it would be great if people could at least bother
figuring out how to use grep.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Crib the x86 cpu_idle_wait() implementation and shove it in with the
idle code, subsequently enabling ARCH_HAS_CPU_IDLE_WAIT.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (56 commits)
sh: Fix declaration of __kernel_sigreturn and __kernel_rt_sigreturn
sh: Enable soc-camera in ap325rxa/migor/se7724 defconfigs.
sh: remove stray markers.
sh: defconfig updates.
sh: pci: Initial PCI-Express support for SH7786 Urquell board.
sh: Generic HAVE_PERF_COUNTER support.
SH: convert migor to soc-camera as platform-device
SH: convert ap325rxa to soc-camera as platform-device
soc-camera: unify i2c camera device platform data
sh: add platform data for r8a66597-hcd in setup-sh7723
sh: add platform data for r8a66597-hcd in setup-sh7366
sh: x3proto: add platform data for r8a66597-hcd
sh: highlander: add platform data for r8a66597-hcd
sh: sh7785lcr: add platform data for r8a66597-hcd
sh: turn off irqs when disabling CMT/TMU timers
sh: use kzalloc() for cpg clocks
sh: unbreak WARN_ON()
sh: Use generic atomic64_t implementation.
sh: Revised clock function in highlander
sh: Update r7780mp defconfig
...
This function was only used by pci_claim_resource(), and the last commit
deleted that use.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This enables support for the generic software-based perf counters.
Hardware counter support could be added in the future, but the lack
of a performance counter IRQ makes this rather dubious.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Convert most arches to use asm-generic/kmap_types.h.
Move the KM_FENCE_ macro additions into asm-generic/kmap_types.h,
controlled by __WITH_KM_FENCE from each arch's kmap_types.h file.
Would be nice to be able to add custom KM_types per arch, but I don't yet
see a nice, clean way to do that.
Built on x86_64, i386, mips, sparc, alpha(tonyb), powerpc(tonyb), and
68k(tonyb).
Note: avr32 should be able to remove KM_PTE2 (since it's not used) and
then just use the generic kmap_types.h file. Get avr32 maintainer
approval.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: <linux-arch@vger.kernel.org>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Bryan Wu <cooloney@kernel.org>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: "Luck Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The irqsoff tracer uses the atomic_* functions internally, but the
implementations of those functions in arch/sh/include/asm/atomic-irq.h
disable irqs to achieve atomicity. A continuous loop ensues where we
disable interrupts, trace the interrupt disabling, call atomic_*
functions, disable interrupts, trace the interrupt disabling, etc..
The simplest solution to all this is to just convert uses of
local_irq_save()/local_irq_restore() the raw_* equivalents because the
raw_* equivalents don't call trace_hardirqs_on()/trace_hardirqs_off().
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The asm-generic versions have some helper definitions that we can use
instead, drop our definitions and use those instead.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Makes code futureproof against the impending change to mm->cpu_vm_mask.
It's also a chance to use the new cpumask_ ops which take a pointer
(the older ones are deprecated, but there's no hurry for arch code).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
We're weaning the core code off handing cpumask's around on-stack.
This introduces arch_send_call_function_ipi_mask(), and by defining
it, the old arch_send_call_function_ipi is defined by the core code.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The current asm-generic/page.h only contains the get_order
function, and asm-generic/uaccess.h only implements
unaligned accesses. This renames the file to getorder.h
and uaccess-unaligned.h to make room for new page.h
and uaccess.h file that will be usable by all simple
(e.g. nommu) architectures.
Signed-off-by: Remis Lima Baima <remis.developer@googlemail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The existing asm-generic/atomic.h only defines the
atomic_long type. This renames it to atomic-long.h
so we have a place to add a truly generic atomic.h
that can be used on all non-SMP systems.
Signed-off-by: Remis Lima Baima <remis.developer@googlemail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
This provides a reliable way for asm-generic/types.h and other
files to find out if it is running on a 32 or 64 bit platform.
We cannot use CONFIG_64BIT for this in headers that are included
from user space because CONFIG symbols are not available there.
We also cannot do it inside of asm/types.h because some headers
need the word size but cannot include types.h.
The solution is to introduce a new header <asm/bitsperlong.h>
that defines both __BITS_PER_LONG for user space and
BITS_PER_LONG for usage in the kernel. The asm-generic
version falls back to 32 bit unless the architecture overrides
it, which I did for all 64 bit platforms.
Signed-off-by: Remis Lima Baima <remis.developer@googlemail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The existing asm-generic versions are incomplete and included
by some architectures. New architectures should be able
to use a generic version, so rename the existing files and
change all users, which lets us add the new files.
Signed-off-by: Remis Lima Baima <remis.developer@googlemail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
This fixes up a typo in the ll/sc based cmpxchg code which apparently
wasn't getting a lot of testing due to the swapped old/new pair. With
that fixed up, the ll/sc code also starts using it and provides its own
atomic_add_unless().
Signed-off-by: Aoi Shinkai <shinkoi2005@gmail.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch updates the div6 clock helper code to add support
for enable(), disable() and set_rate() callbacks.
Needed by the camera clock enabling board code on Migo-R.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch adds sh7722 mode pin and pin function
controller comments.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch adds comments for the sh7724 mode pins
and pin function controller.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch is sh7723 mode pin V2. Mode pins and
pin function controller comments are added.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch reworks the mode pin code to keep the pin
definitions in one place. The mode pins values are now
the value of the bit instead of bit number.
With this patch in place the sh7785 header file contains
mode pin comments. The sh7785 clock code and the sh7785lcr
board code are updated to reflect the new shared mode pins.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch adds div6 clock helper code. The div6 clocks
are simply 6-bit divide-by-n modules where n is 1 to 64.
Needed for vclk on sh7722, sh7723, sh7343 and sh7366.
sh7724 needs this even more for vclk, fclka, fclkb,
irdaclk and spuclk.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch fixes the 16-bit case of the sh4a specific
unaligned access implementation. Without this patch
the 16-bit version of sh4a get_unaligned() results in
a 32-bit read which may read more data than intended
and/or cross page boundaries.
Unbreaks mtd NOR write handling on Migo-R.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Add shared code for 4-bit divisor clocks.
Processor specific code can use SH_CLK_DIV4()
to initialize div4 clocks, and then use
sh_clk_div4_register() for registration.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Add shared 32-bit module stop bit clock support.
Processor specific code can use SH_CLK_MSTP32()
to initialize module stop bit clocks, and then
use sh_clk_mstp32() for registration.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch adds sh7785 mode pin definitions. Mode pins and
pin function controller comments are added as well.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Add mode pin support for the SuperH architecture V2.
With this patch applied the board code can add their
own function to export the cpu mode pin configuration.
In most cases this will be a constant bitmap, but
boards that allow reading this from a register can
instead read out the pin state from hardware.
The code warns if a pin is tested but no board specific
mode pin function has been provided.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The flat loader uses an architecture's flat_stack_align() to align the
stack but assumes word-alignment is enough for the data sections.
However, on the Xtensa S6000 we have registers up to 128bit width
which can be used from userspace and therefor need userspace stack and
data-section alignment of at least this size.
This patch drops flat_stack_align() and uses the same alignment that
is required for slab caches, ARCH_SLAB_MINALIGN, or wordsize if it's
not defined by the architecture.
It also fixes m32r which was obviously kaput, aligning an
uninitialized stack entry instead of the stack pointer.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Oskar Schirmer <os@emlix.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Bryan Wu <cooloney@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Cc: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Johannes Weiner <jw@emlix.com>
Acked-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
None of the SH PCI controllers support MWI, it is always treated as a
direct memory write, so simply disable it outright. In the case of the
PCI cache line size, consult that for the pci_dma_burst_advice()
strategy, and switch over to PCI_DMA_BURST_MULTIPLE, as PPC64.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch removes the ->build_rate_table() callback,
->recalc() may instead be used for this purpose.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds preliminary support for the ms7724se solution engine board.
Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
hd64461 is mapped in a fixed location, so the I/O base itself is fairly
meaningless as a configuration item. Additionally, this makes it
impossible to share hd64461 code alongside generic drivers (in the case
of sh_dac_audio), so simply make it commonly defined and permit the
mach_is_foo() logic to work out the proper semantics.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This plugs in all of the MSTP functions in to the clock framework,
and hands them off to the platform devices that want them.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adopts the OMAP clock framework debugfs bits and replaces the aging
procfs bits. The procfs clocks entry was primarily a debugging aid, and
used to be tied in to cpuinfo before the clock list grew too unweildly.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds support for constructing a rate table by looking at potential
divisors for a specified clock. Each FQRMR clock is given its own table.
Presently each table is rebuilt when the parent propagates down a new
rate, so some more logic needs to be added to do this more intelligently.
Additionally, a fairly generic round_rate() implementation is then
layered on top of it, which subsequently provides us with cpufreq support.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This updates the SH7785 CPU code as well as the SH7785LCR board support
code for making use of the newly refactored clock framework. Support for
the legacy CPG clocks is dropped at this point, with the extal frequency
fed in from the board code.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This moves out the old legacy CPG clocks to their own file, and converts
over the existing users. With these clocks going away and each CPU
dealing with them on their own, CPUs can gradually move over to the new
interface.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Now with all of the TMU users moved over to the new TMU driver, and the
old TMU driver killed off, the left-over infrastructure can go along
with it.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch removes the old TMU driver (CONFIG_SH_TMU/timer-tmu.c)
As replacement, select the sh_tmu driver with CONFIG_SH_TIMER_TMU
and configure timer channel using platform data.
If multiple TMU channels are enabled using platform data, use the
earlytimer parameter on the kernel command line to select channel.
For instance, use "earlytimer=sh_tmu.0" to select the first channel.
To verify which timer is being used, look at printouts or the timer
irq count in /proc/interrupts.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This stubs in clk_get_sys() from the ARM clkdev implementation.
Tentatively conver the clk_get() lookup code to use this, and once the
rest of the in-tree users are happy with this, it can replace the
fallback lookups.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The only user for this is the SH-Mobile r_clk, which is now added as a
root clock and can be kicked via propagate_rate() as usual. Given that,
there is no longer any need for the special clk_recalc_rate(), so we kill
it off.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This causes the generic clk_set_parent() implementation to be a bit more
intelligent. A clk_reparent() is added to move the clock over to the new
parent's sibling list, which then allows the generic rate propagation
code to succeed. This also becomes a nop if the new and old parents are
unchanged.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
There are a couple of instances where a clk_enable() can fail, which the
SH-Mobile code presently handles, but doesn't get reported all the way
back up. This fixes up the return type so the errors make it all the way
down to the drivers.
Additionally, we now also error out properly if the parent enable fails.
Prep work for aggressively turning off unused clocks on boot.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
There is no real distinction here in behaviour, either a clock needs to
be enabled on initialiation or not. The ALWAYS_ENABLED flag was always
intended to only apply to clocks that were physically always on and could
simply not be disabled at all from software. Unfortunately over time this
was abused and the meaning became a bit blurry.
So, we kill off both of all of those paths now, as well as the newer
NEEDS_INIT flag, and consolidate on a CLK_ENABLE_ON_INIT. Clocks that
need to be enabled on initialization can set this, and it will purposely
enable them and bump the refcount up.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This resyncs the rate propagation strategy with the scheme used by the
OMAP clock framework. Child clocks are tracked on a list under each
parent and propagation happens there specifically rather than constantly
iterating over the global clock list.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds a followparent_recalc() helper for clocks that just follow the
parent's rate. Switch over the few CPUs that use this scheme for some of
their clocks.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This fixes up the broken I2C offset in 32-bit mode.
The cause is because the board datasheet had a mistake.
Signed-off-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
There is nothing in these routines that inherently depends on R0 use.
Given that these routines are inlined, it is rather easy to blow up the
compiler by exhausting the spill class when performing a 64-bit swab.
This presently manifests itself as the following:
CC fs/ocfs2/suballoc.o
fs/ocfs2/suballoc.c: In function 'ocfs2_reserve_suballoc_bits':
fs/ocfs2/suballoc.c:638: error: unrecognizable insn:
(insn 2793 1230 1231 103 arch/sh/include/asm/swab.h:33 (set (reg:HI 853)
(subreg:HI (reg:SI 149 macl) 2)) -1 (expr_list:REG_DEAD (reg:SI 149 macl)
(nil)))
fs/ocfs2/suballoc.c:638: internal compiler error: in extract_insn, at recog.c:1991
This patch switches over to using an arbitrarily assigned register instead.
While the same issue does not exist in the SH-5 case, there is likewise no harm
in having an alternate register used for the byterev/shari pair.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
These are presently only defined for sh32, use the plain unoptimized
versions for sh64. Fixes up smsc911x build.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch updates the clock framework use count code.
With this patch the enable() and disable() callbacks
only get called when counting from and to zero.
While at it the kref stuff gets replaced with an int.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Nothing is using this anymore now that we have fully converted to generic
time, so kill it off completely.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Presently this is special-cased for early initialization. While there are
situations where these static early initializations are still necessary,
with minor changes it is possible to use this for the regular ioremap
implementation as well. This allows us to kill off the special-casing for
the remap completely and to start tidying up all of the SH-5
special-casing in drivers.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Presently shm_align_mask is only looked at for the bottom up case, but we
still want this for proper colouring constraints in the topdown case.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Now that the stragglers (MTU2/CMT/etc.) have been rewritten and we are
selecting both GENERIC_TIME and GENERIC_CLOCKEVENTS, the get_offset()
timer op is completely unused. As a result, we are now able to kill off
the ARCH_USES_GETTIMEOFFSET references.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch removes the old MTU2 driver (CONFIG_SH_MTU2/timer-mtu2.c)
As replacement, select the sh_cmt driver with CONFIG_SH_TIMER_MTU2
and configure timer channel using platform data.
If multiple MTU channels are enabled using platform data, use the
earlytimer parameter on the kernel command line to select channel.
For instance, use "earlytimer=sh_mtu2.0" to select the first channel.
To verify which timer is being used, look at printouts or the timer
irq count in /proc/interrupts.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Convert sh to use GENERIC_TIME via the arch_getoffset() infrastructure.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch removes the old CMT driver (CONFIG_SH_CMT/timer-cmt.c)
As replacement, select the sh_cmt driver with CONFIG_SH_TIMER_CMT
and configure timer channel using platform data.
If multiple CMT channels are enabled using platform data, use the
earlytimer parameter on the kernel command line to select channel.
For instance, use "earlytimer=sh_cmt.0" to select the first channel.
To verify which timer is being used, look at printouts or the timer
irq count in /proc/interrupts.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
All 32-bit SuperH processors currently go through __ioremap_mode()
and check for IO_TRAPPED and directly mapped segments. With this
patch we simplify the MMU less case with a pass through version of
__ioremap_mode() which just returns the physical address.
The effects of this is change are:
- fix non-MMU ioremap() of high address hardware blocks (sh7203 CMT)
- make sure IO_TRAPPED is not selected
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
syscall_nr is presently defined as unsigned in the SH-5 pt_regs,
while the syscall restarting code wants it to be signed. Fix this
up, and bring it in line with the other SH parts.
Reported-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This moves off of the board_pci_channels[] approach for bus registration
and over to a cleaner register_pci_controller(), all derived from the
MIPS code.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Not all PCI channels have non-translatable memory windows, this is a
special property of the on-chip PCIC with its 0xfd00... mapping, handle
this explicitly.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This consolidates the pci_iomap() definitions and reworks how the I/O
port base is handled. PCI channels can register their own I/O map base,
or if none is provided, the system-wide generic I/O base is used instead.
Functionally nothing changes, while this allows us to kill off lots of
I/O address special casing and lookups.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This is left over cruft that hasn't been used by anything in a long time,
kill off bits that weren't purged previously.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This introduces a saner pcibios_align_resource() that can be used
regardless of whether pci-auto or pci-new are being used, and
consolidates it in pci-lib.c.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The new PCI code wants its own bus<->resource mappings instead of the
generic equivalents, so drop the asm-generic include in preparation.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Add a plat_early_device_setup() function to allow
processor-specific code to register Early Platform Data.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Adds a __get_pci_io_base() function which is used to match a port range
against struct pci_channel. This allows us to detect if a port range is
assigned to pci or happens to be legacy port io. While at it, remove unused
cpu-specific cruft.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch changes the code to use __is_pci_memory() instead of
is_pci_memaddr(). __is_pci_memory() loops through all the pci
channels on the system to match memory windows.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Store the io window base address in struct pci_channel and use that one
instead of SH77xx_PCI_IO_BASE.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Store the base address of the pci host controller registers in struct
pci_channel and use the address in pci_read_reg() and pci_write_reg().
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Store a struct pci_channel pointer in bus->sysdata. This makes whatever
struct pci_channel assigned to a bus available for sh4_pci_read() and
sh4_pci_write(). We also modify PCIBIOS_MIN_IO and PCIBIOS_MIN_MEM to
use bus->sysdata - this to gives us support for multiple pci channels.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch adds an init callback to struct pci_channel and makes sure
it is initialized properly. Code is added to call this init function
from pcibios_init(). Return values are adjusted and a warning is is
printed if init fails.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This implements initial support for the SH-Mobile R2R CPU.
Based on Rev 0.11 of the initial SH7724 hardware manual.
Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (23 commits)
sh: sh7785lcr: Map whole PCI address space.
sh: Fix up DSP context save/restore.
sh: Fix up number of on-chip DMA channels on SH7091.
sh: update defconfigs.
sh: Kill off broken direct-mapped cache mode.
sh: Wire up ARCH_HAS_DEFAULT_IDLE for cpuidle.
sh: Add a command line option for disabling I/O trapping.
sh: Select ARCH_HIBERNATION_POSSIBLE.
sh: migor: Fix up CEU use flags.
input: migor_ts: add wakeup support
rtc: rtc-sh: use set_irq_wake()
input: sh_keysc: use enable/disable_irq_wake()
sh: intc: set_irq_wake() support
sh: intc: install enable, disable and shutdown callbacks
clocksource: sh_cmt: use remove_irq() and remove clockevent workaround
sh: ap325 and Migo-R use new sh_mobile_ceu_info flags
sh: Fix up -Wformat-security whining.
sh: ap325rxa: Add ov772x support, again.
sh: Sanitize asm/mmu.h for assembly use.
sh: Tidy up sh7786 pinmux table.
...
There were a number of issues with the DSP context save/restore code,
mostly left-over relics from when it was introduced on SH3-DSP with
little follow-up testing, resulting in things like task_pt_dspregs()
referencing incorrect state on the stack.
This follows the MIPS convention of tracking the DSP state in the
thread_struct and handling the state save/restore in switch_to() and
finish_arch_switch() respectively. The regset interface is also updated,
which allows us to finally be rid of task_pt_dspregs() and the special
cased task_pt_regs().
Signed-off-by: Michael Trimarchi <michael@evidence.eu.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Pass the original flags to rwlock arch-code, so that it can re-enable
interrupts if implemented for that architecture.
Initially, make __raw_read_lock_flags and __raw_write_lock_flags stubs
which just do the same thing as non-flags variants.
Signed-off-by: Petr Tesarik <ptesarik@suse.cz>
Signed-off-by: Robin Holt <holt@sgi.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <linux-arch@vger.kernel.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
cpuidle wants ARCH_HAS_DEFAULT_IDLE defined in order to use the
default idle loop. So, make it accessible and enable it for all
sh machines.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch adds the ifndef __ASSEMBLY__ preprocessor to allow the
defines in the file to be used also in assembly code.
Signed-off-by: Francesco Virlinzi <francesco.virlinzi@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Everyone defines it, and only one person uses it
(arch/mips/sgi-ip27/ip27-nmi.c). So just open code it there.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: linux-mips@linux-mips.org
Rework the hd64461 demuxer code to fix the HD64461 level-triggered
interrupts handling, using handle_level_irq() as needed.
Signed-off-by: Rafael Ignacio Zurita <rizurita@yahoo.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This follows the ARM change from Aaro Koskinen:
When unmapping N pages (e.g. shared memory) the amount of TLB
flushes done can be (N*PAGE_SIZE/ZAP_BLOCK_SIZE)*N although it
should be N at maximum. With PREEMPT kernel ZAP_BLOCK_SIZE is 8
pages, so there is a noticeable performance penalty when
unmapping a large VMA and the system is spending its time in
flush_tlb_range().
The problem is that tlb_end_vma() is always flushing the full VMA
range. The subrange that needs to be flushed can be calculated by
tlb_remove_tlb_entry(). This approach was suggested by Hugh
Dickins, and is also used by other arches.
The speed increase is roughly 3x for 8M mappings and for larger
mappings even more.
Bits and peices are taken from the ARM patch as well as the existing
arch/um implementation that is quite similar.
The end result is a significant reduction in both partial and full TLB
flushes initiated through flush_tlb_range().
At the same time, the nommu implementation was broken, had a superfluous
cache flush, and subsequently would have triggered a BUG_ON() if a
code-path had triggered it. Tidy this up for correctness and provide a
nopped-out implementation there.
More background on the initial discussion can be found at:
http://marc.info/?t=123609820900002&r=1&w=2http://marc.info/?t=123660375800003&r=1&w=2
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds support for extended ASIDs (up to 16-bits) on newer SH-X3 cores
that implement the PTAEX register and respective functionality. Presently
only the 65nm SH7786 (90nm only supports legacy 8-bit ASIDs).
The main change is in how the PTE is written out when loading the entry
in to the TLB, as well as in how the TLB entry is selectively flushed.
While SH-X2 extended mode splits out the memory-mapped U and I-TLB data
arrays for extra bits, extended ASID mode splits out the address arrays.
While we don't use the memory-mapped data array access, the address
array accesses are necessary for selective TLB flushes, so these are
implemented newly and replace the generic SH-4 implementation.
With this, TLB flushes in switch_mm() are almost non-existent on newer
parts.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch contains CONFIG_SUSPEND support to the SuperH
architecture. If enabled, SuperH Mobile processors will
register their suspend callbacks during boot.
To suspend, use "echo mem > /sys/power/state". To allow
wakeup, make sure "/sys/device/platform/../power/wakeup"
contains "enabled". Additional per-device driver patches
are most likely needed.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch adds the clk_set_parent/clk_get_parent routines to the sh
clock framework.
Signed-off-by: Francesco Virlinzi <francesco.virlinzi@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds DMA support for newer SH-4A CPUs, particularly SH7763/64/80/85.
This also enables multi IRQ support for platforms that have multiple
vectors bound to the same IRQ source.
Signed-off-by: Nobuhiro Iwamatsu <iwamatsu.nobuhiro@renesas.com>
Signed-off-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This provides a method for supporting fixed PMB mappings inherited from
the bootloader, as an alternative to the dynamic PMB mapping currently
used by the kernel. In the future these methods will be combined.
P1/P2 area is handled like a regular 29-bit physical address, and local
bus device are assigned P3 area addresses.
Signed-off-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Add Suspend-to-disk / swsusp / CONFIG_HIBERNATION support
to the SuperH architecture.
To suspend, use "swapon /dev/sda2; echo disk > /sys/power/state"
To resume, pass "resume=/dev/sda2" on the kernel command line.
The patch "pm: rework includes, remove arch ifdefs V2" is
needed to allow the generic swsusp code to build properly.
Hibernation is not enabled with this patch though, a patch
setting ARCH_HIBERNATION_POSSIBLE will be submitted later.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds preliminary support for the SH7786-based Urquell board.
Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This adds preliminary support for the SH7786 CPU subtype.
While this is a dual-core CPU, only UP is supported for now. L2 cache
support is likewise not yet implemented.
More information on this particular CPU subtype is available at:
http://www.renesas.com/fmwk.jsp?cnt=sh7786_root.jsp&fp=/products/mpumcu/superh_family/sh7780_series/sh7786_group/
Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The SH-3 does not support 'pref'-based prefetching, only SH-2A and SH-4A
parts do. Remove SH-3 from the list.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Prefetch early exception data. There is unused space in our
exception handler cache line anyway, so this is almost free.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Remove EXPEVT vector from the stack, lookup_exception_vector()
for sh3/sh4/sh4a is already using k2 to get the vector.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
User space can request hardware and/or software time stamping.
Reporting of the result(s) via a new control message is enabled
separately for each field in the message because some of the
fields may require additional computation and thus cause overhead.
User space can tell the different kinds of time stamps apart
and choose what suits its needs.
When a TX timestamp operation is requested, the TX skb will be cloned
and the clone will be time stamped (in hardware or software) and added
to the socket error queue of the skb, if the skb has a socket
associated with it.
The actual TX timestamp will reach userspace as a RX timestamp on the
cloned packet. If timestamping is requested and no timestamping is
done in the device driver (potentially this may use hardware
timestamping), it will be done in software after the device's
start_hard_xmit routine.
Signed-off-by: Patrick Ohly <patrick.ohly@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rework and simplify the sched_clock and clocksource code. Instead
of registering the clocksource in a shared file we move it into the
tmu driver. Also, add code to handle sched_clock in the case of no
clocksource.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Fix macros start_thread and user_stack_pointer.
When these macros aren't called with a variable named regs as second
argument, this will result in a build failure.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Now that atomic_t is a generic opaque type for all architectures, it is
unwise to use intimate knowledge of its internals when manipulating it.
Instead of relying on the "counter" member being at offset 0 from the
beginning of an atomic_t, explicitly reference the member. This guards
us from any changes to the layout of the beginning of the atomic_t type.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
When dereferencing the memory address contained in a register and
modifying the value at that memory address, the register should not be
listed in the inline asm outputs. The value at the memory address is an
output (which is taken care of with the "memory" clobber), not the register.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This corrects a deadlock encountered on ap325 in the cases where the
mutex is contended and the slow-path needs to be fallen back upon.
Signed-off-by: Takashi YOSHII <yoshii.takashi@renesas.com>
Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The T-bit manipulation for syscall error checking had the side effect of
spuriously returning ERESTART* errno values over EINTR. So, we simplify
the error checking a bit and leave the T-bit alone.
Reported-by: Kaz Kojima <kkojima@rr.iij4u.or.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch updates the SuperH gpio code to make use of gpiolib. The
gpiolib callbacks get() and set() are lockless, but we use our own
spinlock for the other operations to make sure hardware register
bitfield accesses stay atomic.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch optimizes the gpio data register handling for gpio_set_value().
Instead of using the good old spinlock-plus-read-modify-write strategy
we now use a shadow register and atomic operations.
This improves the bitbanging mmc performance on Migo-R from 26 Kbytes/s
to 40 Kbytes/s.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch modifies the table based SuperH gpio implementation to
make use of direct table lookups. With this change the functions
gpio_get_value() and gpio_set_value() are O(1).
Tested on Migo-R using bitbanging mmc. Performance is improved from
11 KBytes/s to 26 Kbytes/s.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Bring sh in line with all the other ports. Not sure how sh missed this
change as all the other arches were being updated ...
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* 'syscalls' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (44 commits)
[CVE-2009-0029] s390 specific system call wrappers
[CVE-2009-0029] System call wrappers part 33
[CVE-2009-0029] System call wrappers part 32
[CVE-2009-0029] System call wrappers part 31
[CVE-2009-0029] System call wrappers part 30
[CVE-2009-0029] System call wrappers part 29
[CVE-2009-0029] System call wrappers part 28
[CVE-2009-0029] System call wrappers part 27
[CVE-2009-0029] System call wrappers part 26
[CVE-2009-0029] System call wrappers part 25
[CVE-2009-0029] System call wrappers part 24
[CVE-2009-0029] System call wrappers part 23
[CVE-2009-0029] System call wrappers part 22
[CVE-2009-0029] System call wrappers part 21
[CVE-2009-0029] System call wrappers part 20
[CVE-2009-0029] System call wrappers part 19
[CVE-2009-0029] System call wrappers part 18
[CVE-2009-0029] System call wrappers part 17
[CVE-2009-0029] System call wrappers part 16
[CVE-2009-0029] System call wrappers part 15
...
Add swab.h to kbuild.asm and remove the individual entries from
each arch, mark as unifdef as some arches have some kernel-only
bits inside.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove __attribute__((weak)) from common code sys_pipe implemantation.
IA64, ALPHA, SUPERH (32bit) and SPARC (32bit) have own implemantations
with the same name. Just rename them.
For sys_pipe2 there is no architecture specific implementation.
Cc: Richard Henderson <rth@twiddle.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Make VMAs per mm_struct as for MMU-mode linux. This solves two problems:
(1) In SYSV SHM where nattch for a segment does not reflect the number of
shmat's (and forks) done.
(2) In mmap() where the VMA's vm_mm is set to point to the parent mm by an
exec'ing process when VM_EXECUTABLE is specified, regardless of the fact
that a VMA might be shared and already have its vm_mm assigned to another
process or a dead process.
A new struct (vm_region) is introduced to track a mapped region and to remember
the circumstances under which it may be shared and the vm_list_struct structure
is discarded as it's no longer required.
This patch makes the following additional changes:
(1) Regions are now allocated with alloc_pages() rather than kmalloc() and
with no recourse to __GFP_COMP, so the pages are not composite. Instead,
each page has a reference on it held by the region. Anything else that is
interested in such a page will have to get a reference on it to retain it.
When the pages are released due to unmapping, each page is passed to
put_page() and will be freed when the page usage count reaches zero.
(2) Excess pages are trimmed after an allocation as the allocation must be
made as a power-of-2 quantity of pages.
(3) VMAs are added to the parent MM's R/B tree and mmap lists. As an MM may
end up with overlapping VMAs within the tree, the VMA struct address is
appended to the sort key.
(4) Non-anonymous VMAs are now added to the backing inode's prio list.
(5) Holes may be punched in anonymous VMAs with munmap(), releasing parts of
the backing region. The VMA and region structs will be split if
necessary.
(6) sys_shmdt() only releases one attachment to a SYSV IPC shared memory
segment instead of all the attachments at that addresss. Multiple
shmat()'s return the same address under NOMMU-mode instead of different
virtual addresses as under MMU-mode.
(7) Core dumping for ELF-FDPIC requires fewer exceptions for NOMMU-mode.
(8) /proc/maps is now the global list of mapped regions, and may list bits
that aren't actually mapped anywhere.
(9) /proc/meminfo gains a line (tagged "MmapCopy") that indicates the amount
of RAM currently allocated by mmap to hold mappable regions that can't be
mapped directly. These are copies of the backing device or file if not
anonymous.
These changes make NOMMU mode more similar to MMU mode. The downside is that
NOMMU mode requires some extra memory to track things over NOMMU without this
patch (VMAs are no longer shared, and there are now region structs).
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Mike Frysinger <vapier.adi@gmail.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
The atomic_t type cannot currently be used in some header files because it
would create an include loop with asm/atomic.h. Move the type definition
to linux/types.h to break the loop.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Impact: New APIs
The old node_to_cpumask/node_to_pcibus returned a cpumask_t: these
return a pointer to a struct cpumask. Part of removing cpumasks from
the stack.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul Mundt <lethal@linux-sh.org>
arch_setup_additional_pages currently gets two arguments, the binary
format descripton and an indication if the process uses an executable
stack or not. The second argument is not used by anybody, it could
be removed without replacement.
What actually does make sense is to pass an indication if the process
uses the elf interpreter or not. The glibc code will not use anything
from the vdso if the process does not use the dynamic linker, so for
statically linked binaries the architecture backend can choose not
to map the vdso.
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We don't really want this enabled by default, but it is still quite
useful for debugging. So, make it conditional and leave it off by
default.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Now that the rest of the boards that were using cf-enabler "generically"
have switched to setting up their mappings on their own, only the mach-se
boards were left using it. All of the cf-enabler using mach-se boards
use a special initialization of the MRSHPC windows rather than going
through the special PTE as other SH-4 platforms do. This consolidates
the MRSHPC setup logic, hooks it up on the boards that care, and gets rid
of any and all remaining references to cf-enabler.
This has been long overdue, as cf-enabler has been the bane of
arch/sh/kernel for the last 7 years. Good riddance.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Nothing is using this any more, so get rid of it before anyone gets the
bright idea to start using it again.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
With the reworked kgdb support, we always detach and reinitialize the
stub. This was mostly a feature for handoffs between sh-ipl+g and the
kgdb stub, but virtually no sh-ipl+g versions ever had this working
right in the first place.
Given that the sh-ipl+g stubs in general use today don't even support
the GDB stub, and we have already killed off the special casing in the
sh-sci serial driver, kill off this now unused symbol too.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This migrates from the old bitrotted kgdb stub implementation and moves
to the generic stub. In the process support for SH-2/SH-2A is also added,
which the old stub never provided.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>