Commit Graph

8470 Commits

Author SHA1 Message Date
Milton Miller 1e8c23013e powerpc: Remove virq_to_host
The only references to the irq_map[].host field are internal to
arch/powerpc/kernel/irq.c

Signed-off-by: Milton Miller <miltonm@bga.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:32:01 +10:00
Milton Miller 3ee62d365b powerpc: Add virq_is_host to reduce virq_to_host usage
Some irq_host implementations are using virq_to_host to check if
they are the irq_host for a virtual irq.  To allow us to make space
versus time tradeoffs, replace this usage with an assertive
virq_is_host that confirms or denies the irq is associated with the
given irq_host.

Signed-off-by: Milton Miller <miltonm@bga.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:59 +10:00
Milton Miller 9553361499 powerpc/axon_msi: Validate msi irq via chip_data
Instead of checking for rogue msi numbers via the irq_map host field
set the chip_data to h.host_data (which is the msic struct pointer)
at map and compare it in get_irq.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:57 +10:00
Milton Miller 6b0aea44d6 powerpc/spider-pic: Get pic from chip_data instead of irq_map
Building on Grant's efforts to remove the irq_map array, this patch
moves spider-pics use of virq_to_host() to use irq_data_get_chip_data
and sets the irq chip data in the map call, like most other interrupt
controllers in powerpc.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:55 +10:00
Milton Miller da05198002 powerpc: Remove irq_host_ops->remap hook
It was called from irq_create_mapping if that was called for a host
and hwirq that was previously mapped, "to update the flags".  But the
only implementation was in beat_interrupt and all it did was repeat a
hypervisor call without error checking that was performed with error
checking at the beginning of the map hook.  In addition, the comment on
the beat remap hook says it will only called once for a given mapping,
which would apply to map not remap.

All flags should be known by the time the match hook is called, before
we call the map hook.  Removing this mostly unused hook will simpify
the requirements of irq_domain concept.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:53 +10:00
Milton Miller 23f73a5fb0 powerpc/psurge: Create a irq_host for secondary cpus
Create a dummy irq_host using the generic dummy irq chip for the secondary
cpus to use.  Create a direct irq mapping for the ipi and register the
ipi action handler against it.  If for some unlikely reason part of this
fails then don't detect the secondary cpus.

This removes another instance of NO_IRQ_IGNORE, records the ipi stats
for the secondary cpus, and runs the ipi on the interrupt stack.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:51 +10:00
Milton Miller 67347eba15 powerpc/mpc62xx_pic: Fix get_irq handling of NO_IRQ
If none of irq category bits were set mpc52xx_get_irq() would pass
NO_IRQ_IGNORE (-1) to irq_linear_revmap, which does an unsigned compare
and declares the interrupt above the linear map range.  It then punts
to irq_find_mapping, which performs a linear search of all irqs,
which will likely miss and only then return NO_IRQ.

If no status bit is set, then we should return NO_IRQ directly.
The interrupt should not be suppressed from spurious counting, in fact
that is the definition of supurious.

Signed-off-by: Milton Miller <miltonm@bga.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:49 +10:00
Milton Miller c42385cd45 powerpc/mpc5121_ads_cpld: Remove use of NO_IRQ_IGNORE
As NO_IRQ_IGNORE is only used between the static function cpld_pic_get_irq
and its caller cpld_pic_cascade, and cpld_pic_cascade only uses it to
suppress calling handle_generic_irq, we can change these uses to NO_IRQ
and remove the extra tests and pathlength in cpld_pic_cascade.

Signed-off-by: Milton Miller <miltonm@bga.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:47 +10:00
Milton Miller d1921bcdee powerpc/fsl_msi: Use chip_data not handler_data
handler_data should be reserved for flow handlers on the dependent
irq, not consumed by the parent irq code that is part of the irq_chip
code.  The msi_data pointer was already set in msidesc->irqhost->hostdata
and being copied to irq_data->chipdata in the msidesc->irqhost->map()
method called via create_irq_mapping, so we can obtain the pointer
from there and free the instance it in teardown_msi_irqs.

Also remove the unnecessary cast of irq_get_handler_data in the
cascade handler, which is the demux flow handler of the parent
msi interrupt.  (This is the expected usage for handler_data).

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:45 +10:00
Milton Miller 6c4c82e20a powerpc/fsl_msi: Don't abuse platform_data for driver_data
The msi platform device driver was abusing dev.platform_data for its
platform_driver_data.  Use the correct pointer for storage.

Platform_data is supposed to be for platforms to communicate to drivers
parameters that are not otherwise discoverable.  Its lifetime matches
the platform_device not the platform device driver.  It is generally
not needed for drivers that only support systems with device trees.

Signed-off-by: Milton Miller <miltonm@bga.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:43 +10:00
Milton Miller 7ee342bdc3 powerpc: Remove i8259 irq_host_ops->unmap
It was never called because the host is always IRQ_HOST_MAP_LEGACY.

And what it purported to do was mask the interrupt (which will already
have happend if we shutdown the interrupt), then synchronise_irq and
clear the chip pointer, both of which will have been be done by the
caller were we to call unmap on a legacy irq.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:41 +10:00
Milton Miller df74e70ac2 powerpc: Remove trival irq_host_ops.unmap
These all just clear chip or chipdata fields, which will be done
by the generic code when we call irq_free_descs.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:39 +10:00
Milton Miller 2d441681a4 powerpc: Return early if irq_host lookup type is wrong
If for some reason the code incrorectly calls the wrong function to
manage the revmap, not only should we warn, we should take action.
However, in the paths we expect to be taken every delivered interrupt
change to WARN_ON_ONCE.  Use the if (WARN_ON(x)) format to get the
unlikely for free.

Signed-off-by: Milton Miller <miltonm@bga.com>
Reviewed-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:37 +10:00
Milton Miller 3af259d155 powerpc: Radix trees are available before init_IRQ
Since the generic irq code uses a radix tree for sparse interrupts,
the initcall ordering has been changed to initialize radix trees before
irqs.   We no longer need to defer creating revmap radix trees to the
arch_initcall irq_late_init.

Also, the kmem caches are allocated so we don't need to use
zalloc_maybe_bootmem.

Signed-off-by: Milton Miller <miltonm@bga.com>
Reviewed-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:35 +10:00
Milton Miller e085255ebc powerpc/xics: Cleanup xics_host_map and ipi
Since we already have a special case in map to set the ipi handler, use
the desired flow.

If we don't find an ics to handle the interrupt complain instead of
returning 0 without having set a chip or handler.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:33 +10:00
Milton Miller 714542721b powerpc: Use bytes instead of bitops in smp ipi multiplexing
Since there are only 4 messages, we can replace the atomic bit set
(which uses atomic load reserve and store conditional sequence) with
a byte stores to seperate bytes.  We still have to perform a load
reserve and store conditional sequence to avoid loosing messages on
reception but we can do that with a single call to xchg.

The do {} while and __BIG_ENDIAN specific mask testing was chosen by
looking at the generated asm code.  On gcc-4.4, the bit masking becomes
a simple bit mask and test of the register returned from xchg without
storing and loading the value to the stack like attempts with a union
of bytes and an int (or worse, loading single bit constants from the
constant pool into non-voliatle registers that had to be preseved on
the stack).  The do {} while avoids an unconditional branch to the
end of the loop to test the entry / repeat condition of a while loop
and instead optimises for the expected single iteration of the loop.

We have a full mb() at the beginning to cover ordering between send,
ipi, and receive so we can use xchg_local and forgo the further
acquire and release barriers of xchg.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:31 +10:00
Milton Miller 1ece355b68 powerpc: Add kconfig for muxed smp ipi support
Compile the new smp ipi mux and demux code only if a platform
will make use of it.  The new config is selected as required.

The new cause_ipi smp op is only available conditionally to point out
configs where the select is required; this makes setting the op an
immediate fail instead of a deferred unresolved symbol at link.

This also creates a new config for power surge powermac upgrade support
that can be disabled in expert mode but is default on.

I also removed the depends / default y on CONFIG_XICS since it is selected
by PSERIES.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:05 +10:00
Milton Miller 23d72bfd8f powerpc: Consolidate ipi message mux and demux
Consolidate the mux and demux of ipi messages into smp.c and call
a new smp_ops callback to actually trigger the ipi.

The powerpc architecture code is optimised for having 4 distinct
ipi triggers, which are mapped to 4 distinct messages (ipi many, ipi
single, scheduler ipi, and enter debugger).  However, several interrupt
controllers only provide a single software triggered interrupt that
can be delivered to each cpu.  To resolve this limitation, each smp_ops
implementation created a per-cpu variable that is manipulated with atomic
bitops.  Since these lines will be contended they are optimialy marked as
shared_aligned and take a full cache line for each cpu.  Distro kernels
may have 2 or 3 of these in their config, each taking per-cpu space
even though at most one will be in use.

This consolidation removes smp_message_recv and replaces the single call
actions cases with direct calls from the common message recognition loop.
The complicated debugger ipi case with its muxed crash handling code is
moved to debug_ipi_action which is now called from the demux code (instead
of the multi-message action calling smp_message_recv).

I put a call to reschedule_action to increase the likelyhood of correctly
merging the anticipated scheduler_ipi() hook coming from the scheduler
tree; that single required call can be inlined later.

The actual message decode is a copy of the old pseries xics code with its
memory barriers and cache line spacing, augmented with a per-cpu unsigned
long based on the book-e doorbell code.  The optional data is set via a
callback from the implementation and is passed to the new cause-ipi hook
along with the logical cpu number.  While currently only the doorbell
implemntation uses this data it should be almost zero cost to retrieve and
pass it -- it adds a single register load for the argument from the same
cache line to which we just completed a store and the register is dead
on return from the call.  I extended the data element from unsigned int
to unsigned long in case some other code wanted to associate a pointer.

The doorbell check_self is replaced by a call to smp_muxed_ipi_resend,
conditioned on the CPU_DBELL feature.  The ifdef guard could be relaxed
to CONFIG_SMP but I left it with BOOKE for now.

Also, the doorbell interrupt vector for book-e was not calling irq_enter
and irq_exit, which throws off cpu accounting and causes code to not
realize it is running in interrupt context.  Add the missing calls.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:03 +10:00
Milton Miller 17f9c8a73b powerpc: Move smp_ops_t from machdep.h to smp.h
I can't see any reason these functions are needed by machdep.h
and they are all hidden by CONFIG_SMP with no UP alternative.

Also move the declarations for the fallback timebase ops, which
are used to fill in the smp ops.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:01 +10:00
Milton Miller d4fc8fe1f6 powerpc: Remove stubbed beat smp support
I have no idea if the beat hypervisor supports multiple cpus in
a partition, but the code has not been touched since these stubs
were added in February of 2007 except to move them in April of 2008.
These are stubs: start_cpu always returns fail (which is dropped),
the message passing and reciving are empty functions, and the top
of file comment says "Incomplete".

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:30:59 +10:00
Milton Miller a56555e573 powerpc: Remove alloc_maybe_bootmem for zalloc version
Replace all remaining callers of alloc_maybe_bootmem with
zalloc_maybe_bootmem.   The callsite in pci_dn is followed with a
memset to clear the memory, and not zeroing at the other callsites
in the celleb fake pci code could lead to following uninitialized
memory as pointers or even freeing said pointers on error paths.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:30:57 +10:00
Milton Miller 7ca8aa0924 powerpc: Remove powermac/pic.h
Its unused, and of the three declarations, one is duplicated in pmac.h,
the second is static and the third is renamed and static.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:30:55 +10:00
Milton Miller 3caba98fdd powerpc/mpic: Simplify ipi cpu mask handling
Now that MSG_ALL and MSG_ALL_BUT_SELF have been eliminated,
smp_mpic_mesage_pass no longer needs to lookup the cpumask just to
have mpic_send_ipi extract part of it and recode it in a NR_CPUS loop
by mpic_physmask.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:30:53 +10:00
Milton Miller f1072939b6 powerpc: Remove checks for MSG_ALL and MSG_ALL_BUT_SELF
Now that smp_ops->smp_message_pass is always called with an (online) cpu
number for the target remove the checks for MSG_ALL and MSG_ALL_BUT_SELF.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:46 +10:00
Milton Miller e047637132 powerpc: Remove call sites of MSG_ALL_BUT_SELF
The only user of MSG_ALL_BUT_SELF in the whole kernel tree is powerpc,
and it only uses it to start the debugger. Both debuggers always call
smp_send_debugger_break with MSG_ALL_BUT_SELF, and only mpic can do
anything more optimal than a loop over all online cpus, but all message
passing implementations have to code for this special delivery target.

Convert smp_send_debugger_break to take void and loop calling the smp_ops
message_pass function for each of the other cpus in the online cpumask.

Use raw_smp_processor_id() because we are either entering the debugger
or trying to start kdump and the additional warning it not useful were
it to trigger.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:46 +10:00
Milton Miller 2a116f3dd0 powerpc/mpic: Break cpumask abstraction earlier
mpic_set_affinity is allocating and freeing a cpumask var even though
it was breaking the cpumask abstraction when passing the mask to
mpic_physmask.  It also didn't have any check for allocatin failure.

Break the cpumask abstraction earlier and use simple bitwise and of the
bits from the mask with the bits of cpu_online_mask.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:45 +10:00
Milton Miller ebc0421510 powerpc/mpic: Limit NR_CPUS loop to 32 bit
mpic_physmask was looping NR_CPUS times over a mask that was passed as
a u32. Since mpic is architecturaly limited to 32 physical cpus, clamp
the logical cpus to 32 when compiling (we could also clamp at runtime
to nr_cpu_ids).

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:45 +10:00
Milton Miller aa79bc2167 powerpc: Call no-longer static setup_nr_cpu_ids instead of replicating it
c1854e0072 (powerpc: Set nr_cpu_ids early
and use it to free PACAs) copied the formerly static setup_nr_cpu_ids
from init/main.c but 34db18a054 (smp:
move smp setup functions to kernel/smp.c) moved it to kernel/smp.c
with a declaration in include/linux/smp.h, so we can call it instead of
replicating it.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:45 +10:00
Milton Miller 2cd947f175 powerpc: Use nr_cpu_ids in initial paca allocation
Now that we never set a cpu above nr_cpu_ids possible we can
limit our initial paca allocation to nr_cpu_ids.  We can then
clamp the number of cpus in platforms/iseries/setup.c.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:44 +10:00
Milton Miller 8657ae28dd powerpc: Respect nr_cpu_ids when calling set_cpu_possible and set_cpu_present
We should not set cpus above nr_cpu_ids to possible.  While we
will trigger a warning with CONFIG_CPUMASK_DEBUG, even then the mask
initializers will set the bits beyond what the iterators check and cause
nr_cpu_ids to increase.

Respecting nr_cpu_ids during setup will allow us to use it in our initial
paca allocation.  It can be reduced from NR_CPUS by the existing early param
nr_cpus=, which was added in 2b633e3fac (smp:
Use nr_cpus= to set nr_cpu_ids early).  We already call parse_early_parms
between finding the command line and allocating the pacas.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:44 +10:00
Milton Miller 7c82733744 powerpc/iseries: Cleanup and fix secondary startup
9cb82f2f46 (Make iSeries spin on
__secondary_hold_spinloop, like pSeries) added a load of current_set
but this load was repeated later and we don't even have the paca yet.
It also checked __secondary_hold_spinloop with a 32 bit compare instead
of a 64 bit compare.

b6f6b98a4e (Don't spin on sync instruction
at boot time) missed the copy of the startup code in iseries.

1426d5a3bd (Dynamically allocate pacas)
doesn't allow for pacas to be less than lppacas and recalculated the paca
location from the cpu id in r0 every time through the secondary loop.

Various revisions over time made the comments on conditional branches
confusing with respect to being a hold loop or forward progress

Mostly in-order description of the changes:

Replicate the few lines of code saved by the ugly scoped ifdef CONFIG_SMP
in the secondary loop between yielding on UP and marking time with the
hypervisor on SMP.  Always compile the iseries_secondary_yield loop and
use it if the cpu id is above nr_cpu_ids.  Change all forward progress
paths to be forward branches to the next numerical label.  Assign a
label to all loops.  Move all sync instructions from the loops to the
forward progress path.  Wait to load current_set until paca is set to go.
Move the iseries_secondary_smp_loop label to cover the whole spin loop.
Add HMT_MEDIUM when we make forward progress.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:44 +10:00
Milton Miller bd9e5eefec powerpc/kdump64: Don't reference freed memory as pacas
Starting with 1426d5a3bd (powerpc:
Dynamically allocate pacas) the space for pacas beyond cpu_possible
is freed, but we failed to update the loop in crash.c.

Since c1854e0072 (powerpc: Set nr_cpu_ids
early and use it to free PACAs) the number of pacas allocated is
always nr_cpu_ids.

Signed-off-by: Milton Miller <miltonm@bga.com>
Cc: <stable@kernel.org> # .34.x
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:44 +10:00
Milton Miller 768d18ad6d powerpc: Don't search for paca in freed memory
Starting with 1426d5a3bd (powerpc:
Dynamically allocate pacas) we free the memory for pacas beyond
cpu_possible, but we failed to update the loop the secondary cpus use
to find their paca.  If the system has running cpu threads for which
the kernel did not allocate a paca for they will search the memory that
was freed.  For instance this could happen when the device tree for
a kdump kernel was not updated after a cpu hotplug, or the kernel is
running with more cpus than the kernel was configured.

Since c1854e0072 (powerpc: Set nr_cpu_ids
early and use it to free PACAs) we set nr_cpu_ids before telling the
cpus to advance, so use that to limit the search.

We can't reference nr_cpu_ids without CONFIG_SMP because it is defined
as 1 instead of a memory location, but any extra threads should be sent
to kexec_wait in that case anyways, so make that explicit and remove
the search loop for UP.

Note to stable: The fix also requires
c1854e0072 (powerpc: Set
nr_cpu_ids early and use it to free PACAs) to function.  Also
9d07bc841c (Properly handshake CPUs going
out of boot spin loop) affects the second chunk, specifically the branch
target was 3b before and is 4b after that patch, and there was a blank
line before the #ifdef CONFIG_SMP that was removed

Cc: <stable@kernel.org> # .34.x: c1854e0072 powerpc: Set nr_cpu_ids early
Cc: <stable@kernel.org> # .34.x
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:43 +10:00
Milton Miller 3d2cea732d powerpc/kexec: Fix memory corruption from unallocated slaves
Commit 1fc711f7ff (powerpc/kexec: Fix race
in kexec shutdown) moved the write to signal the cpu had exited the kernel
from before the transition to real mode in kexec_smp_wait to kexec_wait.

Unfornately it missed that kexec_wait is used both by cpus leaving the
kernel and by secondary slave cpus that were not allocated a paca for
what ever reason -- they could be beyond nr_cpus or not described in
the current device tree for whatever reason (for example, kexec-load
was not refreshed after a cpu hotplug operation).  Cpus coming through
that path they will write to paca[NR_CPUS] which is beyond the space
allocated for the paca data and overwrite memory not allocated to pacas
but very likely still real mode accessable).

Move the write back to kexec_smp_wait, which is used only by cpus that
found their paca, but after the transition to real mode.

Signed-off-by: Milton Miller <miltonm@bga.com>
Cc: <stable@kernel.org> # (1fc711f was backported to 2.6.32)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:43 +10:00
Anton Blanchard f0e939ae37 powerpc/pseries: Print corrupt r3 in FWNMI code
I have a report of an FWNMI with an r3 value that we think is
corrupt, but since we don't print r3 we have no idea what was
wrong with it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:43 +10:00
Nishanth Aravamudan eb0dd411bd pseries/iommu: Restore iommu table pointer when restoring iommu ops
When we swtich to direct dma ops, we set the dma data union to have the
dma offset.  When we switch back to iommu table ops because of a later
dma_set_mask, we need to restore the iommu table pointer. Without this
change, crashes have been observed on kexec where (for reasons still
being investigated) we fall back to a 32-bit dma mask on a particular
device and then panic because the table pointer is not valid.

The easiset way to find this value is to call
pci_dma_dev_setup_pSeriesLP which will search up the pci tree until it
finds the node with the table.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Milton Miller <miltonm@bga.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:43 +10:00
Anton Blanchard 40f1ce7fb7 powerpc: Remove ioremap_flags
We have a confusing number of ioremap functions. Make things just a
bit simpler by merging ioremap_flags and ioremap_prot.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:43 +10:00
Anton Blanchard be135f4089 powerpc: Add ioremap_wc
Add ioremap_wc so drivers can request write combining on kernel
mappings.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:42 +10:00
Anton Blanchard f5f0307f42 powerpc: Improve scheduling of system call entry instructions
After looking at our system call path, Mary Brown suggested that we
should put all mfspr SRR* instructions before any mtspr SRR*.

To test this I used a very simple null syscall (actually getppid)
testcase at http://ozlabs.org/~anton/junkcode/null_syscall.c

I tested with the following changes against the pseries_defconfig:

CONFIG_VIRT_CPU_ACCOUNTING=n
CONFIG_AUDIT=n

to remove the overhead of virtual CPU accounting and syscall
auditing.

POWER6:
baseline:       mean = 757.2 cycles       sd = 2.108
modified:       mean = 759.1 cycles       sd = 2.020

POWER7:
baseline:       mean = 411.4 cycles       sd = 0.138
modified:       mean = 404.1 cycles       sd = 0.109

So we have 1.77% improvement on POWER7 which looks significant. The
POWER6 suggest a 0.25% slowdown, but the results are within 1
standard deviation and may be in the noise.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:42 +10:00
Anton Blanchard ba00ce1d6e powerpc: Remove static branch hint in giveup_altivec
A static branch hint will override dynamic branch prediction on
recent POWER CPUs. Since we are about to use more altivec in the
kernel remove the static hint in giveup_altivec that assumes
a userspace task is using altivec.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:42 +10:00
Anton Blanchard d988f0e3f8 powerpc: Simplify 4k/64k copy_page logic
To make it easier to add optimised versions of copy_page, remove
the 4kB loop for 64kB pages and just do all the work in copy_page.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:42 +10:00
Anton Blanchard 37e0c21e9b powerpc/pseries: Enable iSCSI support for a number of cards
Enable iSCSI support for a number of cards. We had the base
networking devices enabled but forgot to enable iSCSI.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:41 +10:00
Anton Blanchard 32218bdd31 powerpc/pseries: Enable Emulex and Qlogic 10Gbit cards
Enable the Qlogic and Emulex 10Gbit adapters.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:41 +10:00
Stratos Psomadakis 2a2c29c1a5 powerpc/mm: Fix compiler warning in pgtable-ppc64.h [-Wunused-but-set-variable]
The variable 'old' is set but not used in the wrprotect functions in
arch/powerpc/include/asm/pgtable-ppc64.h, which can trigger a compiler warning.

Remove the variable, since it's not used anyway.

Signed-off-by: Stratos Psomadakis <psomas@ece.ntua.gr>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:41 +10:00
Nishanth Aravamudan af442a1baa powerpc: Ensure dtl buffers do not cross 4k boundary
Future releases of fimrware will enforce a requirement that DTL buffers
do not cross a 4k boundary. Commit
127493d5dc satisfies this requirement for
CONFIG_VIRT_CPU_ACCOUNTING=y kernels, but if !CONFIG_VIRT_CPU_ACCOUNTING
&& CONFIG_DTL=y, the current code will fail at dtl registration time.
Fix this by making the kmem cache from
127493d5dc visible outside of setup.c and
using the same cache in both dtl.c and setup.c. This requires a bit of
reorganization to ensure ordering of the kmem cache and buffer
allocations.

Note: Since firmware now limits the size of the buffer, I made
dtl_buf_entries read-only in debugfs.

Tested with upcoming firmware with the 4 combinations of
CONFIG_VIRT_CPU_ACCOUNTING and CONFIG_DTL.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:41 +10:00
Nishanth Aravamudan 767303349e powerpc: Fix kexec with dynamic dma windows
When we kexec we look for a particular property added by the first
kernel, "linux,direct64-ddr-window-info", per-device where we already
have set up dynamic dma windows. The current code, though, wasn't
initializing the size of this property and thus when we kexec'd, we
would find the property but read uninitialized memory resulting in
garbage ddw values for the kexec'd kernel and panics. Fix this by
setting the size at enable_ddw() time and ensuring that the size of the
found property is valid at dupe_ddw_if_kexec() time.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:40 +10:00
Michal Marek 31355403db powerpc: Use the deterministic mode of ar
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:40 +10:00
Justin Mattock 93395febdb powerpc: Remove unused config in the Makefile
The patch below removes an unused config variable found by using a kernel
cleanup script.
Note: I did try to cross compile these but hit erros while doing so..
(gcc is not setup to cross compile) and am unsure if anymore needs to be done.
Please have a look if/when anybody has free time.

Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:40 +10:00
Michal Marek c4f56af0f6 powerpc: Call gzip with -n
The timestamps recorded in the .gz files add no value.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:40 +10:00
kerstin jonsson c560bbceaf powerpc/4xx: Fix regression in SMP on 476
commit c56e58537d breaks SMP support in PPC_47x chip.
 secondary_ti must be set to current thread info before callin kick_cpu or else
 start_secondary_47x will jump into void when trying to return to c-code.
 In the current setup secondary_ti is initialized before the CPU idle task is started
 and only the boot core will start. I am not sure this is the correct solution, but it
 makes SMP possible in my chip.
 Note! The HOTPLUG support probably need some fixing to, There is no trampoline code
 available in head_44x.S - start_secondary_resume?

Signed-off-by: Kerstin Jonsson <kerstin.jonsson@ericsson.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 13:09:22 +10:00
Ben Hutchings 35d215fbe4 powerpc/kexec: Fix build failure on 32-bit SMP
Commit b987812b3f left
crash_kexec_wait_realmode() undefined for UP.

Commit 7c7a81b53e defined it for UP but
left it undefined for 32-bit SMP.

Seems like people are getting confused by nested #ifdef's, so move the
definitions of crash_kexec_wait_realmode() after the #ifdef CONFIG_SMP
section.

Compile-tested with 32-bit UP, 32-bit SMP and 64-bit SMP configurations.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 13:09:21 +10:00
Benjamin Herrenschmidt 69e3cea8d5 powerpc/smp: Make start_secondary_resume available to all CPU variants
This should fix SMP & Hotplug builds on FSL BookE and 476

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 13:07:12 +10:00
Grant Likely b1608d69cb drivercore: revert addition of of_match to struct device
Commit b826291c, "drivercore/dt: add a match table pointer to struct
device" added an of_match pointer to struct device to cache the
of_match_table entry discovered at driver match time.  This was unsafe
because matching is not an atomic operation with probing a driver.  If
two or more drivers are attempted to be matched to a driver at the
same time, then the cached matching entry pointer could get
overwritten.

This patch reverts the of_match cache pointer and reworks all users to
call of_match_device() directly instead.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-05-18 12:32:23 -06:00
Rafael J. Wysocki 2d2a9163bd Merge branch 'syscore' into for-linus
* syscore:
  PM: Remove sysdev suspend, resume and shutdown operations
  PM / PowerPC: Use struct syscore_ops instead of sysdevs for PM
  PM / UNICORE32: Use struct syscore_ops instead of sysdevs for PM
  PM / AVR32: Use struct syscore_ops instead of sysdevs for PM
  PM / Blackfin: Use struct syscore_ops instead of sysdevs for PM
  ARM / Samsung: Use struct syscore_ops for "core" power management
  ARM / PXA: Use struct syscore_ops for "core" power management
  ARM / SA1100: Use struct syscore_ops for "core" power management
  ARM / Integrator: Use struct syscore_ops for core PM
  ARM / OMAP: Use struct syscore_ops for "core" power management
  ARM: Use struct syscore_ops instead of sysdevs for PM in common code
2011-05-17 23:23:40 +02:00
Greg Kroah-Hartman 82a3242e11 sysfs: remove "last sysfs file:" line from the oops messages
On some arches (x86, sh, arm, unicore, powerpc) the oops message would
print out the last sysfs file accessed.

This was very useful in finding a number of sysfs and driver core bugs
in the 2.5 and early 2.6 development days, but it has been a number of
years since this file has actually helped in debugging anything that
couldn't also be trivially determined from the stack traceback.

So it's time to delete the line.  This is good as we need all the space
we can get for oops messages at times on consoles.

Acked-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-05-13 16:05:51 -07:00
Ingo Molnar 9cb5baba5e Merge commit 'v2.6.39-rc7' into sched/core 2011-05-12 09:36:18 +02:00
Rafael J. Wysocki f5a592f7d7 PM / PowerPC: Use struct syscore_ops instead of sysdevs for PM
Make some PowerPC architecture's code use struct syscore_ops
objects for power management instead of sysdev classes and sysdevs.

This simplifies the code and reduces the kernel's memory footprint.
It also is necessary for removing sysdevs from the kernel entirely in
the future.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-05-11 21:37:15 +02:00
Grant Likely 85f60ae4ee dt/flattree: explicitly pass command line pointer to early_init_dt_scan_chosen
This patch drops the reference to a global 'cmd_line' variable from
early_init_dt_scan_chosen, and instead passes the pointer to the command
line string via the *data argument.  Each architecture does something
slightly different with the initial command line, so it makes sense for
the architecture to be able to specify the variable name.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-05-11 14:53:18 +02:00
Bharat Bhushan 09000adb86 KVM: PPC: Fix issue clearing exit timing counters
Following dump is observed on host when clearing the exit timing counters

[root@p1021mds kvm]# echo -n 'c' > vm1200_vcpu0_timing
INFO: task echo:1276 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
echo          D 0ff5bf94     0  1276   1190 0x00000000
Call Trace:
[c2157e40] [c0007908] __switch_to+0x9c/0xc4
[c2157e50] [c040293c] schedule+0x1b4/0x3bc
[c2157e90] [c04032dc] __mutex_lock_slowpath+0x74/0xc0
[c2157ec0] [c00369e4] kvmppc_init_timing_stats+0x20/0xb8
[c2157ed0] [c0036b00] kvmppc_exit_timing_write+0x84/0x98
[c2157ef0] [c00b9f90] vfs_write+0xc0/0x16c
[c2157f10] [c00ba284] sys_write+0x4c/0x90
[c2157f40] [c000e320] ret_from_syscall+0x0/0x3c

        The vcpu->mutex is used by kvm_ioctl_* (KVM_RUN etc) and same was
used when clearing the stats (in kvmppc_init_timing_stats()). What happens
is that when the guest is idle then it held the vcpu->mutx. While the
exiting timing process waits for guest to release the vcpu->mutex and
a hang state is reached.

        Now using seprate lock for exit timing stats.

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-05-11 07:57:04 -04:00
Justin P. Mattock 70f23fd66b treewide: fix a few typos in comments
- kenrel -> kernel
- whetehr -> whether
- ttt -> tt
- sss -> ss

Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-05-10 10:16:21 +02:00
Frederic Weisbecker 925f83c085 hw_breakpoints, powerpc: Fix CONFIG_HAVE_HW_BREAKPOINT off-case in ptrace_set_debugreg()
We make use of ptrace_get_breakpoints() / ptrace_put_breakpoints() to
protect ptrace_set_debugreg() even if CONFIG_HAVE_HW_BREAKPOINT if off.
However in this case, these APIs are not implemented.

To fix this, push the protection down inside the relevant ifdef.
Best would be to export the code inside
CONFIG_HAVE_HW_BREAKPOINT into a standalone function to cleanup
the ifdefury there and call the breakpoint ref API inside. But
as it is more invasive, this should be rather made in an -rc1.

Fixes this build error:

  arch/powerpc/kernel/ptrace.c:1594: error: implicit declaration of function 'ptrace_get_breakpoints' make[2]: ***

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: LPPC <linuxppc-dev@lists.ozlabs.org>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: v2.6.33.. <stable@kernel.org>
Link: http://lkml.kernel.org/r/1304639598-4707-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-06 11:24:46 +02:00
Jack Miller a0496d450a powerpc: Add early debug for WSP platforms
Signed-off-by: Jack Miller <jack@codezen.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-06 13:32:41 +10:00
David Gibson a1d0d98daf powerpc: Add WSP platform
Add a platform for the Wire Speed Processor, based on the PPC A2.

This includes code for the ICS & OPB interrupt controllers, as well
as a SCOM backend, and SCOM based cpu bringup.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Jack Miller <jack@codezen.org>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-06 13:32:35 +10:00
Richard A Lary 82578e192b powerpc/eeh: Display eeh error location for bus and device
For adapters which have devices under a PCIe switch/bridge it is informative
  to display information for both the PCIe switch/bridge and the device on
  which the bus error was detected.

  rebased to powerpc-next

Signed-off-by: Richard A Lary <rlary@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-06 13:32:31 +10:00
Benjamin Herrenschmidt 40bd587a88 powerpc: Rename slb0_limit() to safe_stack_limit() and add Book3E support
slb0_limit() wasn't a very descriptive name. This changes it along with
a comment explaining what it's used for, and provides a 64-bit BookE
implementation.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-06 13:32:24 +10:00
Tseng-Hui (Frank) Lin 77eafe101a powerpc/pseries: Add support for IO event interrupts
This patch adds support for handling IO Event interrupts which come
through at the /event-sources/ibm,io-events device tree node.

The interrupts come through ibm,io-events device tree node are generated
by the firmware to report IO events. The firmware uses the same interrupt
to report multiple types of events for multiple devices. Each device may
have its own event handler. This patch implements a plateform interrupt
handler that is triggered by the IO event interrupts come through
ibm,io-events device tree node, pull in the IO events from RTAS and call
device event handlers registered in the notifier list.

Device event handlers are expected to use atomic_notifier_chain_register()
and atomic_notifier_chain_unregister() to register/unregister their
event handler in pseries_ioei_notifier_list list with IO event interrupt.
Device event handlers are responsible to identify if the event belongs
to the device event handler. The device event handle should return NOTIFY_OK
after the event is handled if the event belongs to the device event handler,
or NOTIFY_DONE otherwise.

Signed-off-by: Tseng-Hui (Frank) Lin <thlin@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-06 13:19:01 +10:00
Tseng-Hui (Frank) Lin 4cb4638079 powerpc/pseries: Add RTAS event log v6 definition
This patch adds definitions of non-IBM specific v6 extended log
definitions to rtas.h.

Signed-off-by: Tseng-Hui (Frank) Lin <tsenglin@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-06 13:18:59 +10:00
Stephen Rothwell 79af2187fa powerpc: Fix compile with icwsx support
Due to a collision between NO_CONTEXT->MMU_NO_CONTEXT change and
Anton's patch.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-06 13:18:34 +10:00
Anton Blanchard 228e548e60 net: Add sendmmsg socket system call
This patch adds a multiple message send syscall and is the send
version of the existing recvmmsg syscall. This is heavily
based on the patch by Arnaldo that added recvmmsg.

I wrote a microbenchmark to test the performance gains of using
this new syscall:

http://ozlabs.org/~anton/junkcode/sendmmsg_test.c

The test was run on a ppc64 box with a 10 Gbit network card. The
benchmark can send both UDP and RAW ethernet packets.

64B UDP

batch   pkts/sec
1       804570
2       872800 (+ 8 %)
4       916556 (+14 %)
8       939712 (+17 %)
16      952688 (+18 %)
32      956448 (+19 %)
64      964800 (+20 %)

64B raw socket

batch   pkts/sec
1       1201449
2       1350028 (+12 %)
4       1461416 (+22 %)
8       1513080 (+26 %)
16      1541216 (+28 %)
32      1553440 (+29 %)
64      1557888 (+30 %)

We see a 20% improvement in throughput on UDP send and 30%
on raw socket send.

[ Add sparc syscall entries. -DaveM ]

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-05 11:10:14 -07:00
Ingo Molnar 98bb318864 Merge branch 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/urgent 2011-05-04 20:33:42 +02:00
Richard A Lary ecb7390211 powerpc/pseries/eeh: Handle functional reset on non-PCIe device
Fundamental reset is an optional reset type supported only by PCIe adapters.
  Handle the unexpected case where a non-PCIe device has requested a
  fundamental reset. Try hot-reset as a fallback to handle this case.

Signed-off-by: Richard A Lary <rlary@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-04 16:02:38 +10:00
Richard A Lary 308fc4f8e1 powerpc/pseries/eeh: Propagate needs_freset flag to device at PE
For multifunction adapters with a PCI bridge or switch as the device
  at the Partitionable Endpoint(PE), if one or more devices below PE
  sets dev->needs_freset, that value will be set for the PE device.

  In other words, if any device below PE requires a fundamental reset
  the PE will request a fundamental reset.

Signed-off-by: Richard A Lary <rlary@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-04 16:02:36 +10:00
Brian King 9ee820fa00 powerpc/pseries: Add page coalescing support
Adds support for page coalescing, which is a feature on IBM Power servers
which allows for coalescing identical pages between logical partitions.
Hint text pages as coalesce candidates, since they are the most likely
pages to be able to be coalesced between partitions. This patch also
exports some page coalescing statistics available from firmware via
lparcfg.

[BenH: Moved a couple of things around to fix compile problems]

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-04 16:02:21 +10:00
Ben Hutchings 7707e4110e powerpc/kexec: Fix build failure on 32-bit SMP
Commit b987812b3f left
crash_kexec_wait_realmode() undefined for UP.

Commit 7c7a81b53e defined it for UP but
left it undefined for 32-bit SMP.

Seems like people are getting confused by nested #ifdef's, so move the
definitions of crash_kexec_wait_realmode() after the #ifdef CONFIG_SMP
section.

Compile-tested with 32-bit UP, 32-bit SMP and 64-bit SMP configurations.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-04 15:22:59 +10:00
KOSAKI Motohiro 104699c0ab powerpc: Convert old cpumask API into new one
Adapt new API.

Almost change is trivial. Most important change is the below line
because we plan to change task->cpus_allowed implementation.

-       ctx->cpus_allowed = current->cpus_allowed;

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-04 15:22:59 +10:00
Paul Mackerras 48404f2e95 powerpc: Save Come-From Address Register (CFAR) in exception frame
Recent 64-bit server processors (POWER6 and POWER7) have a "Come-From
Address Register" (CFAR), that records the address of the most recent
branch or rfid (return from interrupt) instruction for debugging purposes.

This saves the value of the CFAR in the exception entry code and stores
it in the exception frame.  We also make xmon print the CFAR value in
its register dump code.

Rather than extend the pt_regs struct at this time, we steal the orig_gpr3
field, which is only used for system calls, and use it for the CFAR value
for all exceptions/interrupts other than system calls.  This means we
don't save the CFAR on system calls, which is not a great problem since
system calls tend not to happen unexpectedly, and also avoids adding the
overhead of reading the CFAR to the system call entry path.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-04 15:22:09 +10:00
Paul Mackerras 1977b50212 powerpc: Save register r9-r13 values accurately on interrupt with bad stack
When we take an interrupt or exception from kernel mode and the stack
pointer is obviously not a kernel address (i.e. the top bit is 0), we
switch to an emergency stack, save register values and panic.  However,
on 64-bit server machines, we don't actually save the values of r9 - r13
at the time of the interrupt, but rather values corrupted by the
exception entry code for r12-r13, and nothing at all for r9-r11.

This fixes it by passing a pointer to the register save area in the paca
through to the bad_stack code in r3.  The register values are saved in
one of the paca register save areas (depending on which exception this
is).  Using the pointer in r3, the bad_stack code now retrieves the
saved values of r9 - r13 and stores them in the exception frame on the
emergency stack.  This also stores the normal exception frame marker
("regshere") in the exception frame.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-04 15:19:27 +10:00
Tseng-Hui (Frank) Lin 851d2e2fe8 powerpc: Add Initiate Coprocessor Store Word (icswx) support
Icswx is a PowerPC instruction to send data to a co-processor. On Book-S
processors the LPAR_ID and process ID (PID) of the owning process are
registered in the window context of the co-processor at initialization
time. When the icswx instruction is executed the L2 generates a cop-reg
transaction on PowerBus. The transaction has no address and the
processor does not perform an MMU access to authenticate the transaction.
The co-processor compares the LPAR_ID and the PID included in the
transaction and the LPAR_ID and PID held in the window context to
determine if the process is authorized to generate the transaction.

The OS needs to assign a 16-bit PID for the process. This cop-PID needs
to be updated during context switch. The cop-PID needs to be destroyed
when the context is destroyed.

Signed-off-by: Sonny Rao <sonnyrao@linux.vnet.ibm.com>
Signed-off-by: Tseng-Hui (Frank) Lin <thlin@linux.vnet.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-04 15:19:26 +10:00
Michael Neuling a32e252f7c powerpc: Use new CPU feature bit to select 2.06 tlbie
This removes MMU_FTR_TLBIE_206 as we can now use CPU_FTR_HVMODE_206.  It
also changes the logic to select which tlbie to use to be based on this
new CPU feature bit.

This also duplicates the ASM_FTR_IF/SET/CLR defines for CPU features
(copied from MMU features).

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-04 15:19:26 +10:00
Grant Likely 476eb49126 powerpc/irq: Stop exporting irq_map
First step in eliminating irq_map[] table entirely

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-04 15:02:15 +10:00
Linus Torvalds 5933f2ae35 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
  sysctl: net: call unregister_net_sysctl_table where needed
  Revert: veth: remove unneeded ifname code from veth_newlink()
  smsc95xx: fix reset check
  tg3: Fix failure to enable WoL by default when possible
  networking: inappropriate ioctl operation should return ENOTTY
  amd8111e: trivial typo spelling: Negotitate -> Negotiate
  ipv4: don't spam dmesg with "Using LC-trie" messages
  af_unix: Only allow recv on connected seqpacket sockets.
  mii: add support of pause frames in mii_get_an
  net: ftmac100: fix scheduling while atomic during PHY link status change
  usbnet: Transfer of maintainership
  usbnet: add support for some Huawei modems with cdc-ether ports
  bnx2: cancel timer on device removal
  iwl4965: fix "Received BA when not expected"
  iwlagn: fix "Received BA when not expected"
  dsa/mv88e6131: fix unknown multicast/broadcast forwarding on mv88e6085
  usbnet: Resubmit interrupt URB if device is open
  iwl4965: fix "TX Power requested while scanning"
  iwlegacy: led stay solid on when no traffic
  b43: trivial: update module info about ucode16_mimo firmware
  ...
2011-05-02 18:00:43 -07:00
Lucas De Marchi e9c549998d Revert wrong fixes for common misspellings
These changes were incorrectly fixed by codespell. They were now
manually corrected.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-04-26 23:31:11 -07:00
Richard A. Lary 65f47f1339 powerpc/eeh: Add support for ibm,configure-pe RTAS call
Added support for ibm,configure-pe RTAS call introduced with
PAPR 2.2.

Signed-off-by: Richard A. Lary <rlary@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:54 +10:00
Matt Evans 44ae3ab335 powerpc: Free up some CPU feature bits by moving out MMU-related features
Some of the 64bit PPC CPU features are MMU-related, so this patch moves
them to MMU_FTR_ bits.  All cpu_has_feature()-style tests are moved to
mmu_has_feature(), and seven feature bits are freed as a result.

Signed-off-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:52 +10:00
Anton Blanchard eca590f402 powerpc/rtas: Only sleep in rtas_busy_delay if we have useful work to do
RTAS returns extended error codes as a hint of how long the
OS might want to wait before retrying a call. If we have nothing
else useful to do we may as well call back straight away.

This was found when testing the new dynamic dma window feature.
Firmware split the zeroing of the TCE table into 32k chunks but
returned 9901 (which is a suggested wait of 10ms). All up this took
about 10 minutes to complete since msleep is jiffies based and will
round 10ms up to 20ms.

With the patch below we take 3 seconds to complete the same test.
The hint firmware is returning in the RTAS call should definitely
be decreased, but even if we slept 1ms each iteration this would
take 32s.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:50 +10:00
Michael Ellerman a7b8ad4058 powerpc/book3e: Fix extlb size
The calculation of the size for the exception save area of the TLB
miss handler is wrong, luckily it's too big not too small.

Rework it to make it a bit clearer, and also correct. We want 3 save
areas, each EX_TLB_SIZE _bytes_.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:48 +10:00
Michael Ellerman b91e136cdf powerpc: Use MSR_64BIT in sstep.c, fix kprobes on BOOK3E
We check MSR_SF a lot in sstep.c, to decide if we need to emulate the
truncation of values when running in 32-bit mode. Factor out that code
into a helper, and convert it and the other uses to use MSR_64BIT.

This fixes a bug on BOOK3E where kprobes would end up returning to a
32-bit address, because regs->nip was truncated, because (msr & MSR_SF)
was false.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:46 +10:00
Michael Ellerman 9f0b079320 powerpc: Use MSR_64BIT in places
Use the new MSR_64BIT in a few places. Some of these are already ifdef'ed
for BOOKE vs BOOKS, but it's still clearer, MSR_SF does not immediately
parse as "MSR bit for 64bit".

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:44 +10:00
Michael Ellerman 9d4a2925c2 powerpc: Add MSR_64BIT
The MSR bit which indicates 64-bit-ness is different between server and
booke, so add a #define which gives you the right mask regardless.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:43 +10:00
Wanlong Gao 0407a31429 powerpc: Fix build warning of the defconfigs
BT_L2CAP and BT_SCO have changed to bool .
Value 'm' has invalid .

Signed-off-by: Wanlong Gao <wanlong.gao@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:41 +10:00
Geert Uytterhoeven b618d2f043 powerpc/ps3: Update debug message for irq_set_chip_data()
commit ec775d0e70 ("powerpc: Convert to new irq_*
function names") changed a call from set_irq_chip_data() to
irq_set_chip_data(), but forgot to update the corresponding debug message

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:39 +10:00
Michael Ellerman 73706c3283 powerpc/irq: Dump chip data pointer in virq_mapping
This can be useful for differentiating interrupts on the same host
but with different chip data.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:37 +10:00
Michael Ellerman e70606eb9b powerpc/numa: Look for ibm, associativity-reference-points at the root
If we don't find ibm,associativity-reference-points as a child of
/rtas, look for it at the root of the tree instead. We use this on
Book3E where we have no RTAS but still use the sPAPR conventions
for NUMA.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:35 +10:00
Michael Ellerman 69b123684b powerpc/pci: Properly initialize IO workaround "private"
Even when no initfunc is provided.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:33 +10:00
Michael Ellerman d1109b7529 powerpc/pci: Make IO workarounds init implicit when first bus is registered
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:31 +10:00
Michael Ellerman 3cc30d0726 powerpc/pci: Move IO workarounds to the common kernel dir
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:29 +10:00
Michael Ellerman 21176fed25 powerpc/pci: Split IO vs MMIO indirect access hooks
The goal is to avoid adding overhead to MMIO when only PIO is needed

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:27 +10:00
Alexey Kardashevskiy efcac6589a powerpc: Per process DSCR + some fixes (try#4)
The DSCR (aka Data Stream Control Register) is supported on some
server PowerPC chips and allow some control over the prefetch
of data streams.

This patch allows the value to be specified per thread by emulating
the corresponding mfspr and mtspr instructions. Children of such
threads inherit the value. Other threads use a default value that
can be specified in sysfs - /sys/devices/system/cpu/dscr_default.

If a thread starts with non default value in the sysfs entry,
all children threads inherit this non default value even if
the sysfs value is changed later.

Signed-off-by: Alexey Kardashevskiy <aik@au1.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:19 +10:00
Jack Miller f0aae3238f powerpc/book3e: Flush IPROT protected TLB entries leftover by firmware
When we set up the TLB for ourselves on Book3E, we need to flush out any
old mappings established by the firmware or bootloader.  At present we
attempt this with a tlbilx to flush everything, but this will leave behind
any entries with the IPROT bit set.

There are several good reason firmware might establish mappings with IPROT,
and in fact ePAPR compliant firmwares are required to establish their
initial mapped area with IPROT.

This patch, therefore adds more complex code to scan through the TLB upon
entry and flush away any entries that are not our own.

Signed-off-by: Jack Miller <jack@codezen.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 13:02:16 +10:00
Benjamin Herrenschmidt 1a51dde139 powerpc/book3e: Use way 3 for linear mapping bolted entry
An erratum on A2 can lead to the bolted entry we insert for the linear
mapping being evicted, to avoid that write the bolted entry to way 3.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 13:02:14 +10:00
Michael Ellerman ca1769f7a3 powerpc: Index crit/dbg/mcheck stacks using cpu number on 64bit
In exc_lvl_ctx_init() we index into the crit/dbg/mcheck stacks using
the hard cpu id, but that assumes the hard cpu id is zero based and
contiguous. That is not the case on A2.

The root of the problem is that the 32bit code has no equivalent of the
paca to allow it to do the hard->soft mapping in assembler. Until the
32bit code is updated to handle that, index the stacks using the soft
cpu ids on 64bit and hard on 32 bit.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 13:02:12 +10:00
Benjamin Herrenschmidt bd49178109 powerpc: Add TLB size detection for TYPE_3E MMUs
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 13:02:10 +10:00
Benjamin Herrenschmidt 76b4eda866 powerpc: Add A2 cpu support
Add the cputable entry, regs and setup & restore entries for
the PowerPC A2 core.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 13:02:02 +10:00
Frederic Weisbecker 07fa7a0a8a powerpc, hw_breakpoints: Fix racy access to ptrace breakpoints
While the tracer accesses ptrace breakpoints, the child task may
concurrently exit due to a SIGKILL and thus release its breakpoints
at the same time. We can then dereference some freed pointers.

To fix this, hold a reference on the child breakpoints before
manipulating them.

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Prasad <prasad@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: v2.6.33.. <stable@kernel.org>
Link: http://lkml.kernel.org/r/1302284067-7860-4-git-send-email-fweisbec@gmail.com
2011-04-25 17:33:54 +02:00
Andrea Galbusera e2a85aeceb powerpc: Fix multicast problem in fs_enet driver
mac-fec.c was setting individual UDP address registers instead of multicast
group address registers when joining a multicast group.
This prevented from correctly receiving UDP multicast packets.
According to datasheet, replaced hash_table_high and hash_table_low
with grp_hash_table_high and grp_hash_table_low respectively.
Also renamed hash_table_* with grp_hash_table_* in struct fec declaration
for 8xx: these registers are used only for multicast there.

Tested on a MPC5121 based board.
Build tested also against mpc866_ads_defconfig.

Signed-off-by: Andrea Galbusera <gizero@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-21 16:59:30 -07:00
Ingo Molnar 42ac9e87fd Merge commit 'v2.6.39-rc4' into sched/core
Merge reason: Pick up upstream fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-21 11:39:28 +02:00
Benjamin Herrenschmidt 411e689d92 powerpc/nvram: Search for nvram using compatible
As well as searching for nodes with type = "nvram", search for nodes
that have compatible = "nvram". This can't be converted into a single
call to of_find_compatible_node() with a non-NULL type, because that
searches for a node that has _both_ type & compatible = "nvram".

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 17:01:20 +10:00
Michael Ellerman 5ca1237601 powerpc/xics: Move irq_host matching into the ics backend
An upcoming new ics backend will need to implement different matching
semantics to the current ones, which are essentially the RTAS ics
backends. So move the current match into the RTAS backend, and allow
other ics backends to override.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 17:01:20 +10:00
Benjamin Herrenschmidt ab814b938d powerpc: Add SCOM infrastructure
SCOM is a side-band configuration bus implemented on some processors.
This code provides a way for code to map and operate on devices via
SCOM, while the details of how that is implemented is left up to a
SCOM "controller" in the platform code.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 17:01:19 +10:00
Michael Ellerman cd85257905 powerpc/xics: xics.h relies on linux/interrupt.h
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 17:01:19 +10:00
Benjamin Herrenschmidt 931e1241a2 powerpc/a2: Add some #defines for A2 specific instructions
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 17:01:19 +10:00
Anton Blanchard b68a70c496 powerpc: Replace open coded instruction patching with patch_instruction/patch_branch
There are a few places we patch instructions without using
patch_instruction and patch_branch, probably because they
predated it. Fix it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 17:01:18 +10:00
Michael Ellerman f5be2dc0bd powerpc/nohash: Allocate stale_map[cpu] on CPU_UP_PREPARE not CPU_ONLINE
Currently we allocate the stale_map for a cpu when it comes online,
this leaves open a small window where a process can be scheduled
on the cpu before the stale_map is allocated. Instead allocate
the stale_map at CPU_UP_PREPARE time, that way it will be always
available before tasks start running.

It is possible the cpu fails to come up, in which case we should free
the stale_map, so add a CPU_UP_CANCELED case to do that.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 17:01:18 +10:00
Michael Ellerman de30097476 powerpc/smp: smp_ops->kick_cpu() should be able to fail
When we start a cpu we use smp_ops->kick_cpu(), which currently
returns void, it should be able to fail. Convert it to return
int, and update all uses.

Convert all the current error cases to return -ENOENT, which is
what would eventually be returned by __cpu_up() currently when
it doesn't detect the cpu as coming up in time.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 17:01:18 +10:00
David Gibson 6c5b59b913 powerpc/boot: Add an ePAPR compliant boot wrapper
This is a first cut at making bootwrapper code which will
produce a zImage compliant with the requirements set down
by ePAPR.

This is a very simple bootwrapper, taking the device tree
blob supplied by the ePAPR boot program and passing it on
to the kernel. It builds on the earlier patch to build a
relocatable ET_DYN zImage to meet the other ePAPR image
requirements.

For good measure we have some paranoid checks which will
generate warnings if some of the ePAPR entry condition
guarantees are not met.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 16:59:21 +10:00
Michael Ellerman 6975a783d7 powerpc/boot: Allow building the zImage wrapper as a relocatable ET_DYN
This patch adds code, linker script and makefile support to allow
building the zImage wrapper around the kernel as a position independent
executable.  This results in an ET_DYN instead of an ET_EXEC ELF output
file, which can be loaded at any location by the firmware and will
process its own relocations to work correctly at the loaded address.

This is of interest particularly since the standard ePAPR image format
must be an ET_DYN (although this patch alone is not sufficient to
produce a fully ePAPR compliant boot image).

Note for now we don't enable building with -pie for anything.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 16:59:20 +10:00
Michael Ellerman ee7a2aa3d3 powerpc/mm: Fix slice state initialization for Book3E
On Book3E, MMU_NO_CONTEXT != 0, but the slice_mm_new_context()
macro assumes that it is.  This means that the map of the
page sizes for each slice is always initialized to zeroes
(which happens to be 4k pages), rather than to the correct
default base page size value - which might be 64k.

This patch corrects the problem.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 16:59:20 +10:00
Michael Ellerman 5e8e7b404a powerpc/mm: Standardise on MMU_NO_CONTEXT
Use MMU_NO_CONTEXT as the initialiser for mm_context.id on
nohash and hash64.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 16:59:20 +10:00
Benjamin Herrenschmidt af2771493a powerpc: Improve prom_printf()
Adds the ability to print decimal numbers and adds some more
format string variants

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:25 +10:00
Benjamin Herrenschmidt dd79773864 powerpc: Perform an isync to synchronize CPUs coming out of secondary_hold
We need to do that to guarantee they see any code change done by
dynamic patching during boot.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:25 +10:00
Benjamin Herrenschmidt 948cf67c47 powerpc: Add NAP mode support on Power7 in HV mode
Wakeup comes from the system reset handler with a potential loss of
the non-hypervisor CPU state. We save the non-volatile state on the
stack and a pointer to it in the PACA, which the system reset handler
uses to restore things

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:24 +10:00
Benjamin Herrenschmidt 9d07bc841c powerpc: Properly handshake CPUs going out of boot spin loop
We need to wait a bit for them to have done their CPU setup
or we might end up with translation and EE on with different
LPCR values between threads

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:24 +10:00
Benjamin Herrenschmidt ad0693ee72 powerpc: Call CPU ->restore callback earlier on secondary CPUs
We do it before we loop on the PACA start flag. This way, we get a
chance to set critical SPRs on all CPUs before Linux tries to start
them up, which avoids problems when changing some bits such as LPCR
bits that need to be identical on all threads of a core or similar
things like that. Ideally, some of that should also be done before
the MMU is enabled, but that's a separate issue which would require
moving some of the SMP startup code earlier, let's not get there
for now, it works with that change alone.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:24 +10:00
Benjamin Herrenschmidt b144871cb5 powerpc: Initialize TLB and LPID register on HV mode Power7
In case entry from the bootloader isn't "clean"

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:24 +10:00
Benjamin Herrenschmidt 895796a8ab powerpc: Initialize LPCR:DPFD on power7 to a sane default
This sets the default data stream prefetch size for operating
systems that don't set their own value in DSCR. We use 4 which
is "medium".

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:23 +10:00
Paul Mackerras 673b189a2e powerpc: Always use SPRN_SPRG_HSCRATCH0 when running in HV mode
This uses feature sections to arrange that we always use HSPRG1
as the scratch register in the interrupt entry code rather than
SPRG2 when we're running in hypervisor mode on POWER7.  This will
ensure that we don't trash the guest's SPRG2 when we are running
KVM guests.  To simplify the code, we define GET_SCRATCH0() and
SET_SCRATCH0() macros like the GET_PACA/SET_PACA macros.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:23 +10:00
Benjamin Herrenschmidt b3e6b5dfcf powerpc: More work to support HV exceptions
Rework exception macros a bit to split offset from vector and add
some basic support for HDEC, HDSI, HISI and a few more.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:23 +10:00
Benjamin Herrenschmidt a5d4f3ad3a powerpc: Base support for exceptions using HSRR0/1
Pass the register type to the prolog, also provides alternate "HV"
version of hardware interrupt (0x500) and adjust LPES accordingly

We tag those interrupts by setting bit 0x2 in the trap number

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:22 +10:00
Benjamin Herrenschmidt 2dd60d79e0 powerpc: In HV mode, use HSPRG0 for PACA
When running in Hypervisor mode (arch 2.06 or later), we store the PACA
in HSPRG0 instead of SPRG1. The architecture specifies that SPRGs may be
lost during a "nap" power management operation (though they aren't
currently on POWER7) and this enables use of SPRG1 by KVM guests.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:22 +10:00
Benjamin Herrenschmidt 24cc67de62 powerpc: Define CPU feature for Architected 2.06 HV mode
This bit indicates that we are operating in hypervisor mode on a CPU
compliant to architecture 2.06 or later (currently server only).

We set it on POWER7 and have a boot-time CPU setup function that
clears it if MSR:HV isn't set (booting under a hypervisor).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:22 +10:00
Benjamin Herrenschmidt f6e17f9b0b powerpc/xics: Make sure we have a sensible default distribution server
Even when nothing is specified in the device tree, and despite the
fact that we don't setup links properly yet, we still need a reasonable
value in there or some interrupts won't be setup properly to point to
an existing processor.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:22 +10:00
Benjamin Herrenschmidt 50fb8ebe7c powerpc: Add more Power7 specific definitions
This adds more SPR definitions used on newer processors when running
in hypervisor mode. Along with some other P7 specific bits and pieces

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:21 +10:00
Benjamin Herrenschmidt 0b05ac6e24 powerpc/xics: Rewrite XICS driver
This is a significant rework of the XICS driver, too significant to
conveniently break it up into a series of smaller patches to be honest.

The driver is moved to a more generic location to allow new platforms
to use it, and is broken up into separate ICP and ICS "backends". For
now we have the native and "hypervisor" ICP backends and one common
RTAS ICS backend.

The driver supports one ICP backend instanciation, and many ICS ones,
in order to accomodate future platforms with multiple possibly different
interrupt "sources" mechanisms.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:02:35 +10:00
Benjamin Herrenschmidt 7b84b29b8c powerpc/powermac: Build fix with SMP and CPU hotplug
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-18 15:46:35 +10:00
Eric B Munson 86c74ab317 powerpc/perf_event: Skip updating kernel counters if register value shrinks
Because of speculative event roll back, it is possible for some event coutners
to decrease between reads on POWER7.  This causes a problem with the way that
counters are updated.  Delta calues are calculated in a 64 bit value and the
top 32 bits are masked.  If the register value has decreased, this leaves us
with a very large positive value added to the kernel counters.  This patch
protects against this by skipping the update if the delta would be negative.
This can lead to a lack of precision in the coutner values, but from my testing
the value is typcially fewer than 10 samples at a time.

Signed-off-by: Eric B Munson <emunson@mgebm.net>
Cc: stable@kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-18 13:08:23 +10:00
Stefan Roese 09597cfe93 powerpc: Don't write protect kernel text with CONFIG_DYNAMIC_FTRACE enabled
This problem was noticed on an MPC855T platform. Ftrace did oops
when trying to write to the kernel text segment.

Many thanks to Joakim for finding the root cause of this problem.

Signed-off-by: Stefan Roese <sr@denx.de>
Cc: Joakim Tjernlund <joakim.tjernlund@transmode.se>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-18 13:08:21 +10:00
Anton Blanchard 84ffae55af powerpc: Fix oops if scan_dispatch_log is called too early
We currently enable interrupts before the dispatch log for the boot
cpu is setup. If a timer interrupt comes in early enough we oops in
scan_dispatch_log:

Unable to handle kernel paging request for data at address 0x00000010

...

.scan_dispatch_log+0xb0/0x170
.account_system_vtime+0xa0/0x220
.irq_enter+0x88/0xc0
.do_IRQ+0x48/0x230

The patch below adds a check to scan_dispatch_log to ensure the
dispatch log has been allocated.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-18 13:08:19 +10:00
Nishanth Aravamudan 127493d5dc powerpc/pseries: Use a kmem cache for DTL buffers
PAPR specifies that DTL buffers can not cross AMS environments (aka CMO
in the PAPR) and can not cross a memory entitlement granule boundary
(4k). This is found in section 14.11.3.2 H_REGISTER_VPA of the PAPR.
kmalloc does not guarantee an alignment of the allocation, though,
beyond 8 bytes (at least in my understanding). Create a special kmem
cache for DTL buffers with the alignment requirement.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-18 13:08:08 +10:00
Paul Gortmaker 7c7a81b53e powerpc/kexec: Fix regression causing compile failure on UP
Recent commit b987812b3f caused
a compile failure on UP because a considerably large block
of the file was included within CONFIG_SMP, hence making a stub
function not exposed on UP builds when it needed to be.

Relocate the stub to the #else /* ! CONFIG_SMP */ section
and also annotate the relevant else/endif so that nobody
else falls into the same trap I did.

Reported-by: Michael Guntsche <mike@it-loops.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-18 13:06:45 +10:00
Benjamin Herrenschmidt 8f3dda75cb Merge remote branch 'kumar/merge' into merge 2011-04-18 12:09:37 +10:00
Alexandre Bounine 59f9996555 RapidIO/mpc85xx: fix possible mport registration problems
Fix a possible problem with mport registration left non-cleared after
fsl_rio_setup() exits on link error.  Abort mport initialization if
registration failed.

This patch is applicable to 2.6.39-rc1 only.  The problem does not exist
for earlier versions.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Thomas Moll <thomas.moll@sysgo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-14 16:06:56 -07:00
Peter Zijlstra 184748cc50 sched: Provide scheduler_ipi() callback in response to smp_send_reschedule()
For future rework of try_to_wake_up() we'd like to push part of that
function onto the CPU the task is actually going to run on.

In order to do so we need a generic callback from the existing scheduler IPI.

This patch introduces such a generic callback: scheduler_ipi() and
implements it as a NOP.

BenH notes: PowerPC might use this IPI on offline CPUs under rare conditions!

Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110405152728.744338123@chello.nl
2011-04-14 08:52:32 +02:00
Kumar Gala e5462d16f7 powerpc/85xx: disable Suspend support if SMP enabled
We currently dont have CPU Hotplug support working on 85xx so we need to
disable Suspsend support as it will force enabling of CPU Hotplug.

arch/powerpc/kernel/built-in.o: In function `cpu_die': arch/powerpc/kernel/smp.c:702: undefined reference to `start_secondary_resume'
make: *** [.tmp_vmlinux1] Error 1

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-04-12 06:29:21 -05:00
Scott Wood d51ad91535 powerpc/e500mc: Remove CPU_FTR_MAYBE_CAN_NAP/CPU_FTR_MAYBE_CAN_DOZE
e500mc does not support the HID0/MSR mechanism that is used by e500_idle
(and there are also issues with waking on certain types of interrupts).

Further, even if napping is never actually enabled, just having
CPU_FTR_CAN_NAP will cause machine_init() to overwrite the board's supplied
ppc_md.power_save().

We drop CPU_FTR_MAYBE_CAN_DOZE becuase we should use 'wait' instead on
e500mc.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-04-12 06:29:21 -05:00
Kumar Gala 11ed0db9f6 powerpc/book3e: Fix CPU feature handling on 64-bit e5500
The CPU_FTRS_POSSIBLE and CPU_FTRS_ALWAYS defines did not encompass
e5500 CPU features when built for 64-bit.  This causes issues with
cpu_has_feature() as it utilizes the POSSIBLE & ALWAYS defines as part
of its check.

Create a unique CPU_FTRS_E5500 (as its different from CPU_FTRS_E500MC),
created a new group for 64-bit Book3e based CPUs and add CPU_FTRS_E5500
to that group.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-04-12 06:29:21 -05:00
Prabhakar Kushwaha 07d9fce24d powerpc: Check device status before adding serial device
serial port nodes with the property status="disabled" are not usable and so
avoid adding "disabled" port with the system.

Signed-off-by: Prabhakar Kushwaha <prabhakar@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-04-12 06:29:21 -05:00
Prabhakar Kushwaha ef1fd2df85 powerpc/85xx: Don't add disabled PCIe devices
PCIe nodes with the property status="disabled" are not usable and so
avoid adding "disabled" PCIe bridge with the system.

Signed-off-by: Prabhakar Kushwaha <prabhakar@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-04-12 06:29:21 -05:00
Rafael J. Wysocki 1f112cee07 PM / Hibernate: Introduce CONFIG_HIBERNATE_CALLBACKS
Xen save/restore is going to use hibernate device callbacks for
quiescing devices and putting them back to normal operations and it
would need to select CONFIG_HIBERNATION for this purpose.  However,
that also would cause the hibernate interfaces for user space to be
enabled, which might confuse user space, because the Xen kernels
don't support hibernation.  Moreover, it would be wasteful, as it
would make the Xen kernels include a substantial amount of code that
they would never use.

To address this issue introduce new power management Kconfig option
CONFIG_HIBERNATE_CALLBACKS, such that it will only select the code
that is necessary for the hibernate device callbacks to work and make
CONFIG_HIBERNATION select it.  Then, Xen save/restore will be able to
select CONFIG_HIBERNATE_CALLBACKS without dragging the entire
hibernate code along with it.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Tested-by: Shriram Rajagopalan <rshriram@cs.ubc.ca>
2011-04-11 22:54:42 +02:00
Linus Torvalds 42933bac11 Merge branch 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6
* 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6:
  Fix common misspellings
2011-04-07 11:14:49 -07:00
Matt Evans c60e65d786 powerpc/pseries: Fix build without CONFIG_HOTPLUG_CPU
Signed-off-by: Matt Evans <matt@ozlabs.au.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-05 16:22:11 +10:00
Ryan Grimm c1854e0072 powerpc: Set nr_cpu_ids early and use it to free PACAs
Without this, "holes" in the CPU numbering can cause us to
free too many PACAs

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-05 16:22:11 +10:00
Benjamin Herrenschmidt f86d6b9b36 powerpc/pseries: Don't register global initcall
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-05 16:22:11 +10:00
Paul Gortmaker b987812b3f powerpc/kexec: Fix mismatched ifdefs for PPC64/SMP.
Commit b3df895aeb "powerpc/kexec: Add support for FSL-BookE"
introduced the original PPC_STD_MMU_64 checks around the function
crash_kexec_wait_realmode().   Then commit c2be05481f
"powerpc: Fix default_machine_crash_shutdown #ifdef botch" changed
the ifdef around the calling site to add a check on SMP, but the
ifdef around the function itself was left unchanged, leaving an
unused function for PPC_STD_MMU_64=y and SMP=n

Rather than have two ifdefs that can get out of sync like this,
simply put the corrected conditional around the function and use
a stub to get rid of one set of ifdefs completely.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-05 16:22:10 +10:00
Benjamin Herrenschmidt 83ebb3e344 Merge remote branch 'kumar/merge' into merge 2011-04-05 16:20:22 +10:00
Sylvestre Ledru f65e51d740 Documentation: fix minor typos/spelling
Fix some minor typos:
 * informations => information
 * there own => their own
 * these => this

Signed-off-by: Sylvestre Ledru <sylvestre.ledru@scilab.org>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-04 17:51:47 -07:00
Prabhakar Kushwaha d2f989262e powerpc/85xx: Update dts for PCIe memory maps to match u-boot of Px020RDB
PCIe memory address space is 1:1 mapped with u-boot.

Update dts of Px020RDB i.e. P1020RDB and P2020RDB to match the address map
changes in u-boot.

Signed-off-by: Prabhakar Kushwaha <prabhakar@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-04-04 09:30:40 -05:00
Benjamin Herrenschmidt 76d479a7ca powerpc/pmac/smp: Remove no-longer needed preempt workaround
The generic code properly re-initializes the preempt count in the
idle thread now

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:38 +11:00
Benjamin Herrenschmidt aeeafbfa7a powerpc/smp: Increase vdso_data->processorCount, not just decrease it
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:36 +11:00
Benjamin Herrenschmidt c56e58537d powerpc/smp: Create idle threads on demand and properly reset them
Instead of creating idle threads at boot for all possible CPUs, we
create them on demand, like x86 or ARM, and we properly call init_idle
to re-initialize an idle thread when a CPU was unplugged and is now
re-plugged.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:34 +11:00
Benjamin Herrenschmidt 105765f451 powerpc/smp: Don't expose per-cpu "cpu_state" array
Instead, keep it static, expose an accessor and use that from
the PowerMac code. Avoids easy namespace collisions and will
make it easier to consolidate with other implementations.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:33 +11:00
Benjamin Herrenschmidt 734796f123 powerpc/pmac/smp: Fix CPU hotplug crashes on some machines
On some machines that use i2c to synchronize the timebases (such
as PowerMac7,2/7,3 G5 machines), hotplug CPU would crash when
putting back a new CPU online due to the underlying i2c bus being
closed.

This uses the newly added bringup_done() callback to move the close
along with other housekeeping calls, and adds a CPU notifier to
re-open the i2c bus around subsequent hotplug operations

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:31 +11:00
Benjamin Herrenschmidt d72944457b powerpc/smp: Add a smp_ops->bringup_up() done callback
This allows us to stop abusing smp_ops->setup_cpu() for cleanup
tasks that have to take place after the initial boot time CPU
bringup.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:29 +11:00
Benjamin Herrenschmidt 62cc67b9df powerpc/pmac/smp: Properly NAP offlined CPU on G5
The current code soft-disables, and then goes to NAP mode which
turns interrupts on. That means that if an interrupt occurs, we
will hit the masked interrupt code path which isn't what we want,
as it will return with EE off, which will either get us out of
NAP mode, or fail to enter it (according to spec).

Instead, let's just rely on the fact that it is safe to take
decrementer interrupts on an offline CPU and leave interrupts
enabled. We can also get rid of the special case in asm for
power4_cpu_offline_powersave() and just use power4_idle().

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:25 +11:00
Benjamin Herrenschmidt e872e41b79 powerpc/pmac/smp: Remove HMT changes for PowerMac offline code
Those instructions do nothing on non-threaded processors such
as 970's used on those machines.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:23 +11:00
Benjamin Herrenschmidt 4c6130d9bb powerpc/pmac/smp: Consolidate 32-bit and 64-bit PowerMac cpu_die in one file
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:21 +11:00
Benjamin Herrenschmidt 45e07fd045 powerpc/pmac/smp: Fixup smp_core99_cpu_disable() and use it on 64-bit
Use the generic code, just add the MPIC priority setting,

I don't see any use in mucking around with the decrementer,
as 32-bit will have EE off all along, and 64-bit will be able
to deal with it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:20 +11:00
Benjamin Herrenschmidt 1c91cc5705 powerpc/pmac/smp: Rename fixup_irqs() to migrate_irqs() and use it on ppc32
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:18 +11:00
Benjamin Herrenschmidt fb49f864c3 powerpc/pmac/smp: Fix 32-bit PowerMac cpu_die
Use generic cpu_state, call idle_task_exit() properly, and
remove smp_core99_cpu_die() which isn't useful, the generic
function does the job just fine.
2011-04-01 15:37:16 +11:00
Benjamin Herrenschmidt 7a53a4fe70 powerpc/smp: Remove unused smp_ops->cpu_enable()
Remove the last remnants of cpu_enable(), everybody uses the normal
__cpu_up() path now

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:14 +11:00
Benjamin Herrenschmidt b527d07114 powerpc/smp: Remove unused generic_cpu_enable()
Nobody uses it, besides we should always use the normal __cpu_up
path anyways

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:12 +11:00
Benjamin Herrenschmidt 4fcb8833af powerpc/smp: Fix generic_mach_cpu_die()
This is used by some "soft" hotplug implementations. I needs to
call idle_task_exit() when the CPU is going away, and we remove
the now no-longer needed set_cpu_online() and local_irq_enable()
which are handled by the return to start_secondary

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:10 +11:00
Benjamin Herrenschmidt fa3f82c8bb powerpc/smp: soft-replugged CPUs must go back to start_secondary
Various thing are torn down when a CPU is hot-unplugged. That CPU
is expected to go back to start_secondary when re-plugged to re
initialize everything, such as clock sources, maps, ...

Some implementations just return from cpu_die() callback
in the idle loop when the CPU is "re-plugged". This is not enough.

We fix it using a little asm trampoline which resets the stack
and calls back into start_secondary as if we were all fresh from
boot. The trampoline already existed on ppc64, but we add it for
ppc32

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:09 +11:00
Benjamin Herrenschmidt 963e5d3b76 powerpc: Make decrementer interrupt robust against offlined CPUs
With some implementations, it is possible that a timer interrupt
occurs every few seconds on an offline CPU. In this case, just
re-arm the decrementer and return immediately

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:07 +11:00
Lucas De Marchi 25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Linus Torvalds 6aba74f279 Merge branch 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  avr32: Fix missing irq namespace conversion
  powerpc: qe_ic: Rename get_irq_desc_data and get_irq_desc_chip
  genirq: Remove the now obsolete config options and select statements
  arm: versatile : Fix typo introduced in irq namespace cleanup
  sound: Fixup the last user of the old irq functions
  genirq: Remove obsolete comment
  genirq: Remove now obsolete set_irq_wake()
  sh: Fix irq cleanup fallout
  x86: apb_timer: Fixup genirq fallout
  genirq: Fix misnamed label in handle_edge_eoi_irq

Fix up crazy conflict in arch/powerpc/include/asm/qe_ic.h:

 - commit eead4d5c63 ("powerpc: qe_ic: Rename get_irq_desc_data and
   get_irq_desc_chip") made the helper functions use
   irq_desc_get_handler_data() instead of the legacy (and no longer
   existing) get_irq_desc_data.

 - commit d4db35e8dc ("powerpc/qe_ic: Fix another breakage from the
   irq_data conversion") used irq_desc_get_chip_data() instead.

According to Thomas, the former is the correct direct conversion, but it
does look like both should work (arch/powerpc/sysdev/qe_lib/qe_ic.c
seems to initialize both to the same thing), and the chip data in some
ways is the more logical.  Somebody should really decide on one of the
other.

This merge picks irq_desc_get_handler_data() as the straightforward pure
conversion to new names, as per Thomas.
2011-03-30 09:35:52 -07:00
Richard Cochran eead4d5c63 powerpc: qe_ic: Rename get_irq_desc_data and get_irq_desc_chip
These two functions disappeared in commit

    0c6f8a8b91
    "genirq: Remove compat code"

but they still exist in qe_ic.h.
This patch renames the function to their new names.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Lennert Buytenhek <buytenh@secretlab.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
LKML-Reference: <20110330132504.GA31832@riccoc20.at.omicron.at>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-30 15:38:02 +02:00
Thomas Gleixner 78c8982564 genirq: Remove the now obsolete config options and select statements
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-30 14:13:23 +02:00
Benjamin Herrenschmidt d4db35e8dc powerpc/qe_ic: Fix another breakage from the irq_data conversion
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 11:17:15 +11:00
Benjamin Herrenschmidt b3cf2bb3d5 powerpc/8xx: Fix another breakage from the irq_data conversion
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 11:07:13 +11:00
Thomas Gleixner 8987eccde8 powerpc/cell: Use handle_edge_eoi_irq for real
Missed one instance when moving that to the core code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: michael@ellerman.id.au
Cc: mingo@elte.hu
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 10:44:24 +11:00
Anton Blanchard 23c6211043 powerpc/pseries: Enable Chelsio network and iWARP drivers
Ensure the Chelsio T3/T4 network drivers and iWARP drivers are
enabled in the pseries config.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 10:44:22 +11:00
Benjamin Herrenschmidt 84493804bb powerpc/mm: Move the STAB0 location to 0x8000 to make room in low memory
Recent upstream builds with allmodconfig fail due to lack of space
between 0x3000 and 0x6000. We have a hard block at 0x7000 but we can
spare a page by moving the STAB0 from 0x6000 to 0x8000.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 10:44:20 +11:00
Anton Blanchard ad5d1c888e powerpc: Fix accounting of softirq time when idle
commit cf9efce0ce (powerpc: Account time using timebase rather
than PURR) used in_irq() to detect if the time was spent in
interrupt processing. This only catches hardirq context so if we
are in softirq context and in the idle loop we end up accounting it
as idle time. If we instead use in_interrupt() we catch both softirq
and hardirq time.

The issue was found when running a network intensive workload. top
showed the following:

0.0%us,  1.1%sy,  0.0%ni, 85.7%id,  0.0%wa,  9.9%hi,  3.3%si,  0.0%st

85.7% idle. But this was wildly different to the perf events data.
To confirm the suspicion I ran something to keep the core busy:

# yes > /dev/null &

8.2%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa, 10.3%hi, 81.4%si,  0.0%st

We only got 8.2% of the CPU for the userspace task and softirq has
shot up to 81.4%.

With the patch below top shows the correct stats:

0.0%us,  0.0%sy,  0.0%ni,  5.3%id,  0.0%wa, 13.3%hi, 81.3%si,  0.0%st

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 10:44:18 +11:00
Milton Miller 2d86938a4e powerpc/pseries/smp: query-cpu-stopped-state support won't change
If a given firmware doesn't have a token to support query-cpu-stopped-state,
its not likely to change during the lifetime of the kernel.

Only print this information once, not once per secondary thread.

While here, make the line wrap grep friendly.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 10:44:16 +11:00
Milton Miller 943739fd59 powerpc/xics: Use hwirq for xics domain irq number
To try to avoid future confusion, rename irq to hwirq when it refers
to a xics domain number instead of a linux irq number.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 10:44:15 +11:00
Milton Miller 4f1fc48a73 powerpc/xics: Fix numberspace mismatch from irq_desc conversion
commit 79f26c268e (powerpc:
platforms/pseries irq_data conversion) pushed irq_desc down into many
functions, dererencing the descriptor irq field as late as possible.

But it incorrectly passed a linix virtural irq number to RTAS,
resulting in the interrupt not being disabled and possibly
other bad things, such as another interrupt being disabled and/or
a checkstop.

In addition this missed the point of xics_mask_unknown_vec and
the seperation of xics_mask_real_irq from xics_mask_irq.  When
xics_mask_unknown_vec is called it's because the hardware delivered an
irq source for which we have no linux irq allocated, and thefore we can
not have an irq_desc allocated.

Revert xics_mask_real_irq to its prior version, naming the argument
hwirq to highlight the difference.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 10:44:13 +11:00
Stephen Rothwell 834796a849 powerpc: Wire up new syscalls
These syscalls have been added recently:
	name_to_handle_at
	open_by_handle_at
	clock_adjtime
	syncfs

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 10:44:11 +11:00
Varun Sethi 05e02d7f88 powerpc/booke: Correct the SPRN_MAS5 definition.
339 is the SPR number for MAS5 documented by Power ISA 2.06, and
implemented by e500mc.  It is not yet used anywhere in the kernel,
so nothing should be relying on the wrong number.

Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 10:44:09 +11:00
Scott Wood 67eb54944b powerpc: ARCH_PFN_OFFSET should be unsigned long
pfns are unsigned long, but MEMORY_START is phys_addr_t.  This leads
to page_to_pfn() returning phys_addr_t, and thus type mismatches in a few
print statements.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 10:44:07 +11:00
Benjamin Herrenschmidt 6090912c4a powerpc: Implement dma_mmap_coherent()
This is used by Alsa to mmap buffers allocated with dma_alloc_coherent()
into userspace. We need a special variant to handle machines with
non-coherent DMAs as those buffers have "special" virt addresses and
require non-cachable mappings

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 10:44:00 +11:00
Jim Keniston 15d260b36f powerpc/nvram: Don't overwrite oops/panic report on normal shutdown
For normal halt, reboot, and poweroff events, refrain from overwriting
the lnx,oops-log partition.  Also, don't save the dmesg buffer on an
emergency-restart event if we've already saved it earlier in panic().

Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 10:36:23 +11:00
Stephen Rothwell ff56535d29 powerpc: Restore some misc devices to our configs
Uwe Kleine-König reported:

	while working on an defconfig (arm/mx27) I noticed that just updating
	it[1] results in removing CONFIG_EEPROM_AT24=y.  The reason is that
	since commit

		v2.6.36-5965-g5f2365d (misc devices: do not enable by default)

	MISC_DEVICES isn't enabled anymore by default.  So all defconfigs that
	have CONFIG_SOME_SYMBOL=y (or =m) (with SOME_SYMBOL depending on
	MISC_DEVICES) but not CONFIG_MISC_DEVICES=y suffer from the same
	problem.

This restores those misc devices to the powerpc defconfigs.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de
Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Acked-by: Uwe Kleine-König
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 10:36:23 +11:00
Thomas Gleixner 433c9c67c5 powerpc: Use generic show_interrupts()
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-29 14:48:13 +02:00
Thomas Gleixner ec775d0e70 powerpc: Convert to new irq_* function names
Scripted with coccinelle.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-29 14:48:12 +02:00
Thomas Gleixner 7bfbc1f283 powerpc: irq: Use irqdata based information
We want to tighten the irq_desc access. So use the new accessors for
the same information.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-29 14:48:12 +02:00
Thomas Gleixner ddaedd1c4a powerpc-fsl-msi-use-irqd.patch
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-29 14:48:11 +02:00
Thomas Gleixner 773e20d5b7 powerpc: xilinx: Cleanup flow type handling
The core irq_set_type() function updates the flow type when the chip
callback returns 0. So setting the type is bogus. The core also
updates the LEVEL flag.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-29 14:48:11 +02:00
Thomas Gleixner 1ac06cdadf powerpc: uic: Cleanup flow type handling
The core irq_set_type() function updates the flow type when the chip
callback returns 0. So setting the type is bogus. The core also
updates IRQ_LEVEL.

Use irq_data to get the level type information in the chip functions.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-29 14:48:10 +02:00
Thomas Gleixner 24a3f2e82b powerpc: mpic: Cleanup flow type handling
The core irq_set_type() function updates the flow type when the chip
callback returns 0. So setting the type is bogus.

The new core code allows to update the type in irq_data and return
IRQ_SET_MASK_OK_NOCOPY, so the core code will not touch it, except for
setting the IRQ_LEVEL flag.

Retrieve the IRQ_LEVEL information from irq_data which avoids a
redundant sparse irq lookup as well.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-29 14:48:10 +02:00
Thomas Gleixner 5fed97a9fd powerpc: mpc8xx_pic: Cleanup flow type handling
The core irq_set_type() function updates the flow type when the chip
callback returns 0. So setting the type is bogus. The level flag is
updated in the core as well.

Use the proper accessors for setting the irq handlers.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-29 14:48:10 +02:00