Commit Graph

65120 Commits

Author SHA1 Message Date
Benjamin Herrenschmidt 9424fabf86 powerpc: Fix 64-bit BookE FP unavailable exceptions
We were using CR0.EQ after EXCEPTION_COMMON, hoping it still
contained whether we came from userspace or kernel space.

However, under some circumstances, EXCEPTION_COMMON will
call C code and clobber non-volatile registers, so we really
need to re-load the previous MSR from the stackframe and
re-test.

While there, invert the condition to make the fast path more
obvious and remove the BUG_OPCODE which was a debugging
leftover and call .ret_from_except as we should.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-09 10:55:18 +11:00
Benjamin Herrenschmidt 990118c84b powerpc: Fix register clobbering when accumulating stolen time
When running under a hypervisor that supports stolen time accounting,
we may call C code from the macro EXCEPTION_PROLOG_COMMON in the
exception entry path, which clobbers CR0.

However, the FPU and vector traps rely on CR0 indicating whether we
are coming from userspace or kernel to decide what to do.

So we need to restore that value after the C call

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-09 10:55:16 +11:00
Benjamin Herrenschmidt 7ac21cd465 powerpc/xmon: Add display of soft & hard irq states
Also use local_paca instead of get_paca() to avoid getting into
the smp_processor_id() debugging code from the debugger

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-09 10:55:14 +11:00
Benjamin Herrenschmidt 9be72573a8 powerpc: Add support for page fault retry and fatal signals
Other architectures such as x86 and ARM have been growing
new support for features like retrying page faults after
dropping the mm semaphore to break contention, or being
able to return from a stuck page fault when a SIGKILL is
pending.

This refactors our implementation of do_page_fault() to
move the error handling out of line in a way similar to
x86 and adds support for those two features.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-09 10:55:12 +11:00
Benjamin Herrenschmidt 9f2f79e3a3 powerpc: Disable interrupts in 64-bit kernel FP and vector faults
If we get a floating point, altivec or vsx unavaible interrupt in
kernel, we trigger a kernel error. There is no point preserving
the interrupt state, in fact, that can even make debugging harder
as the processor state might change (we may even preempt) between
taking the exception and landing in a debugger.

So just make those 3 disable interrupts unconditionally.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---

v2: On BookE only disable when hitting the kernel unavailable
    path, otherwise it will fail to restore softe as
    fast_exception_return doesn't do it.
2012-03-09 10:55:10 +11:00
Benjamin Herrenschmidt a546498f3b powerpc: Call do_page_fault() with interrupts off
We currently turn interrupts back to their previous state before
calling do_page_fault(). This can be annoying when debugging as
a bad fault will potentially have lost some processor state before
getting into the debugger.

We also end up calling some generic code with interrupts enabled
such as notify_page_fault() with interrupts enabled, which could
be unexpected.

This changes our code to behave more like other architectures,
and make the assembly entry code call into do_page_faults() with
interrupts disabled. They are conditionally re-enabled from
within do_page_fault() in the same spot x86 does it.

While there, add the might_sleep() test in the case of a successful
trylock of the mmap semaphore, again like x86.

Also fix a bug in the existing assembly where r12 (_MSR) could get
clobbered by C calls (the DTL accounting in the exception common
macro and DISABLE_INTS) in some cases.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---

v2. Add the r12 clobber fix
2012-03-09 10:55:08 +11:00
Benjamin Herrenschmidt 1b70117924 powerpc: Improve behaviour of irq tracing on 64-bit exception entry
Some exceptions would unconditionally disable interrupts on entry,
which is fine, but calling lockdep every time not only adds more
overhead than strictly needed, but also means we get quite a few
"redudant" disable logged, which makes it hard to spot the really
bad ones.

So instead, split the macro used by the exception code into a
normal one and a separate one used when CONFIG_TRACE_IRQFLAGS is
enabled, and make the later skip th tracing if interrupts were
already disabled.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-09 10:55:06 +11:00
Benjamin Herrenschmidt 1421ae0b29 powerpc: Improve 64-bit syscall entry/exit
We unconditionally hard enable interrupts. This is unnecessary as
syscalls are expected to always be called with interrupts enabled.

While at it, we add a WARN_ON if that is not the case and
CONFIG_TRACE_IRQFLAGS is enabled (we don't want to add overhead
to the fast path when this is not set though).

Thus let's remove the enabling (and associated irq tracing) from
the syscall entry path. Also on Book3S, replace a few mfmsr
instructions with loads of PACAMSR from the PACA, which should be
faster & schedule better.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-09 10:55:04 +11:00
Benjamin Herrenschmidt fe1952fc0a powerpc: Rework runlatch code
This moves the inlines into system.h and changes the runlatch
code to use the thread local flags (non-atomic) rather than
the TIF flags (atomic) to keep track of the latch state.

The code to turn it back on in an asynchronous interrupt is
now simplified and partially inlined.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-09 10:55:02 +11:00
Benjamin Herrenschmidt 7450f6f03e powerpc: Use the same interrupt prolog for perfmon as other interrupts
The perfmon interrupt is the sole user of a special variant of the
interrupt prolog which differs from the one used by external and timer
interrupts in that it saves the non-volatile GPRs and doesn't turn the
runlatch on.

The former is unnecessary and the later is arguably incorrect, so
let's clean that up by using the same prolog. While at it we rename
that prolog to use the _ASYNC prefix.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-09 10:55:00 +11:00
Benjamin Herrenschmidt 4f8cf36f48 powerpc: Remove legacy iSeries bits from assembly files
This removes the various bits of assembly in the kernel entry,
exception handling and SLB management code that were specific
to running under the legacy iSeries hypervisor which is no
longer supported.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-09 10:54:59 +11:00
Stephen Rothwell b078766026 powerpc: clean up vio.c
This cleans up vio.c after the removal of the legacy iSeries platform.
It also removes some no longer referenced include files.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-09 10:35:23 +11:00
Stephen Rothwell 8ee3e0d696 powerpc: Remove the main legacy iSerie platform code
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-09 10:35:11 +11:00
Akinobu Mita 2d4b971287 powerpc/pmac: Use string library in nvram code
- Use memchr_inv to check if the data contains all 0xFF bytes.
  It is faster than looping for each byte.

- Use memcmp to compare memory areas

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-07 17:09:05 +11:00
Grant Likely ad5b7f1350 powerpc: Make SPARSE_IRQ required
All IRQs on powerpc are managed via irq_domain anyway, there isn't really
any advantage to turning SPARSE_IRQ off, and it's the direction we want
to take the kernel design anyway.  This patch makes powerpc always use
SPARSE_IRQ.

On pseries_defconfig, SPARSE_IRQ adds only about 0x300 bytes to the
.text sections, and removes about 0x20000 from the data section for the
static irq_desc table.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Ben Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-07 17:09:04 +11:00
Nishanth Aravamudan e9daf2ad7f powerpc/prom: Remove limit on maximum size of properties
On a 16TB system (using AMS/CMO), I get:

WARNING: ignoring large property [/ibm,dynamic-reconfiguration-memory] ibm,dynamic-memory length 0x000000000017ffec

and significantly less memory is thus shown to the partition. As far as
I can tell, the constant used is arbitrary. Ben Herrenschmidt provided
additional background that

> The limit was originally set because of Apple machines carrying ROM
> images in the device-tree, at a time where we were much more memory
> constrained than we are now.

and that it is likely not very useful any longer.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-07 17:06:10 +11:00
Matt Fleming a2007ce844 powerpc: Use set_current_blocked() and block_sigmask()
As described in e6fa16ab ("signal: sigprocmask() should do
retarget_shared_pending()") the modification of current->blocked is
incorrect as we need to check whether the signal we're about to block
is pending in the shared queue.

Also, use the new helper function introduced in commit 5e6292c0f2
("signal: add block_sigmask() for adding sigmask to current->blocked")
which centralises the code for updating current->blocked after
successfully delivering a signal and reduces the amount of duplicate
code across architectures. In the past some architectures got this
code wrong, so using this helper function should stop that from
happening again.

Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-07 17:06:09 +11:00
Joe Perches a2234b4bae powerpc: Use vsprintf extention %pf with builtin_return_address
Emit the function name not the address when possible.

builtin_return_address() gives an address.  When building
a kernel with CONFIG_KALLSYMS, emit the actual function
name not the address.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-07 17:06:09 +11:00
Jimi Xenidis de801de139 powerpc/icswx: Fix race condition with IPI setting ACOP
There is a race where a thread causes a coprocessor type to be valid
in its own ACOP _and_ in the current context, but it does not
propagate to the ACOP register of other threads in time for them to
use it.  The original code tries to solve this by sending an IPI to
all threads on the system, which is heavy handed, but unfortunately
still provides a window where the icswx is issued by other threads and
the ACOP is not up to date.

This patch detects that the ACOP DSI fault was a "false positive" and
syncs the ACOP and causes the icswx to be replayed.

Signed-off-by: Jimi Xenidis <jimix@pobox.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-07 17:06:09 +11:00
Anton Blanchard a6cf7ed511 powerpc/atomic: Implement atomic*_inc_not_zero
Implement atomic_inc_not_zero and atomic64_inc_not_zero. At the
moment we use atomic*_add_unless which requires us to put 0 and
1 constants into registers. We can also avoid a subtract by
saving the original value in a second temporary.

This removes 3 instructions from fget:

- c0000000001b63c0:       39 00 00 00     li      r8,0
- c0000000001b63c4:       39 40 00 01     li      r10,1
...
- c0000000001b63e8:       7c 0a 00 50     subf    r0,r10,r0

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-07 17:06:08 +11:00
Danny Kukawka 0a167e0a5c arch/powerpc/platforms/powernv/setup.c: included asm/xics.h twice
arch/powerpc/platforms/powernv/setup.c: included 'asm/xics.h' twice,
remove the duplicate.

Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-27 11:33:59 +11:00
Danny Kukawka ed7e3d1ca7 arch/powerpc/kvm/book3s_hv.c: included linux/sched.h twice
arch/powerpc/kvm/book3s_hv.c: included 'linux/sched.h' twice,
remove the duplicate.

Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-27 11:33:58 +11:00
Stephen Rothwell 3d066d77cf powerpc: remove CONFIG_PPC_ISERIES from the architecture Kconfig files
After this, we can remove the legacy iSeries code more easily.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-27 11:33:58 +11:00
Benjamin Herrenschmidt fe83364f0b powerpc/mpic: Fix allocation of reverse-map for multi-ISU mpics
When using a multi-ISU MPIC, we can interrupts up to
isu_size * MPIC_MAX_ISU, not just isu_size, so allocate
the right size reverse map.

Without this, the code will constantly fallback to
a linear search.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-27 11:33:58 +11:00
Benjamin Herrenschmidt f851013cb2 Merge remote-tracking branch 'origin/master' into next 2012-02-27 10:50:11 +11:00
Linus Torvalds ee3253241a This is the arch/c6x part of commit 7c43185138
which was dropped because c6x had not yet been merged at the time.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJPR5ojAAoJEOiN4VijXeFPflcP/3oiSnfN64ND3NTVBOB8zXyV
 /CB2F5UQsia3L+Qag3nuB2Vgm6l+5+yPbMHx2IO55XwhG8P31nS7FSsX/kA6YcTQ
 EXN7Y9Y3/1q1trLxZf4ZrFuBev794Tg2X6HmIQztWPp/ru7z5iftnWGQinboCHKf
 8ftsKMCJqOmp4IwX1lj3513qn4FjPI1Pnvdp1P9oAVCuvjrKHDiKxS7pbcODTsHH
 KkVxZQVbF2EHw3LtTFYiryBrVq5arkQU8mbhoouv40TIa2/CdM5F8FKtdEhMkdsy
 6g/JVNdzFGCPMo0eg5eX3oSkr/1ohcFjZmbvq3NkSmh9YmdPfBUMIFTylJr8ShEX
 MlSK2T86cm7W213LUcHW/A5PjwgFEWiBHBI6TJ+2ojfMrrT+bEX8LjGRqQGK+uFR
 P+JhaWYk1KH5njgJWyuh/Q1HXjM3ucURwd3/PWsmZ0fyZm7OcuX1BXGFqLdICrVx
 49DmgzxqIxoyxfIg5EGdLFlR5MQaTxxQirVPkCACWL6avttr767vJ3H76lFGkVSW
 Tg+2KDLF3Ta3t6GUrBo+yPpI5oAWPDW5FAlPpanxfDu6Up3QCWfj2hlk2DZZoJMS
 kHPUF3ShO0lcyp4ZXMNaE7pJZevtmJV++eHSPZvL36ySFHXwQ/NmXhkT8agA05kY
 8VVEBYxS0dnynTO0igTs
 =YDuM
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming

This is the arch/c6x part of commit 7c43185138 ("Kbuild: Use dtc's -d
(dependency) option") which was dropped because c6x had not yet been
merged at the time.

* tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming:
  Kbuild: Use dtc's -d (dependency) option
2012-02-24 09:01:46 -08:00
Linus Torvalds 37e79cbf7d SH/R-Mobile fixes for 3.3-rc5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.15 (GNU/Linux)
 
 iEYEABECAAYFAk9HFAgACgkQGkmNcg7/o7gQ3QCgveYM5yvFmfaeK+7uKyot5CkC
 RkkAoMoo0cOs3B0s5JlFv1Fl7IKfgVWk
 =odpk
 -----END PGP SIGNATURE-----

Merge tag 'rmobile-for-linus' of git://github.com/pmundt/linux-sh

SH/R-Mobile fixes for 3.3-rc5

* tag 'rmobile-for-linus' of git://github.com/pmundt/linux-sh:
  arch/arm/mach-shmobile/board-ag5evm.c: included linux/dma-mapping.h twice
  ARM: mach-shmobile: r8a7779 PFC IPSR4 fix
  ARM: mach-shmobile: sh73a0 PSTR 32-bit access fix
  ARM: mach-shmobile: add GPIO-to-IRQ translation to sh7372
  ARM: mach-shmobile: clock-sh73a0: add DSIxPHY clock support
  arm: fix compile failure in mach-shmobile/board-ag5evm.c
  ARM: mach-shmobile: mackerel: add ak4642 amixer settings on comment
  ARM: mach-shmobile: mackerel: use renesas_usbhs instead of r8a66597_hcd
  ARM: mach-shmobile: simplify MMCIF DMA configuration
  ARM: mach-shmobile: IRQ driven GPIO key support for Kota2
  ARM: mach-shmobile: sh73a0 IRQ sparse alloc fix
  ARM: mach-shmobile: sh73a0 PINT IRQ base fix
2012-02-24 08:57:22 -08:00
Linus Torvalds 0e69e08401 SuperH fixes for 3.3-rc5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.15 (GNU/Linux)
 
 iEYEABECAAYFAk9HE98ACgkQGkmNcg7/o7g4+QCfYIcWunicH7+tLMjH/h5MgHq2
 S5EAoJdIKyAPfV1tsIY0ykJZQP+H4t4X
 =QqTI
 -----END PGP SIGNATURE-----

Merge tag 'sh-for-linus' of git://github.com/pmundt/linux-sh

SuperH fixes for 3.3-rc5

* tag 'sh-for-linus' of git://github.com/pmundt/linux-sh:
  sh: Fix sh2a build error for CONFIG_CACHE_WRITETHROUGH
  sh: modify a resource of sh_eth_giga1_resources in board-sh7757lcr
  arch/sh: remove references to cpu_*_map.
  sh: Fix typo in pci-sh7780.c
  sh: add platform_device for SPI1 in setup-sh7757
  sh: modify resource for SPI0 in setup-sh7757
  sh: se7724: fix compile breakage
  sh: clkfwk: bugfix: use clk_reparent() for div6 clocks
  sh: clock-sh7724: fixup sh_fsi clock settings
  sh: sh7757lcr: update to the new MMCIF DMA configuration
  sh: fix the sh_mmcif_plat_data in board-sh7757lcr
  video: pvr2fb: Fix up spurious section mismatch warnings.
  sh: Defer to asm-generic/device.h.
2012-02-24 08:56:51 -08:00
Danny Kukawka 7372a4cd6c arch/arm/mach-shmobile/board-ag5evm.c: included linux/dma-mapping.h twice
arch/arm/mach-shmobile/board-ag5evm.c: included 'linux/dma-mapping.h'
twice, remove the duplicate.

Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-02-24 13:32:17 +09:00
Magnus Damm 74eb436ec0 ARM: mach-shmobile: r8a7779 PFC IPSR4 fix
Fix the bit field width information for the IPSR4 register
in the r8a7779 pin function controller (PFC).

Without this fix the Marzen board fails to receive data
over the serial console due to misconfigured pin function
for the RX pin.

Signed-off-by: Magnus Damm <damm@opensource.se>
Tested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Tested-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-02-24 13:24:59 +09:00
Magnus Damm 689189fb01 ARM: mach-shmobile: sh73a0 PSTR 32-bit access fix
Convert the sh73a0 SMP code to use 32-bit PSTR access.

This fixes wakeup from deep sleep for sh73a0 secondary CPUs.

Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-02-24 13:24:58 +09:00
Paul Mundt 35eb304b5c Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into rmobile-fixes-for-linus 2012-02-24 13:23:23 +09:00
Phil Edworthy 1ae911cba4 sh: Fix sh2a build error for CONFIG_CACHE_WRITETHROUGH
Signed-off-by: Phil Edworthy <phil.edworthy@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-02-24 13:21:46 +09:00
Shimoda, Yoshihiro befe0756d5 sh: modify a resource of sh_eth_giga1_resources in board-sh7757lcr
The latest sh_eth driver needs a resource of TSU in the channel 1,
if the controller has TSU registers. So, this patch adds the resource.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-02-24 13:21:46 +09:00
Rusty Russell 004f4ce9f3 arch/sh: remove references to cpu_*_map.
This has been obsolescent for a while; time for the final push.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: linux-sh@vger.kernel.org
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-02-24 13:21:45 +09:00
Masanari Iida ecfb68c673 sh: Fix typo in pci-sh7780.c
Correct spelling "erorr" to "error" in
arch/sh/drivers/pci/pci-sh7780.c

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-02-24 13:21:44 +09:00
Linus Torvalds 73c8e679aa Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
BenH says:
 'Here are a few more powerpc bits for you.  A stupid regression I
  introduced with my previous commit to "fix" program check exceptions
  (brown paper bag for me), fix the cpuidle default, a bug fix for
  something that isn't strictly speaking a regression but some upstream
  changes causes it to show in lockdep now while it didn't before, and
  finally a trivial one for rusty to make his life easier later on
  removing the old cpumask cruft. '

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc: Fix various issues with return to userspace
  cpuidle: Default y on powerpc pSeries
  powerpc: Fix program check handling when lockdep is enabled
  powerpc: Remove references to cpu_*_map
2012-02-23 11:48:36 -08:00
Michael Ellerman f2699491e0 powerpc/perf: Move perf core & PMU code into a subdirectory
The perf code has grown a lot since it started, and is big enough to
warrant its own subdirectory. For reference it's ~60% bigger than the
oprofile code. It declutters the kernel directory, makes it simpler to
grep for "just perf stuff", and allows us to shorten some filenames.

While we're at it, make it more obvious that we have two implementations
of the core perf logic. One for (roughly) Book3S CPUs, which was the
original implementation, and the other for Freescale embedded CPUs.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-23 10:50:04 +11:00
Mahesh Salgaonkar 12d9299241 fadump: Remove the phyp assisted dump code.
Remove the phyp assisted dump implementation which is not is use.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-23 10:50:03 +11:00
Mahesh Salgaonkar 67b43b9d7c fadump: Invalidate the fadump registration during machine shutdown.
If dump is active during system reboot, shutdown or halt then invalidate
the fadump registration as it does not get invalidated automatically.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-23 10:50:03 +11:00
Mahesh Salgaonkar b500afff11 fadump: Invalidate registration and release reserved memory for general use.
This patch introduces an sysfs interface '/sys/kernel/fadump_release_mem' to
invalidate the last fadump registration, invalidate '/proc/vmcore', release
the reserved memory for general use and re-register for future kernel dump.
Once the dump is copied to the disk, unlike phyp dump, the userspace tool
can release all the memory reserved for dump with one single operation of
echo 1 to '/sys/kernel/fadump_release_mem'.

Release the reserved memory region excluding the size of the memory required
for future kernel dump registration. And therefore, unlike kdump, Fadump
doesn't need a 2nd reboot to get back the system to the production
configuration.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-23 10:50:02 +11:00
Mahesh Salgaonkar d34c5f26cf fadump: Add PT_NOTE program header for vmcoreinfo
Introduce a PT_NOTE program header that points to physical address of
vmcoreinfo_note buffer declared in kernel/kexec.c. The vmcoreinfo
note buffer is populated during crash_fadump() at the time of system
crash.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-23 10:50:02 +11:00
Mahesh Salgaonkar ebaeb5ae24 fadump: Convert firmware-assisted cpu state dump data into elf notes.
When registered for firmware assisted dump on powerpc, firmware preserves
the registers for the active CPUs during a system crash. This patch reads
the cpu register data stored in Firmware-assisted dump format (except for
crashing cpu) and converts it into elf notes and updates the PT_NOTE program
header accordingly. The exact register state for crashing cpu is saved to
fadump crash info structure in scratch area during crash_fadump() and read
during second kernel boot.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-23 10:50:01 +11:00
Mahesh Salgaonkar 2df173d9e8 fadump: Initialize elfcore header and add PT_LOAD program headers.
Build the crash memory range list by traversing through system memory during
the first kernel before we register for firmware-assisted dump. After the
successful dump registration, initialize the elfcore header and populate
PT_LOAD program headers with crash memory ranges. The elfcore header is
saved in the scratch area within the reserved memory. The scratch area starts
at the end of the memory reserved for saving RMR region contents. The
scratch area contains fadump crash info structure that contains magic number
for fadump validation and physical address where the eflcore header can be
found. This structure will also be used to pass some important crash info
data to the second kernel which will help second kernel to populate ELF core
header with correct data before it gets exported through /proc/vmcore. Since
the firmware preserves the entire partition memory at the time of crash the
contents of the scratch area will be preserved till second kernel boot.

Since the memory dump exported through /proc/vmcore is in ELF format similar
to kdump, it will help us to reuse the kdump infrastructure for dump capture
and filtering. Unlike phyp dump, userspace tool does not need to refer any
sysfs interface while reading /proc/vmcore.

NOTE: The current design implementation does not address a possibility of
introducing additional fields (in future) to this structure without affecting
compatibility. It's on TODO list to come up with better approach to
address this.

Reserved dump area start => +-------------------------------------+
                            |  CPU state dump data                |
                            +-------------------------------------+
                            |  HPTE region data                   |
                            +-------------------------------------+
                            |  RMR region data                    |
Scratch area start       => +-------------------------------------+
                            |  fadump crash info structure {      |
                            |     magic nummber                   |
                     +------|---- elfcorehdr_addr                 |
                     |      |  }                                  |
                     +----> +-------------------------------------+
                            |  ELF core header                    |
Reserved dump area end   => +-------------------------------------+

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-23 10:50:01 +11:00
Mahesh Salgaonkar 3ccc00a7e0 fadump: Register for firmware assisted dump.
On 2012-02-20 11:02:51 Mon, Paul Mackerras wrote:
> On Thu, Feb 16, 2012 at 04:44:30PM +0530, Mahesh J Salgaonkar wrote:
>
> If I have read the code correctly, we are going to get this printk on
> non-pSeries machines or on older pSeries machines, even if the user
> has not put the fadump=on option on the kernel command line.  The
> printk will be annoying since there is no actual error condition.  It
> seems to me that the condition for the printk should include
> fw_dump.fadump_enabled.  In other words you should probably add
>
> 	if (!fw_dump.fadump_enabled)
> 		return 0;
>
> at the beginning of the function.

Hi Paul,

Thanks for pointing it out. Please find the updated patch below.

The existing patches above this (4/10 through 10/10) cleanly applies
on this update.

Thanks,
-Mahesh.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-23 10:50:01 +11:00
Mahesh Salgaonkar eb39c8803d fadump: Reserve the memory for firmware assisted dump.
Reserve the memory during early boot to preserve CPU state data, HPTE region
and RMA (real mode area) region data in case of kernel crash. At the time of
crash, powerpc firmware will store CPU state data, HPTE region data and move
RMA region data to the reserved memory area.

If the firmware-assisted dump fails to reserve the memory, then fallback
to existing kexec-based kdump.

Most of the code implementation to reserve memory has been
adapted from phyp assisted dump implementation written by Linas Vepstas
and Manish Ahuja

This patch also introduces a config option CONFIG_FA_DUMP for firmware
assisted dump feature on Powerpc (ppc64) architecture.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-23 10:50:01 +11:00
Kyle Moffett e55d7f737d powerpc/mpic: Remove duplicate MPIC_WANTS_RESET flag
There are two separate flags controlling whether or not the MPIC is
reset during initialization, which is completely unnecessary, and only
one of them can be specified in the device tree.

Also, most platforms in-tree right now do actually want to reset the
MPIC during initialization anyways, which means lots of duplicate code
passing the MPIC_WANTS_RESET flag.

Fix all of the callers which currently do not pass the MPIC_WANTS_RESET
flag to pass the MPIC_NO_RESET flag, then remove the MPIC_WANTS_RESET
flag and make the code reset the MPIC by default.

Signed-off-by: Kyle Moffett <Kyle.D.Moffett@boeing.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-23 10:50:00 +11:00
Kyle Moffett c1b8d45db4 powerpc/mpic: Add "last-interrupt-source" property to override hardware
The FreeScale PowerQUICC-III-compatible (mpc85xx/mpc86xx) MPICs do not
correctly report the number of hardware interrupt sources, so software
needs to override the detected value with "256".

To avoid needing to write custom board-specific code to detect that
scenario, allow it to be easily overridden in the device-tree.

Signed-off-by: Kyle Moffett <Kyle.D.Moffett@boeing.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-23 10:50:00 +11:00
Kyle Moffett 5019609fce powerpc/mpic: Remove MPIC_BROKEN_FRR_NIRQS and duplicate irq_count
The mpic->irq_count variable is only used as a software error-checking
limit to determine whether or not an IRQ number is valid.  In board code
which does not manually specify an IRQ count to mpic_alloc(), i.e. 0, it
is automatically detected from the number of ISUs and the ISU size.

In practice, all hardware ends up with irq_count == num_sources, so all
of the runtime checks on mpic->irq_count should just check the value of
mpic->num_sources instead.

When platform hardware does not correctly report the number of IRQs,
which only happens on the MPC85xx/MPC86xx, the MPIC_BROKEN_FRR_NIRQS
flag is used to override the detected value of num_sources with the
manual irq_count parameter.  Since there's no need to manually specify
the number of IRQs except in this case, the extra flag can be eliminated
and the test changed to "irq_count != 0".

Signed-off-by: Kyle Moffett <Kyle.D.Moffett@boeing.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-23 10:49:59 +11:00
Kyle Moffett 9ca163c860 fsl/mpic: Create and document the "single-cpu-affinity" device-tree flag
The Freescale MPIC (and perhaps others in the future) is incapable of
routing non-IPI interrupts to more than once CPU at a time.  Currently
all of the Freescale boards msut pass the MPIC_SINGLE_DEST_CPU flag to
mpic_alloc(), but that information should really be present in the
device-tree.

Older board code can't rely on the device-tree having the property set,
but newer platforms won't need it manually specified in the code.

[BenH: Remove unrelated changes, folded in a different patch]

Signed-off-by: Kyle Moffett <Kyle.D.Moffett@boeing.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-23 10:49:59 +11:00
Kyle Moffett 98cca250ae fsl/mpic: Document and use the "big-endian" device-tree flag
The MPIC code checks for a "big-endian" property and sets the flag
MPIC_BIG_ENDIAN if one is present, although prior to the "mpic->flags"
fixup that would never have worked anways.

Unfortunately, even now that it works properly, the Freescale mpic
device-node (the "PowerQUICC-III"-compatible one) does not specify it,
so all of the board ports need to manually pass it to mpic_alloc().

Document the flag and add it to the pq3 device tree.  Existing code will
still need to pass the MPIC_BIG_ENDIAN flag because their dtb may not
have this property, but new platforms shouldn't need to do so.

Signed-off-by: Kyle Moffett <Kyle.D.Moffett@boeing.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-23 10:49:58 +11:00
Kyle Moffett 3a7a7176e8 powerpc/mpic: Fix use of "flags" variable in mpic_alloc()
The mpic_alloc() function takes a "flags" parameter and assigns it into
the mpic->flags variable fairly early on, but several later pieces of
code detect various device-tree properties and save them into the
"mpic->flags" variable (EG: "big-endian" => MPIC_BIG_ENDIAN).

Unfortunately, a number of codepaths (including several which test the
flag MPIC_BIG_ENDIAN!) test "flags" instead of "mpic->flags", and get
wrong answers as a result.

Consolidate the device-tree flag tests early in mpic_alloc() and change
all of the checks after "mpic->flags" is init'ed to use "mpic->flags".

[BenH: Fixed up use of mpic->node before it's initialized]

Signed-off-by: Kyle Moffett <Kyle.D.Moffett@boeing.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-23 10:49:58 +11:00
Linus Torvalds 71c01b9d5b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu
It contains 3 important fixes for ColdFire based machines:
 - fix processes getting stuck when running from strace
 - fix kernel vmalloced pages not being visible in all kernel contexts
 - fix shared user pages sometimes being visible in another process
   context

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
  m68k: Do not set global share for non-kernel shared pages
  m68k: Add shared bit to Coldfire kernel page entries
  m68knommu: fix syscall tracing stuck process
2012-02-22 08:45:08 -08:00
Benjamin Herrenschmidt 18b246fa60 powerpc: Fix various issues with return to userspace
We have a few problems when returning to userspace. This is a
quick set of fixes for 3.3, I'll look into a more comprehensive
rework for 3.4. This fixes:

 - We kept interrupts soft-disabled when schedule'ing or calling
do_signal when returning to userspace as a result of a hardware
interrupt.

 - Rename do_signal to do_notify_resume like all other archs (and
do_signal_pending back to do_signal, which it was before Roland
changed it).

 - Add the missing call to key_replace_session_keyring() to
do_notify_resume().

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
2012-02-22 16:48:53 +11:00
Michael Ellerman 922b9f86a0 powerpc: Fix program check handling when lockdep is enabled
In commit 54321242af ("Disable interrupts early in Program Check"), we
switched from enabling to disabling interrupts in program_check_common.

Whereas ENABLE_INTS leaves r3 untouched, if lockdep is enabled DISABLE_INTS
calls into lockdep code and will clobber r3. That means we pass a bogus
struct pt_regs* into program_check_exception() and all hell breaks loose.

So load our regs pointer into r3 after we call DISABLE_INTS.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-22 16:48:49 +11:00
Rusty Russell 07d2f1a54a powerpc: Remove references to cpu_*_map
This has been obsolescent for a while; time for the final push.

In adjacent context, replaced old cpus_* with cpumask_*.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-22 16:48:47 +11:00
Linus Torvalds 6b0d1abb35 Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm
A few more things this time around.  The only thing warranting some
commentry is the modpost change, which allows folk building a Thumb2
enabled kernel to see section mismatch warnings.  This is why many
weren't noticed with OMAP.

* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
  ARM/audit: include audit header and fix audit arch
  ARM: OMAP: fix voltage domain build errors with PM_OPP disabled
  ARM/PCI: Remove ARM's duplicate definition of 'pcibios_max_latency'
  ARM: 7336/1: smp_twd: Don't register CPUFREQ notifiers if local timers are not initialised
  ARM: 7327/1: need to include asm/system.h in asm/processor.h
  ARM: 7326/2: PL330: fix null pointer dereference in pl330_chan_ctrl()
  ARM: 7164/3: PL330: Fix the size of the dst_cache_ctrl field
  ARM: 7325/1: fix v7 boot with lockdep enabled
  ARM: 7324/1: modpost: Fix section warnings for ARM for many compilers
  ARM: 7323/1: Do not allow ARM_LPAE on pre-ARMv7 architectures
2012-02-21 18:24:42 -08:00
Linus Torvalds faf309009e sys_poll: fix incorrect type for 'timeout' parameter
The 'poll()' system call timeout parameter is supposed to be 'int', not
'long'.

Now, the reason this matters is that right now 32-bit compat mode is
broken on at least x86-64, because the 32-bit code just calls
'sys_poll()' directly on x86-64, and the 32-bit argument will have been
zero-extended, turning a signed 'int' into a large unsigned 'long'
value.

We could just introduce a 'compat_sys_poll()' function for this, and
that may eventually be what we have to do, but since the actual standard
poll() semantics is *supposed* to be 'int', and since at least on x86-64
glibc sign-extends the argument before invocing the system call (so
nobody can actually use a 64-bit timeout value in user space _anyway_,
even in 64-bit binaries), the simpler solution would seem to be to just
fix the definition of the system call to match what it should have been
from the very start.

If it turns out that somebody somehow circumvents the user-level libc
64-bit sign extension and actually uses a large unsigned 64-bit timeout
despite that not being how poll() is supposed to work, we will need to
do the compat_sys_poll() approach.

Reported-by: Thomas Meyer <thomas@m3y3r.de>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-21 17:24:20 -08:00
Eric Paris 5180bb392a ARM/audit: include audit header and fix audit arch
Both bugs being fixed were introduced in:
29ef73b7a8

Include linux/audit.h to fix below build errors:

  CC      arch/arm/kernel/ptrace.o
arch/arm/kernel/ptrace.c: In function 'syscall_trace':
arch/arm/kernel/ptrace.c:919: error: implicit declaration of function 'audit_syscall_exit'
arch/arm/kernel/ptrace.c:921: error: implicit declaration of function 'audit_syscall_entry'
arch/arm/kernel/ptrace.c:921: error: 'AUDIT_ARCH_ARMEB' undeclared (first use in this function)
arch/arm/kernel/ptrace.c:921: error: (Each undeclared identifier is reported only once
arch/arm/kernel/ptrace.c:921: error: for each function it appears in.)
make[1]: *** [arch/arm/kernel/ptrace.o] Error 1
make: *** [arch/arm/kernel] Error 2

This part of the patch is:
Reported-by: Axel Lin <axel.lin@gmail.com>
Reported-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
(They both provided patches to fix it)

This patch also (at the request of the list) fixes the fact that
ARM has both LE and BE versions however the audit code was called as if
it was always BE.  If audit userspace were to try to interpret the bits
it got from a LE system it would obviously do so incorrectly.  Fix this
by using the right arch flag on the right system.

This part of the patch is:
Reported-by: Russell King - ARM Linux <linux@arm.linux.org.uk>

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-02-21 16:50:14 +00:00
Russell King 3ddd4d0c62 ARM: OMAP: fix voltage domain build errors with PM_OPP disabled
The voltage domain code wants the voltage tables, which are in the
opp*.c files.  These files aren't built when PM_OPP is disabled,
causing the following build errors at link time:

twl-common.c:(.init.text+0x2e48): undefined reference to `omap34xx_vddmpu_volt_data'
twl-common.c:(.init.text+0x2e4c): undefined reference to `omap34xx_vddcore_volt_data'
twl-common.c:(.init.text+0x2e5c): undefined reference to `omap36xx_vddmpu_volt_data'
twl-common.c:(.init.text+0x2e60): undefined reference to `omap36xx_vddcore_volt_data'
twl-common.c:(.init.text+0x2830): undefined reference to `omap44xx_vdd_mpu_volt_data'
twl-common.c:(.init.text+0x283c): undefined reference to `omap44xx_vdd_iva_volt_data'
twl-common.c:(.init.text+0x2844): undefined reference to `omap44xx_vdd_core_volt_data'

Acked-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-02-21 09:36:34 +00:00
Myron Stowe e23e8c0690 ARM/PCI: Remove ARM's duplicate definition of 'pcibios_max_latency'
The patch series to re-factor PCI's 'latency timer' setup (re:
http://marc.info/?l=linux-kernel&m=131983853831049&w=2) forgot to
remove the ARM specific definition of 'pcibios_max_latency' once such
had been moved into the pci core resulting in ARM related compile
errors -
  drivers/built-in.o:(.data+0x230): multiple definition of
  `pcibios_max_latency'
  arch/arm/common/built-in.o:(.data+0x40c): first defined here
  make[1]: *** [vmlinux.o] Error 1

In the series, patch 2/16 (commit 168c8619fd) converted the ARM
specific version of 'pcibios_set_master()' to a non-inlined version.
This was done in preperation for hosting it up into PCI's core, which
was done in patch 10/16 (commit 96c5590058) of the series (and
where the removal of ARM's 'pcibios_max_latency' was overlooked).

Reported-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-02-21 09:35:32 +00:00
Santosh Shilimkar 910ba598c8 ARM: 7336/1: smp_twd: Don't register CPUFREQ notifiers if local timers are not initialised
Current ARM local timer code registers CPUFREQ notifiers even in case
the twd_timer_setup() isn't called. That seems to be wrong and
would eventually lead to kernel crash on the CPU frequency transitions
on the SOCs where the local timer doesn't exist or broken because of
hardware BUG. Fix it by testing twd_evt and *__this_cpu_ptr(twd_evt).

The issue was observed with v3.3-rc3 and building an OMAP2+ kernel
on OMAP3 SOC which doesn't have TWD.

Below is the dump for reference :

 Unable to handle kernel paging request at virtual address 007e900
 pgd = cdc20000
 [007e9000] *pgd=00000000
 Internal error: Oops: 5 [#1] SMP
 Modules linked in:
 CPU: 0    Not tainted  (3.3.0-rc3-pm+debug+initramfs #9)
 PC is at twd_update_frequency+0x34/0x48
 LR is at twd_update_frequency+0x10/0x48
 pc : [<c001382c>]    lr : [<c0013808>]    psr: 60000093
 sp : ce311dd8  ip : 00000000  fp : 00000000
 r10: 00000000  r9 : 00000001  r8 : ce310000
 r7 : c0440458  r6 : c00137f8  r5 : 00000000  r4 : c0947a74
 r3 : 00000000  r2 : 007e9000  r1 : 00000000  r0 : 00000000
 Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment usr
 Control: 10c5387d  Table: 8dc20019  DAC: 00000015
 Process sh (pid: 599, stack limit = 0xce3102f8)
 Stack: (0xce311dd8 to 0xce312000)
 1dc0:                                                       6000c
 1de0: 00000001 00000002 00000000 00000000 00000000 00000000 00000
 1e00: ffffffff c093d8f0 00000000 ce311ebc 00000001 00000001 ce310
 1e20: c001386c c0437c4c c0e95b60 c0e95ba8 00000001 c0e95bf8 ffff4
 1e40: 00000000 00000000 c005ef74 ce310000 c0435cf0 ce311ebc 00000
 1e60: ce352b40 0007a120 c08d5108 c08ba040 c08ba040 c005f030 00000
 1e80: c08bc554 c032fe2c 0007a120 c08d4b64 ce352b40 c08d8618 ffff8
 1ea0: c08ba040 c033364c ce311ecc c0433b50 00000002 ffffffea c0330
 1ec0: 0007a120 0007a120 22222201 00000000 22222222 00000000 ce357
 1ee0: ce3d6000 cdc2aed8 ce352ba0 c0470164 00000002 c032f47c 00034
 1f00: c0331cac ce352b40 00000007 c032f6d0 ce352bbc 0003d090 c0930
 1f20: c093d8bc c03306a4 00000007 ce311f80 00000007 cdc2aec0 ce358
 1f40: ce8d20c0 00000007 b6fe5000 ce311f80 00000007 ce310000 0000c
 1f60: c000de74 ce987400 ce8d20c0 b6fe5000 00000000 00000000 0000c
 1f80: 00000000 00000000 001fbac8 00000000 00000007 001fbac8 00004
 1fa0: c000df04 c000dd60 00000007 001fbac8 00000001 b6fe5000 00000
 1fc0: 00000007 001fbac8 00000007 00000004 b6fe5000 00000000 00202
 1fe0: 00000000 beb565f8 00101ffc 00008e8c 60000010 00000001 00000
 [<c001382c>] (twd_update_frequency+0x34/0x48) from [<c008ac4c>] )
 [<c008ac4c>] (smp_call_function_single+0x17c/0x1c8) from [<c0013)
 [<c0013890>] (twd_cpufreq_transition+0x24/0x30) from [<c0437c4c>)
 [<c0437c4c>] (notifier_call_chain+0x44/0x84) from [<c005efe4>] ()
 [<c005efe4>] (__srcu_notifier_call_chain+0x70/0xa4) from [<c005f)
 [<c005f030>] (srcu_notifier_call_chain+0x18/0x20) from [<c032fe2)
 [<c032fe2c>] (cpufreq_notify_transition+0xc8/0x1b0) from [<c0333)
 [<c033364c>] (omap_target+0x1b4/0x28c) from [<c032f47c>] (__cpuf)
 [<c032f47c>] (__cpufreq_driver_target+0x50/0x64) from [<c0331d24)
 [<c0331d24>] (cpufreq_set+0x78/0x98) from [<c032f6d0>] (store_sc)
 [<c032f6d0>] (store_scaling_setspeed+0x5c/0x74) from [<c03306a4>)
 [<c03306a4>] (store+0x58/0x74) from [<c014d868>] (sysfs_write_fi)
 [<c014d868>] (sysfs_write_file+0x80/0xb4) from [<c00f2c2c>] (vfs)
 [<c00f2c2c>] (vfs_write+0xa8/0x138) from [<c00f2e9c>] (sys_write)
 [<c00f2e9c>] (sys_write+0x40/0x6c) from [<c000dd60>] (ret_fast_s)
 Code: e594300c e792210c e1a01000 e5840004 (e7930002)
 ---[ end trace 5da3b5167c1ecdda ]---

Reported-by: Kevin Hilman <khilman@ti.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-02-21 09:26:46 +00:00
Linus Torvalds 27e74da980 i387: export 'fpu_owner_task' per-cpu variable
(And define it properly for x86-32, which had its 'current_task'
declaration in separate from x86-64)

Bitten by my dislike for modules on the machines I use, and the fact
that apparently nobody else actually wanted to test the patches I sent
out.

Snif. Nobody else cares.

Anyway, we probably should uninline the 'kernel_fpu_begin()' function
that is what modules actually use and that references this, but this is
the minimal fix for now.

Reported-by: Josh Boyer <jwboyer@gmail.com>
Reported-and-tested-by: Jongman Heo <jongman.heo@samsung.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-20 19:34:10 -08:00
Linus Torvalds 39e255dab5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  [S390] correct ktime to tod clock comparator conversion
  [S390] 3215 deadlock with tty_wakeup
  [S390] incorrect PageTables counter for kvm page tables
  [S390] idle: avoid RCU usage in extended quiescent state
2012-02-20 16:13:39 -08:00
Linus Torvalds 7e16838d94 i387: support lazy restore of FPU state
This makes us recognize when we try to restore FPU state that matches
what we already have in the FPU on this CPU, and avoids the restore
entirely if so.

To do this, we add two new data fields:

 - a percpu 'fpu_owner_task' variable that gets written any time we
   update the "has_fpu" field, and thus acts as a kind of back-pointer
   to the task that owns the CPU.  The exception is when we save the FPU
   state as part of a context switch - if the save can keep the FPU
   state around, we leave the 'fpu_owner_task' variable pointing at the
   task whose FP state still remains on the CPU.

 - a per-thread 'last_cpu' field, that indicates which CPU that thread
   used its FPU on last.  We update this on every context switch
   (writing an invalid CPU number if the last context switch didn't
   leave the FPU in a lazily usable state), so we know that *that*
   thread has done nothing else with the FPU since.

These two fields together can be used when next switching back to the
task to see if the CPU still matches: if 'fpu_owner_task' matches the
task we are switching to, we know that no other task (or kernel FPU
usage) touched the FPU on this CPU in the meantime, and if the current
CPU number matches the 'last_cpu' field, we know that this thread did no
other FP work on any other CPU, so the FPU state on the CPU must match
what was saved on last context switch.

In that case, we can avoid the 'f[x]rstor' entirely, and just clear the
CR0.TS bit.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-20 10:58:54 -08:00
Linus Torvalds 80ab6f1e8c i387: use 'restore_fpu_checking()' directly in task switching code
This inlines what is usually just a couple of instructions, but more
importantly it also fixes the theoretical error case (can that FPU
restore really ever fail? Maybe we should remove the checking).

We can't start sending signals from within the scheduler, we're much too
deep in the kernel and are holding the runqueue lock etc.  So don't
bother even trying.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-20 10:58:28 -08:00
Linus Torvalds cea20ca3f3 i387: fix up some fpu_counter confusion
This makes sure we clear the FPU usage counter for newly created tasks,
just so that we start off in a known state (for example, don't try to
preload the FPU state on the first task switch etc).

It also fixes a thinko in when we increment the fpu_counter at task
switch time, introduced by commit 34ddc81a23 ("i387: re-introduce FPU
state preloading at context switch time").  We should increment the
*new* task fpu_counter, not the old task, and only if we decide to use
that state (whether lazily or preloaded).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-20 10:24:09 -08:00
Linus Torvalds be2874cb4e These are the bug fixes that have accumulated since 3.3-rc3 in arm-soc.
The majority of them are regression fixes for stuff that broke during
 the merge 3.3 window.
 
 The notable ones are:
 
 * The at91 ata drivers both broke because of an earlier cleanup patch that
   some other patches were based on. Jean-Christophe decided to remove
   the legacy at91_ide driver and fix the new-style at91-pata driver while
   keeping the cleanup patch. I almost rejected the patches for being too
   late and too big but in the end decided to accept them because they
   fix a regression.
 
 * A patch fixing build breakage from the sysdev-to-device conversion
   colliding with other changes touches a number of mach-s3c files.
 
 * b0654037 "ARM: orion: Fix Orion5x GPIO regression from MPP cleanup"
   is a mechanical change that unfortunately touches a lot of lines
   that should up in the diffstat.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIVAwUATztTM2CrR//JCVInAQJu3Q/+KN4npDjjJbRm1FR4J+z7dEy3631gt7Ku
 M64JuC2259da0AtXlHXoc8XB7ZrBkMR2k1n+Q42FqUFVILOXcrHSTId6osPQ8WYE
 TGWR0E2APP6/w4YH3dz0aTUauX0HhnWNP4ShWalWxw2Zsc1nhPNcMO3k57E/PNnp
 nUHb2ZR+Huqk9Eje6/Vkr7OQq7dhl0KJvITJKCT1H93vVYZc5l2O5ZytcOC3dsFg
 yMP/btmu9JlCenOwoKcQFv6ug0tWAYiY4ALqQujLN0kcf7rmjLLOG2HQrnycmeh3
 gv9jwK04gYxHkhPbCLCgO/bg906LVcYIl9/TY7jXK8oE4kR0vVxdQOWEzKIX5+KO
 dyAuwy3uGi4szG8f1DKnz1h7vR1MEyBVgQ+yRqnfhLh7mFmuZcOlGTzziD3csDXG
 qd5B2xf9WvLupfpbvgnHUUKEIJVfWPDoJeN3jGCOjd4+j8OzPR6yeAtU85TDQzIx
 IKs2x+0zrYMBre3R+m5vb9v3IhPb1wZU29eXXRzDmLuHJDM00Qc8LmpiWUoeu3cX
 DwuLstYLm8EhWN+LnjAABd3mKeR5tyBojK3EsDFRxIfz3mKHVNEAPE6Iky60Lfwr
 Pq+LgBBftFfcct70UyXWSK7UI92suavDgCHVejIxpbvIWF1UVY7S1mgmflZ1WZAL
 R5tdx6oe5Y4=
 =QYVn
 -----END PGP SIGNATURE-----

Merge tag 'fixes-3.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

These are the bug fixes that have accumulated since 3.3-rc3 in arm-soc.
The majority of them are regression fixes for stuff that broke during
the merge 3.3 window.

The notable ones are:

* The at91 ata drivers both broke because of an earlier cleanup patch that
  some other patches were based on. Jean-Christophe decided to remove
  the legacy at91_ide driver and fix the new-style at91-pata driver while
  keeping the cleanup patch. I almost rejected the patches for being too
  late and too big but in the end decided to accept them because they
  fix a regression.

* A patch fixing build breakage from the sysdev-to-device conversion
  colliding with other changes touches a number of mach-s3c files.

* b0654037 "ARM: orion: Fix Orion5x GPIO regression from MPP cleanup"
  is a mechanical change that unfortunately touches a lot of lines
  that should up in the diffstat.

* tag 'fixes-3.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (28 commits)
  ARM: at91: drop ide driver in favor of the pata one
  pata/at91: use newly introduced SMC accessors
  ARM: at91: add accessor to manage SMC
  ARM: at91:rtc/rtc-at91sam9: ioremap register bank
  ARM: at91: USB AT91 gadget registration for module
  ep93xx: fix build of vision_ep93xx.c
  ARM: OMAP2xxx: PM: fix OMAP2xxx-specific UART idle bug in v3.3
  ARM: orion: Fix USB phy for orion5x.
  ARM: orion: Fix Orion5x GPIO regression from MPP cleanup
  ARM: EXYNOS: Add cpu-offset property in gic device tree node
  ARM: EXYNOS: Bring exynos4-dt up to date
  ARM: OMAP3: cm-t35: fix section mismatch warning
  ARM: OMAP2: Fix the OMAP2 only build break seen with 2011+ ARM tool-chains
  ARM: tegra: paz00: fix wrong UART port on mini-pcie plug
  ARM: tegra: paz00: fix wrong SD1 power gpio
  i2c: tegra: Add devexit_p() for remove
  ARM: EXYNOS: Correct M-5MOLS sensor clock frequency on Universal C210 board
  ARM: EXYNOS: Correct framebuffer window size on Nuri board
  ARM: SAMSUNG: Fix missing api-change from subsys_interface change
  ARM: EXYNOS: Fix "warning: initialization from incompatible pointer type"
  ...
2012-02-18 15:40:00 -08:00
Linus Torvalds 06ca7c4376 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Here are a few more fixes for powerpc.  Some are regressions, the rest
is simple/obvious/nasty enough that I deemed it good to go now.

Here's also step one of deprecating legacy iSeries support: we are
removing it from the main defconfig.

Nobody seems to be using it anymore and the code is nasty to maintain,
(involves horrible hacks in various low level areas of the kernel) so we
plan to actually rip it out at some point.  For now let's just avoid
building it by default.  Stephen will proceed to do the actual removal
later (probably 3.4 or 3.5).

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/perf: power_pmu_start restores incorrect values, breaking frequency events
  powerpc/adb: Use set_current_state()
  powerpc: Disable interrupts early in Program Check
  powerpc: Remove legacy iSeries from ppc64_defconfig
  powerpc/fsl/pci: Fix PCIe fixup regression
  powerpc: Fix kernel log of oops/panic instruction dump
2012-02-18 15:26:37 -08:00
Linus Torvalds 34ddc81a23 i387: re-introduce FPU state preloading at context switch time
After all the FPU state cleanups and finally finding the problem that
caused all our FPU save/restore problems, this re-introduces the
preloading of FPU state that was removed in commit b3b0870ef3 ("i387:
do not preload FPU state at task switch time").

However, instead of simply reverting the removal, this reimplements
preloading with several fixes, most notably

 - properly abstracted as a true FPU state switch, rather than as
   open-coded save and restore with various hacks.

   In particular, implementing it as a proper FPU state switch allows us
   to optimize the CR0.TS flag accesses: there is no reason to set the
   TS bit only to then almost immediately clear it again.  CR0 accesses
   are quite slow and expensive, don't flip the bit back and forth for
   no good reason.

 - Make sure that the same model works for both x86-32 and x86-64, so
   that there are no gratuitous differences between the two due to the
   way they save and restore segment state differently due to
   architectural differences that really don't matter to the FPU state.

 - Avoid exposing the "preload" state to the context switch routines,
   and in particular allow the concept of lazy state restore: if nothing
   else has used the FPU in the meantime, and the process is still on
   the same CPU, we can avoid restoring state from memory entirely, just
   re-expose the state that is still in the FPU unit.

   That optimized lazy restore isn't actually implemented here, but the
   infrastructure is set up for it.  Of course, older CPU's that use
   'fnsave' to save the state cannot take advantage of this, since the
   state saving also trashes the state.

In other words, there is now an actual _design_ to the FPU state saving,
rather than just random historical baggage.  Hopefully it's easier to
follow as a result.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-18 14:03:48 -08:00
Linus Torvalds f94edacf99 i387: move TS_USEDFPU flag from thread_info to task_struct
This moves the bit that indicates whether a thread has ownership of the
FPU from the TS_USEDFPU bit in thread_info->status to a word of its own
(called 'has_fpu') in task_struct->thread.has_fpu.

This fixes two independent bugs at the same time:

 - changing 'thread_info->status' from the scheduler causes nasty
   problems for the other users of that variable, since it is defined to
   be thread-synchronous (that's what the "TS_" part of the naming was
   supposed to indicate).

   So perfectly valid code could (and did) do

	ti->status |= TS_RESTORE_SIGMASK;

   and the compiler was free to do that as separate load, or and store
   instructions.  Which can cause problems with preemption, since a task
   switch could happen in between, and change the TS_USEDFPU bit. The
   change to TS_USEDFPU would be overwritten by the final store.

   In practice, this seldom happened, though, because the 'status' field
   was seldom used more than once, so gcc would generally tend to
   generate code that used a read-modify-write instruction and thus
   happened to avoid this problem - RMW instructions are naturally low
   fat and preemption-safe.

 - On x86-32, the current_thread_info() pointer would, during interrupts
   and softirqs, point to a *copy* of the real thread_info, because
   x86-32 uses %esp to calculate the thread_info address, and thus the
   separate irq (and softirq) stacks would cause these kinds of odd
   thread_info copy aliases.

   This is normally not a problem, since interrupts aren't supposed to
   look at thread information anyway (what thread is running at
   interrupt time really isn't very well-defined), but it confused the
   heck out of irq_fpu_usable() and the code that tried to squirrel
   away the FPU state.

   (It also caused untold confusion for us poor kernel developers).

It also turns out that using 'task_struct' is actually much more natural
for most of the call sites that care about the FPU state, since they
tend to work with the task struct for other reasons anyway (ie
scheduling).  And the FPU data that we are going to save/restore is
found there too.

Thanks to Arjan Van De Ven <arjan@linux.intel.com> for pointing us to
the %esp issue.

Cc: Arjan van de Ven <arjan@linux.intel.com>
Reported-and-tested-by: Raphael Prevost <raphael@buro.asia>
Acked-and-tested-by: Suresh Siddha <suresh.b.siddha@intel.com>
Tested-by: Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-18 10:19:41 -08:00
Martin Schwidefsky cf1eb40f8f [S390] correct ktime to tod clock comparator conversion
The conversion of the ktime to a value suitable for the clock comparator
does not take changes to wall_to_monotonic into account. In fact the
conversion just needs the boot clock (sched_clock_base_cc) and the
total_sleep_time.

This is applicable to 3.2+ kernels.

CC: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-02-17 10:29:33 +01:00
Martin Schwidefsky 2320c57937 [S390] incorrect PageTables counter for kvm page tables
The page_table_free_pgste function is used for kvm processes to free page
tables that have the pgste extension. It calls pgtable_page_ctor instead of
pgtable_page_dtor which increases NR_PAGETABLE instead of decreasing it.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-02-17 10:29:33 +01:00
Heiko Carstens f3612304ee [S390] idle: avoid RCU usage in extended quiescent state
Avoid calling wake_up() from our NMI "bottom halve" from RCU extended
quiescent state in idle. wake_up() has RCU read-side critical sections
but this will be completely ignored by RCU if the cpu is in extended
quiescent state.
Which means that whatever object is being accessed from within the
read-side critical section can be freed concurrently from a different
cpu.
So make sure we leave extended quiescent state before calling wake_up().

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-02-17 10:29:32 +01:00
Linus Torvalds 4903062b54 i387: move AMD K7/K8 fpu fxsave/fxrstor workaround from save to restore
The AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception is
pending.  In order to not leak FIP state from one process to another, we
need to do a floating point load after the fxsave of the old process,
and before the fxrstor of the new FPU state.  That resets the state to
the (uninteresting) kernel load, rather than some potentially sensitive
user information.

We used to do this directly after the FPU state save, but that is
actually very inconvenient, since it

 (a) corrupts what is potentially perfectly good FPU state that we might
     want to lazy avoid restoring later and

 (b) on x86-64 it resulted in a very annoying ordering constraint, where
     "__unlazy_fpu()" in the task switch needs to be delayed until after
     the DS segment has been reloaded just to get the new DS value.

Coupling it to the fxrstor instead of the fxsave automatically avoids
both of these issues, and also ensures that we only do it when actually
necessary (the FP state after a save may never actually get used).  It's
simply a much more natural place for the leaked state cleanup.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-16 19:11:15 -08:00
Linus Torvalds b3b0870ef3 i387: do not preload FPU state at task switch time
Yes, taking the trap to re-load the FPU/MMX state is expensive, but so
is spending several days looking for a bug in the state save/restore
code.  And the preload code has some rather subtle interactions with
both paravirtualization support and segment state restore, so it's not
nearly as simple as it should be.

Also, now that we no longer necessarily depend on a single bit (ie
TS_USEDFPU) for keeping track of the state of the FPU, we migth be able
to do better.  If we are really switching between two processes that
keep touching the FP state, save/restore is inevitable, but in the case
of having one process that does most of the FPU usage, we may actually
be able to do much better than the preloading.

In particular, we may be able to keep track of which CPU the process ran
on last, and also per CPU keep track of which process' FP state that CPU
has.  For modern CPU's that don't destroy the FPU contents on save time,
that would allow us to do a lazy restore by just re-enabling the
existing FPU state - with no restore cost at all!

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-16 15:45:23 -08:00
Linus Torvalds 6d59d7a9f5 i387: don't ever touch TS_USEDFPU directly, use helper functions
This creates three helper functions that do the TS_USEDFPU accesses, and
makes everybody that used to do it by hand use those helpers instead.

In addition, there's a couple of helper functions for the "change both
CR0.TS and TS_USEDFPU at the same time" case, and the places that do
that together have been changed to use those.  That means that we have
fewer random places that open-code this situation.

The intent is partly to clarify the code without actually changing any
semantics yet (since we clearly still have some hard to reproduce bug in
this area), but also to make it much easier to use another approach
entirely to caching the CR0.TS bit for software accesses.

Right now we use a bit in the thread-info 'status' variable (this patch
does not change that), but we might want to make it a full field of its
own or even make it a per-cpu variable.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-16 13:33:12 -08:00
Linus Torvalds b6c66418dc i387: move TS_USEDFPU clearing out of __save_init_fpu and into callers
Touching TS_USEDFPU without touching CR0.TS is confusing, so don't do
it.  By moving it into the callers, we always do the TS_USEDFPU next to
the CR0.TS accesses in the source code, and it's much easier to see how
the two go hand in hand.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-16 12:22:48 -08:00
Linus Torvalds 15d8791cae i387: fix x86-64 preemption-unsafe user stack save/restore
Commit 5b1cbac377 ("i387: make irq_fpu_usable() tests more robust")
added a sanity check to the #NM handler to verify that we never cause
the "Device Not Available" exception in kernel mode.

However, that check actually pinpointed a (fundamental) race where we do
cause that exception as part of the signal stack FPU state save/restore
code.

Because we use the floating point instructions themselves to save and
restore state directly from user mode, we cannot do that atomically with
testing the TS_USEDFPU bit: the user mode access itself may cause a page
fault, which causes a task switch, which saves and restores the FP/MMX
state from the kernel buffers.

This kind of "recursive" FP state save is fine per se, but it means that
when the signal stack save/restore gets restarted, it will now take the
'#NM' exception we originally tried to avoid.  With preemption this can
happen even without the page fault - but because of the user access, we
cannot just disable preemption around the save/restore instruction.

There are various ways to solve this, including using the
"enable/disable_page_fault()" helpers to not allow page faults at all
during the sequence, and fall back to copying things by hand without the
use of the native FP state save/restore instructions.

However, the simplest thing to do is to just allow the #NM from kernel
space, but fix the race in setting and clearing CR0.TS that this all
exposed: the TS bit changes and the TS_USEDFPU bit absolutely have to be
atomic wrt scheduling, so while the actual state save/restore can be
interrupted and restarted, the act of actually clearing/setting CR0.TS
and the TS_USEDFPU bit together must not.

Instead of just adding random "preempt_disable/enable()" calls to what
is already excessively ugly code, this introduces some helper functions
that mostly mirror the "kernel_fpu_begin/end()" functionality, just for
the user state instead.

Those helper functions should probably eventually replace the other
ad-hoc CR0.TS and TS_USEDFPU tests too, but I'll need to think about it
some more: the task switching functionality in particular needs to
expose the difference between the 'prev' and 'next' threads, while the
new helper functions intentionally were written to only work with
'current'.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-16 09:15:04 -08:00
Anton Blanchard 9a45a9407c powerpc/perf: power_pmu_start restores incorrect values, breaking frequency events
perf on POWER stopped working after commit e050e3f0a7 (perf: Fix
broken interrupt rate throttling). That patch exposed a bug in
the POWER perf_events code.

Since the PMCs count upwards and take an exception when the top bit
is set, we want to write 0x80000000 - left in power_pmu_start. We were
instead programming in left which effectively disables the counter
until we eventually hit 0x80000000. This could take seconds or longer.

With the patch applied I get the expected number of samples:

          SAMPLE events:       9948

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: <stable@kernel.org>
2012-02-16 16:24:35 +11:00
Benjamin Herrenschmidt 54321242af powerpc: Disable interrupts early in Program Check
Program Check exceptions are the result of WARNs, BUGs, some
type of breakpoints, kprobe, and other illegal instructions.

We want interrupts (and thus preemption) to remain disabled
while doing the initial stage of testing the reason and
branching off to a debugger or kprobe, so we are still on
the original CPU which makes debugging easier in various cases.

This is how the code was intended, hence the local_irq_enable()
right in the middle of program_check_exception().

However, the assembly exception prologue for that exception was
incorrectly marked as enabling interrupts, which defeats that
(and records a redundant enable with lockdep).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-16 16:15:10 +11:00
Stephen Rothwell a1a1d1bfc9 powerpc: Remove legacy iSeries from ppc64_defconfig
Since we are heading towards removing the Legacy iSeries platform, start
by no longer building it for ppc64_defconfig.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-16 16:15:08 +11:00
Benjamin Herrenschmidt 13635dfdc6 powerpc/fsl/pci: Fix PCIe fixup regression
Upstream changes to the way PHB resources are registered
broke the resource fixup for FSL boards.

We can no longer rely on the resource pointer array for the PHB's
pci_bus structure, so let's leave it alone and go straight for
the PHB resources instead. This also makes the code generally
more readable.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-16 16:15:03 +11:00
Ira Snyder 40c8cefaaf powerpc: Fix kernel log of oops/panic instruction dump
A kernel oops/panic prints an instruction dump showing several
instructions before and after the instruction which caused the
oops/panic.

The code intended that the faulting instruction be enclosed in angle
brackets, however a bug caused the faulting instruction to be
interpreted by printk() as the message log level.

To fix this, the KERN_CONT log level is added before the actual text of
the printed message.

=== Before the patch ===

[ 1081.587266] Instruction dump:
[ 1081.590236] 7c000110 7c0000f8 5400077c 552907f6 7d290378 992b0003 4e800020 38000001
[ 1081.598034] 3d20c03a 9009a114 7c0004ac 39200000
[ 1081.602500]  4e800020 3803ffd0 2b800009

<4>[ 1081.587266] Instruction dump:
<4>[ 1081.590236] 7c000110 7c0000f8 5400077c 552907f6 7d290378 992b0003 4e800020 38000001
<4>[ 1081.598034] 3d20c03a 9009a114 7c0004ac 39200000
<98090000>[ 1081.602500]  4e800020 3803ffd0 2b800009

=== After the patch ===

[   51.385216] Instruction dump:
[   51.388186] 7c000110 7c0000f8 5400077c 552907f6 7d290378 992b0003 4e800020 38000001
[   51.395986] 3d20c03a 9009a114 7c0004ac 39200000 <98090000> 4e800020 3803ffd0 2b800009

<4>[   51.385216] Instruction dump:
<4>[   51.388186] 7c000110 7c0000f8 5400077c 552907f6 7d290378 992b0003 4e800020 38000001
<4>[   51.395986] 3d20c03a 9009a114 7c0004ac 39200000 <98090000> 4e800020 3803ffd0 2b800009

Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-16 16:11:23 +11:00
Olof Johansson fee6a3c33a ARM: 7327/1: need to include asm/system.h in asm/processor.h
For files that include asm/processor.h but not asm/system.h:

arch/arm/mach-msm/include/mach/uncompress.h: In function 'putc':
arch/arm/mach-msm/include/mach/uncompress.h:48:3: error: implicit declaration of function 'smp_mb' [-Werror=implicit-function-declaration]

In this case, smp_mb() is from the cpu_relax() call in the msm putc().

It likely went uncaught when the uncompress.h change went in since the
defconfig didn't enable that code path, but later changes (e76f4750f4:
ARM: debug: arrange Kconfig options more logically) resulted in the
option being on for msm_defconfig and thus exposed it.

Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-02-15 21:10:49 +00:00
Javi Merino 46e33c606a ARM: 7326/2: PL330: fix null pointer dereference in pl330_chan_ctrl()
This fixes the thrd->req_running field being accessed before thrd
is checked for null. The error was introduced in

   abb959f: ARM: 7237/1: PL330: Fix driver freeze

Reference: <1326458191-23492-1-git-send-email-mans.rullgard@linaro.org>

Cc: stable@kernel.org
Signed-off-by: Mans Rullgard <mans.rullgard@linaro.org>
Acked-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-02-15 21:10:49 +00:00
Javi Merino 4272f98a1a ARM: 7164/3: PL330: Fix the size of the dst_cache_ctrl field
dst_cache_ctrl affects bits 3, 1 and 0 of AWCACHE but it is a 3-bit
field in the Channel Control Register (see Table 3-21 of the DMA-330
Technical Reference Manual) and should be programmed as such.

Reference: <1320244259-10496-3-git-send-email-javi.merino@arm.com>

Signed-off-by: Javi Merino <javi.merino@arm.com>
Acked-by: Jassi Brar <jassisinghbrar@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-02-15 21:10:49 +00:00
Rabin Vincent 8e43a905dd ARM: 7325/1: fix v7 boot with lockdep enabled
Bootup with lockdep enabled has been broken on v7 since b46c0f7465
("ARM: 7321/1: cache-v7: Disable preemption when reading CCSIDR").

This is because v7_setup (which is called very early during boot) calls
v7_flush_dcache_all, and the save_and_disable_irqs added by that patch
ends up attempting to call into lockdep C code (trace_hardirqs_off())
when we are in no position to execute it (no stack, MMU off).

Fix this by using a notrace variant of save_and_disable_irqs.  The code
already uses the notrace variant of restore_irqs.

Reviewed-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-02-15 21:09:52 +00:00
Linus Torvalds c38e234562 i387: fix sense of sanity check
The check for save_init_fpu() (introduced in commit 5b1cbac37798: "i387:
make irq_fpu_usable() tests more robust") was the wrong way around, but
I hadn't noticed, because my "tests" were bogus: the FPU exceptions are
disabled by default, so even doing a divide by zero never actually
triggers this code at all unless you do extra work to enable them.

So if anybody did enable them, they'd get one spurious warning.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-15 08:05:18 -08:00
Catalin Marinas 08a183f02b ARM: 7323/1: Do not allow ARM_LPAE on pre-ARMv7 architectures
This patch expands the Kconfig dependencies for ARM_LPAE to not allow
enabling when architectures other than ARMv7 are built into the kernel.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-02-15 11:04:36 +00:00
Stephen Warren 62e37ca78b Kbuild: Use dtc's -d (dependency) option
This hooks dtc into Kbuild's dependency system.

Thus, for example, "make dtbs" will rebuild tegra-harmony.dtb if only
tegra20.dtsi has changed yet tegra-harmony.dts has not. The previous
lack of this feature recently caused me to have very confusing "git
bisect" results.

For ARM, it's obvious what to add to $(targets). I'm not familiar enough
with other architectures to know what to add there. Powerpc appears to
already add various .dtb files into $(targets), but the other archs may
need something added to $(targets) to work.

Signed-off-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Mark Salter <msalter@redhat.com>
2012-02-14 21:14:44 -05:00
Linus Torvalds ebf4bcbd5f Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Quoth BenH:
 "Here are a few powerpc fixes for 3.3, all pretty trivial.  I also
  added the patch to define GET_IP/SET_IP so we can use some more
  asm-generic goodness."

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/pseries/eeh: Fix crash when error happens during device probe
  powerpc/pseries: Fix partition migration hang in stop_topology_update
  powerpc/powernv: Disable interrupts while taking phb->lock
  powerpc: Fix WARN_ON in decrementer_check_overflow
  powerpc/wsp: Fix IRQ affinity setting
  powerpc: Implement GET_IP/SET_IP
  powerpc/wsp: Permanently enable PCI class code workaround
2012-02-14 15:21:25 -08:00
Linus Torvalds 694ce18ec3 Two fixes for VCPU offlining; One to fix the string format exposed
by the xen-pci[front|back] to conform to the one used in majority of
 PCI drivers; Two fixes to make the code more resilient to invalid
 configurations.
 
 Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJPOeReAAoJEFjIrFwIi8fJn9QIANP48kzrGg0uO4bjSf2h/z7G
 pp3ISdtVLk7pwMov2POBqskoXSq8E0yQAfNN8se183wqNXo3Dm4rU1DIG7HQFBk9
 sdcyfHI8x7pat9JClRhGxpQ23Ig9f1iWkShweCcZCO782vfxZyNd65i6t87X7uLq
 7SPtG1XH2RixTX7tHtKKBqdzZ0OMXOEkJ33dgCmyrn+wzohbKrFj5mg+NdOgmzEo
 VgsHPVtuq7orDROe+F9d91eAg0TILQ13th8xfWZ59lQATXu/zAlaueYt87tpy1pb
 oVQvumsn8Xev+7hct9My9Tw45D4m8YOSFLG2HcekkW2WtNmGhTTbIyMh9PsLugk=
 =NDYK
 -----END PGP SIGNATURE-----

Merge tag 'stable/for-linus-fixes-3.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

Two fixes for VCPU offlining; One to fix the string format exposed
by the xen-pci[front|back] to conform to the one used in majority of
PCI drivers; Two fixes to make the code more resilient to invalid
configurations.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

* tag 'stable/for-linus-fixes-3.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xenbus_dev: add missing error check to watch handling
  xen/pci[front|back]: Use %d instead of %1x for displaying PCI devfn.
  xen pvhvm: do not remap pirqs onto evtchns if !xen_have_vector_callback
  xen/smp: Fix CPU online/offline bug triggering a BUG: scheduling while atomic.
  xen/bootup: During bootup suppress XENBUS: Unable to read cpu state
2012-02-14 15:20:11 -08:00
Thadeu Lima de Souza Cascardo 778a785f02 powerpc/pseries/eeh: Fix crash when error happens during device probe
EEH may happen during a PCI driver probe. If the driver is trying to
access some register in a loop, the EEH code will try to print the
driver name. But the driver pointer in struct pci_dev is not set until
probe returns successfully.

Use a function to test if the device and the driver pointer is NULL
before accessing the driver's name.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-14 15:01:39 +11:00
Brian King 444080d13d powerpc/pseries: Fix partition migration hang in stop_topology_update
This fixes a hang that was observed during live partition migration.
Since stop_topology_update must not be called from an interrupt
context, call it earlier in the migration process. The hang observed
can be seen below:

WARNING: at kernel/timer.c:1011
Modules linked in: ip6t_LOG xt_tcpudp xt_pkttype ipt_LOG xt_limit ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT xt_state iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables ip6table_filter ip6_tables x_tables ipv6 fuse loop ibmveth sg ext3 jbd mbcache raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx raid10 raid1 raid0 scsi_dh_alua scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc dm_round_robin dm_multipath scsi_dh sd_mod crc_t10dif ibmvfc scsi_transport_fc scsi_tgt scsi_mod dm_snapshot dm_mod
NIP: c0000000000c52d8 LR: c00000000004be28 CTR: 0000000000000000
REGS: c00000005ffd77d0 TRAP: 0700   Not tainted  (3.2.0-git-00001-g07d106d)
MSR: 8000000000021032 <ME,CE,IR,DR>  CR: 48000084  XER: 00000001
CFAR: c00000000004be20
TASK = c00000005ec78860[0] 'swapper/3' THREAD: c00000005ec98000 CPU: 3
GPR00: 0000000000000001 c00000005ffd7a50 c000000000fbbc98 c000000000ec8340
GPR04: 00000000282a0020 0000000000000000 0000000000004000 0000000000000101
GPR08: 0000000000000012 c00000005ffd4000 0000000000000020 c000000000f3ba88
GPR12: 0000000000000000 c000000007f40900 0000000000000001 0000000000000004
GPR16: 0000000000000001 0000000000000000 0000000000000000 c000000001022310
GPR20: 0000000000000001 0000000000000000 0000000000200200 c000000001029e14
GPR24: 0000000000000000 0000000000000001 0000000000000040 c00000003f74bc80
GPR28: c00000003f74bc84 c000000000f38038 c000000000f16b58 c000000000ec8340
NIP [c0000000000c52d8] .del_timer_sync+0x28/0x60
LR [c00000000004be28] .stop_topology_update+0x20/0x38
Call Trace:
[c00000005ffd7a50] [c00000005ec78860] 0xc00000005ec78860 (unreliable)
[c00000005ffd7ad0] [c00000000004be28] .stop_topology_update+0x20/0x38
[c00000005ffd7b40] [c000000000028378] .__rtas_suspend_last_cpu+0x58/0x260
[c00000005ffd7bf0] [c0000000000fa230] .generic_smp_call_function_interrupt+0x160/0x358
[c00000005ffd7cf0] [c000000000036ec8] .smp_ipi_demux+0x88/0x100
[c00000005ffd7d80] [c00000000005c154] .icp_hv_ipi_action+0x5c/0x80
[c00000005ffd7e00] [c00000000012a088] .handle_irq_event_percpu+0x100/0x318
[c00000005ffd7f00] [c00000000012e774] .handle_percpu_irq+0x84/0xd0
[c00000005ffd7f90] [c000000000022ba8] .call_handle_irq+0x1c/0x2c
[c00000005ec9ba20] [c00000000001157c] .do_IRQ+0x22c/0x2a8
[c00000005ec9bae0] [c0000000000054bc] hardware_interrupt_entry+0x18/0x1c
Exception: 501 at .cpu_idle+0x194/0x2f8
    LR = .cpu_idle+0x194/0x2f8
[c00000005ec9bdd0] [c000000000017e58] .cpu_idle+0x188/0x2f8 (unreliable)
[c00000005ec9be90] [c00000000067ec18] .start_secondary+0x3e4/0x524
[c00000005ec9bf90] [c0000000000093e8] .start_secondary_prolog+0x10/0x14
Instruction dump:
ebe1fff8 4e800020 fbe1fff8 7c0802a6 f8010010 7c7f1b78 f821ff81 78290464
80090014 5400019e 7c0000d0 78000fe0 <0b000000> 4800000c 7c210b78 7c421378

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-14 15:01:39 +11:00
Michael Ellerman f1c853b53c powerpc/powernv: Disable interrupts while taking phb->lock
We need to disable interrupts when taking the phb->lock. Otherwise
we could deadlock with pci_lock taken from an interrupt.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-14 15:01:39 +11:00
Benjamin Herrenschmidt 6fe5f5f3ff powerpc: Fix WARN_ON in decrementer_check_overflow
We use __get_cpu_var() which triggers a false positive warning
in smp_processor_id() thinking interrupts are enabled (at this
point, they are soft-enabled but hard-disabled).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-14 15:01:38 +11:00
Benjamin Herrenschmidt 7a768d30ca powerpc/wsp: Fix IRQ affinity setting
We call the cache_hwirq_map() function with a linux IRQ number
but it expects a HW irq number. This triggers a BUG on multic-chip
setups in addition to not doing the right thing.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-14 15:01:38 +11:00
Srikar Dronamraju e62894273c powerpc: Implement GET_IP/SET_IP
With this change, helpers such as instruction_pointer() et al, get defined
in the generic header in terms of GET_IP

Removed the unnecessary definition of profile_pc in !CONFIG_SMP case as
suggested by Mike Frysinger.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-14 15:01:38 +11:00
Benjamin Herrenschmidt 454c0bfd0c powerpc/wsp: Permanently enable PCI class code workaround
It appears that on the Chroma card, the class code of the root
complex is still wrong even on DD2 or later chips. This could
be a firmware issue, but that breaks resource allocation so let's
unconditionally fix it up.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-14 15:01:38 +11:00