Merge tag 'imx-soc-3.11' into imx/dt
imx soc changes for 3.11:
* New SoCs i.MX6 SoloLite and Vybrid VF610 support
* imx5 and imx6 clock fixes and additions
* Update clock driver to use of_clk_init() function
* Refactor restart routine mxc_restart() to make it work for DT boot
as well
* Clean up mxc specific ulpi access ops
* imx defconfig updates
WM8962 needs 24MHz clock for its MCLK, so choose PLL4 as the parent of clko1.
Signed-off-by: Nicolin Chen <b42378@freescale.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
These options are useful for controlling backlight contrast via PWM.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
There are ulpi access ops implemented in drivers/usb/phy/phy-ulpi.c.
mxc access ops implement the same access operations within mach-imx. This
patch removes the mxc ulpi file and uses phy-ulpi instead for
imx_otg_ulpi_create.
phy-ulpi successfully tested with i.MX27 Phytec phyCARD-S (pca100).
Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Add clock support for Vybrid VF610. It uses dtc macro support to
define all clock IDs in vf610-clock.h to keep clock IDs coherent
between kernel and DT.
Signed-off-by: Jingchang Lu <b35083@freescale.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
commit 84344b43c (ARM: i.MX5: Allow DT clock providers) introduced the following
sparse warning:
arch/arm/mach-imx/clk.c:12:43: warning: Using plain integer as NULL pointer
There is no need to initialize phandle, so remove it.
Cc: Martin Fuzzey <mfuzzey@parkeon.com>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Fix the following sparse warning:
arch/arm/mach-imx/irq-common.c:24:5: warning: symbol 'mxc_set_irq_fiq' was not declared. Should it be static?
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Instead of explicitly calling clock initialization functions, we can
declare the functions with CLK_OF_DECLARE() and then call common
of_clk_init() to have them invoked properly.
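The idea can be sketched as a userspace analogy. This is not the kernel implementation: the table, the `demo_` names and the compatible strings are all hypothetical. Each table entry plays the role of one CLK_OF_DECLARE() use, and the loop plays the role of of_clk_init().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device tree node. */
struct fake_node {
	const char *compatible;
	int initialized;
};

typedef void (*clk_init_fn)(struct fake_node *np);

/* One entry per clock driver, analogous to one CLK_OF_DECLARE() use. */
struct clk_init_entry {
	const char *compatible;
	clk_init_fn init;
};

void demo_imx5_init(struct fake_node *np) { np->initialized = 1; }
void demo_imx6_init(struct fake_node *np) { np->initialized = 1; }

const struct clk_init_entry demo_clk_table[] = {
	{ "demo,imx5-ccm", demo_imx5_init },
	{ "demo,imx6-ccm", demo_imx6_init },
};

/*
 * Analogous in spirit to of_clk_init(): walk the nodes and call the init
 * function whose compatible string matches. Returns the number of nodes
 * that were initialized.
 */
int demo_clk_init_all(struct fake_node *nodes, size_t n)
{
	size_t entries = sizeof(demo_clk_table) / sizeof(demo_clk_table[0]);
	int count = 0;

	for (size_t i = 0; i < n; i++) {
		for (size_t j = 0; j < entries; j++) {
			if (strcmp(nodes[i].compatible,
				   demo_clk_table[j].compatible) == 0) {
				demo_clk_table[j].init(&nodes[i]);
				count++;
				break;
			}
		}
	}
	return count;
}
```

The platform code then needs only the single generic call; adding a new SoC means adding a table entry rather than another explicit init call.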
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Let the mx53 TVE driver be built by default.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
The CCM_CBCMR register (address 0x020C4018) has a different meaning
on the i.MX6 Quad/Dual than on the i.MX6 Solo/DualLite.
Compared to the i.MX6 Quad/Dual, the CCM_CBCMR register in the
i.MX6 Solo/DualLite reuses the gpu2d_core bits for the MLB clock
configuration.
Signed-off-by: Dirk Behme <dirk.behme@gmail.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
This patch adds the S/PDIF clocks for i.MX51 and i.MX53. Tested on i.MX53.
The i.MX51 has a second set of spdif_root clock dividers, and on i.MX53
there is an additional input to the spdif_xtal mux.
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
The MLB PLL should be handled internally in the MLB driver,
so remove it from pllv3.
Signed-off-by: Jiada Wang <jiada_wang@mentor.com>
CC: Dirk Behme <dirk.behme@de.bosch.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
The MLB PLL clock's operation doesn't fit the clock framework, and
it should be handled internally in the MLB driver.
Remove initialization of pll8_mlb clock device but leave its
declaration in mx6q_clks to avoid affecting imx6q clock numbering.
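The effect of keeping the declaration can be sketched in plain C. This is only an illustration with a hypothetical, shortened ID list (the `demo_` names are not the real mx6q_clks entries): the placeholder keeps every later ID at the same value even though no clock is registered for it.

```c
#include <stddef.h>

/* Hypothetical, shortened ID list; not the real mx6q_clks enum. */
enum demo_clks {
	DEMO_DUMMY,
	DEMO_PLL7,
	DEMO_PLL8_MLB,	/* kept as a placeholder so later IDs do not shift */
	DEMO_SATA,
	DEMO_CLK_MAX
};

const char *demo_clk_names[DEMO_CLK_MAX];

/* Register every clock except the MLB PLL, which its driver handles itself. */
void demo_register_clocks(void)
{
	demo_clk_names[DEMO_DUMMY] = "dummy";
	demo_clk_names[DEMO_PLL7] = "pll7";
	/* DEMO_PLL8_MLB is intentionally left unregistered (NULL). */
	demo_clk_names[DEMO_SATA] = "sata";
}

/* Return the registered name for an ID, or NULL if nothing is registered. */
const char *demo_clk_lookup(int id)
{
	if (id < 0 || id >= DEMO_CLK_MAX)
		return NULL;
	return demo_clk_names[id];
}
```

Anything that refers to DEMO_SATA by number, as a device tree would, keeps working because the enum value is unchanged.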
Signed-off-by: Jiada Wang <jiada_wang@mentor.com>
CC: Dirk Behme <dirk.behme@de.bosch.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Add clock support for i.MX6 SoloLite. It uses the dtc macro support to
define all clock IDs in imx6sl-clock.h, which will be included by both
the clock driver and the device tree sources, so that the data stays in
sync between kernel and DT.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
mxc_arch_reset_init() uses static mapping and calls clk_get_sys() to
get the clock. It's suitable for non-DT boot but not for DT boot where
dynamic mapping and of_clk_get() should be used instead. Create
mxc_arch_reset_init_dt() as the DT variant of mxc_arch_reset_init(),
and change DT platforms to use it.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
It's inappropriate to call clk_prepare() in mxc_restart(), because the
restart routine could be called in atomic context. Move clk_get() and
clk_prepare() into mxc_arch_reset_init() and only have the atomic part
clk_enable() be called in mxc_restart().
As a result, mxc_arch_reset_init() needs to be called after clk gets
initialized.
While there, it also changes printk(KERN_ERR ...) to pr_err() and adds
__init annotation for mxc_arch_reset_init().
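A minimal sketch of the split, using mock clock functions rather than the real clk API (all `demo_` names are hypothetical): the part that may sleep runs once at init time, and the restart path only performs the enable step.

```c
#include <stddef.h>

/* Hypothetical mock of a clock; the real struct clk is opaque. */
struct demo_clk {
	int prepared;
	int enabled;
};

struct demo_clk demo_wdog;
struct demo_clk *demo_wdog_clk;	/* NULL until init has run */

/* Stands in for clk_prepare(), which may sleep in the real API. */
int demo_clk_prepare(struct demo_clk *c)
{
	c->prepared = 1;
	return 0;
}

/* Stands in for clk_enable(), which is safe in atomic context. */
int demo_clk_enable(struct demo_clk *c)
{
	if (!c->prepared)
		return -1;
	c->enabled = 1;
	return 0;
}

/* Init time: look up and prepare the clock (sleeping is allowed here). */
void demo_reset_init(void)
{
	demo_wdog_clk = &demo_wdog;
	demo_clk_prepare(demo_wdog_clk);
}

/* Restart path: possibly atomic, so only the enable step happens here. */
int demo_restart(void)
{
	if (!demo_wdog_clk)
		return -1;
	return demo_clk_enable(demo_wdog_clk);
}
```

In this sketch demo_restart() fails if demo_reset_init() has not run yet, which mirrors the ordering requirement described above.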
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
The CCM_CBCMR register (address 0x020C4018) has a different meaning
on the i.MX6 Quad/Dual than on the i.MX6 Solo/DualLite.
Compared to the i.MX6 Quad/Dual, the CCM_CBCMR register in the
i.MX6 Solo/DualLite doesn't have a gpu3d_shader configuration and
moves the gpu2d_core configuration to that place.
Handle these i.MX6 Quad/Dual vs. i.MX6 Solo/DualLite clock differences
by using cpu_is_mx6dl().
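As a rough illustration of this kind of runtime selection (not the actual clock driver code), the two hypothetical helpers below return which clock a given CCM_CBCMR bit field configures, based on a flag that stands in for the result of cpu_is_mx6dl(). The second helper reflects the MLB reuse of the gpu2d_core bits described in the related MLB clock patch.

```c
#include <stdbool.h>

/*
 * The field that configures gpu3d_shader on Quad/Dual holds the
 * gpu2d_core configuration on Solo/DualLite.
 */
const char *demo_shader_field_owner(bool is_mx6dl)
{
	return is_mx6dl ? "gpu2d_core" : "gpu3d_shader";
}

/*
 * The field that configures gpu2d_core on Quad/Dual is reused for the
 * MLB clock on Solo/DualLite.
 */
const char *demo_gpu2d_field_owner(bool is_mx6dl)
{
	return is_mx6dl ? "mlb" : "gpu2d_core";
}
```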
Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
To improve performance and power consumption, add an i.MX6
specific L2 cache initialization.
This configuration is taken from Freescale's kernel patch
"ENGR00153601 [MX6]Adjust L2 cache parameter" [1]
with two additional improvements:
a) The L2X0_POWER_CTRL register has only the two bits we set, so
there is no need to read it first. Remove the register read done
in Freescale's patch.
b) In the L2X0_PREFETCH_CTRL register, besides the double linefill (bit[30]),
additionally enable the instruction and data prefetch (bit[29-28]).
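The prefetch control bits from item b) can be written out as a small sketch. The macro and function names are hypothetical; only the bit positions (30, 29 and 28) come from the description above.

```c
#include <stdint.h>

/* Bit positions as described in item b); names are illustrative only. */
#define DEMO_PREFETCH_DOUBLE_LINEFILL	(UINT32_C(1) << 30)
#define DEMO_PREFETCH_INSTR		(UINT32_C(1) << 29)
#define DEMO_PREFETCH_DATA		(UINT32_C(1) << 28)

/*
 * Compute the new prefetch control value: keep whatever was already set
 * and additionally enable double linefill plus instruction and data
 * prefetch.
 */
uint32_t demo_l2_prefetch_value(uint32_t old)
{
	return old | DEMO_PREFETCH_DOUBLE_LINEFILL |
	       DEMO_PREFETCH_INSTR | DEMO_PREFETCH_DATA;
}
```

Starting from a register value of 0, the three bits together give 0x70000000.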
Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
[1] http://git.freescale.com/git/cgit.cgi/imx/linux-2.6-imx.git/commit/arch/arm/mach-mx6/mm.c?h=imx_3.0.35_12.09.01&id=814656410b40c67a10b25300e51b0477b2bb96d1
Currently clock providers defined in the DT are not registered
on i.MX5 platforms since of_clk_init() is not called.
This is not a problem for the SOC's own clocks, which are registered
in code, but prevents the DT being used to define clocks for external
hardware.
Fix this by calling of_clk_init() and actually using the DT to obtain
the 4 SOC fixed clocks.
These are already defined in the DT but were previously just used to
manually obtain the rate.
Fall back to the old scheme for non DT platforms.
Since the same method may be useful for other i.MX platforms
implement the imx_obtain_fixed_clock() function in common code.
Actually changing other i.MX platforms to use this should be done
later by someone with access to the appropriate hardware.
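The lookup-with-fallback scheme might look roughly like the sketch below. It is a mock, not the real imx_obtain_fixed_clock(): the struct, the table that stands in for the DT, and the function signatures are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mock of a fixed-rate clock. */
struct demo_fixed_clk {
	const char *name;
	unsigned long rate;
};

/* Mock DT lookup: search a table of clocks that the "DT" defines. */
struct demo_fixed_clk *demo_dt_lookup(struct demo_fixed_clk *dt, size_t n,
				      const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(dt[i].name, name) == 0)
			return &dt[i];
	return NULL;
}

/*
 * Prefer the clock defined in the DT. If it is missing (non-DT boot),
 * fall back to a fixed clock built from the rate passed by the caller.
 */
struct demo_fixed_clk *demo_obtain_fixed_clock(struct demo_fixed_clk *dt,
					       size_t n, const char *name,
					       unsigned long fallback_rate,
					       struct demo_fixed_clk *storage)
{
	struct demo_fixed_clk *clk = dt ? demo_dt_lookup(dt, n, name) : NULL;

	if (clk)
		return clk;

	storage->name = name;
	storage->rate = fallback_rate;
	return storage;
}
```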
Signed-off-by: Martin Fuzzey <mfuzzey@parkeon.com>
Tested-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"These are a little later than I planned on since I got caught up with
handling merges for 3.11 most of the week.
Another week, another batch of fixes for arm-soc platforms.
Again, nothing controversial. A few more than would be ideal, but all
are valid fixes. In particular the prima2 panic patch is critical
since it fixes a problem where multiplatform kernels panic on all but
prima2 hardware."
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: SAMSUNG: pm: Adjust for pinctrl- and DT-enabled platforms
ARM: prima2: fix incorrect panic usage
arm: mvebu: armada-xp-{gp,openblocks-ax3-4}: specify PCIe range
ARM: Kirkwood: handle mv88f6282 cpu in __kirkwood_variant().
ARM: omap3: clock: fix wrong container_of in clock36xx.c
ARM: dts: OMAP5: Fix missing PWM capability to timer nodes
ARM: dts: omap4-panda|sdp: Fix mux for twl6030 IRQ pin and msecure line
ARM: dts: AM33xx: Fix properties on gpmc node
arm: omap2: fix AM33xx hwmod infos for UART2
ARM: OMAP3: Fix iva2_pwrdm settings for 3703
Pull networking fixes from David Miller:
1) Fix RTNL locking in batman-adv, from Matthias Schiffer.
2) Don't allow non-passthrough macvlan devices to set NOPROMISC via
netlink, otherwise we can end up with corrupted promisc counter
values on the device. From Michael S Tsirkin.
3) Fix stmmac driver build with debugging defines enabled, from Dinh
Nguyen.
4) Make sure the name string we give in the socket address in AF_PACKET
is NUL-terminated, from Daniel Borkmann.
5) Fix leaking of two uninitialized bytes of memory to userspace in
l2tp, from Guillaume Nault.
6) Clear IPCB(skb) before tunneling otherwise we touch dangling IP
options state and crash. From Saurabh Mohan.
7) Fix suspend/resume for davinci_mdio by using suspend_late and
resume_early. From Mugunthan V N.
8) Don't tag ip_tunnel_init_net and ip_tunnel_delete_net with
__net_{init,exit}, they can be called outside of those contexts.
From Eric Dumazet.
9) Fix RX length error in sh_eth driver, from Yoshihiro Shimoda.
10) Fix missing sctp_outq initialization in some code paths of SCTP
stack, from Neil Horman.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (21 commits)
sctp: fully initialize sctp_outq in sctp_outq_init
netiucv: Hold rtnl between name allocation and device registration.
tulip: Properly check dma mapping result
net: sh_eth: fix incorrect RX length error if R8A7740
ip_tunnel: remove __net_init/exit from exported functions
drivers: net: davinci_mdio: restore mdio clk divider in mdio resume
drivers: net: davinci_mdio: moving mdio resume earlier than cpsw ethernet driver
net/ipv4: ip_vti clear skb cb before tunneling.
tg3: Wait for boot code to finish after power on
l2tp: Fix sendmsg() return value
l2tp: Fix PPP header erasure and memory leak
bonding: fix igmp_retrans type and two related races
bonding: reset master mac on first enslave failure
packet: packet_getname_spkt: make sure string is always 0-terminated
net: ethernet: stmicro: stmmac: Fix compile error when STMMAC_XMIT_DEBUG used
be2net: Fix 32-bit DMA Mask handling
xen-netback: don't de-reference vif pointer after having called xenvif_put()
macvlan: don't touch promisc without passthrough
batman-adv: Don't handle address updates when bla is disabled
batman-adv: forward late OGMs from best next hop
...
Pull powerpc fixes from Benjamin Herrenschmidt:
"So here are 3 fixes still for 3.10. Fixes are simple, bugs are nasty
(though not recent regressions, nasty enough) and all targeted at
stable"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: Fix missing/delayed calls to irq_work
powerpc: Fix emulation of illegal instructions on PowerNV platform
powerpc: Fix stack overflow crash in resume_kernel when ftracing
Thanks to commit f91eb62f71 ("init: scream bloody murder if interrupts
are enabled too early"), "bloody murder" is now being screamed.
With a MIPS OCTEON config, we use on_each_cpu() in our
irq_chip.irq_bus_sync_unlock() function. This gets called early as a
result of the time_init() call. Because the !SMP version of
on_each_cpu() unconditionally enables irqs, we get:
WARNING: at init/main.c:560 start_kernel+0x250/0x410()
Interrupts were enabled early
CPU: 0 PID: 0 Comm: swapper Not tainted 3.10.0-rc5-Cavium-Octeon+ #801
Call Trace:
show_stack+0x68/0x80
warn_slowpath_common+0x78/0xb0
warn_slowpath_fmt+0x38/0x48
start_kernel+0x250/0x410
Suggested fix: Do what we already do in the SMP version of
on_each_cpu(), and use local_irq_save/local_irq_restore. Because we
need a flags variable, make it a static inline to avoid name space
issues.
[ Change from v1: Convert on_each_cpu to a static inline function, add
#include <linux/irqflags.h> to avoid build breakage on some files.
on_each_cpu_mask() and on_each_cpu_cond() suffer the same problem as
on_each_cpu(), but they are not causing !SMP bugs for me, so I will
defer changing them to a less urgent patch. ]
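The difference between the two !SMP variants can be simulated in userspace. The sketch below models the interrupt state with a plain variable; the `demo_` helpers are stand-ins for local_irq_save()/local_irq_restore(), not the kernel primitives, and the "old" variant is written from the description above (it enables interrupts unconditionally).

```c
/* Simulated interrupt state: 1 = enabled, 0 = disabled. */
int demo_irqs_enabled;

/* Stand-in for local_irq_save(): disable and return the previous state. */
unsigned long demo_irq_save(void)
{
	unsigned long flags = (unsigned long)demo_irqs_enabled;

	demo_irqs_enabled = 0;
	return flags;
}

/* Stand-in for local_irq_restore(): put back the saved state. */
void demo_irq_restore(unsigned long flags)
{
	demo_irqs_enabled = (int)flags;
}

/* Old behaviour: interrupts are enabled on exit no matter what. */
int demo_on_each_cpu_old(void (*func)(void *), void *info)
{
	demo_irqs_enabled = 0;
	func(info);
	demo_irqs_enabled = 1;
	return 0;
}

/* Fixed behaviour: the caller's interrupt state is preserved. */
static inline int demo_on_each_cpu(void (*func)(void *), void *info)
{
	unsigned long flags = demo_irq_save();

	func(info);
	demo_irq_restore(flags);
	return 0;
}

/* Example callback: count how often it was invoked. */
void demo_count_call(void *info)
{
	++*(int *)info;
}
```

Called with interrupts disabled, as during early boot, the old variant returns with them enabled while the fixed variant leaves them disabled.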
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull VFS fixes from Al Viro:
"Several fixes + obvious cleanup (you've missed a couple of open-coded
can_lookup() back then)"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
snd_pcm_link(): fix a leak...
use can_lookup() instead of direct checks of ->i_op->lookup
move exit_task_namespaces() outside of exit_notify()
fput: task_work_add() can fail if the caller has passed exit_task_work()
ncpfs: fix rmdir returns Device or resource busy
Merge tag 'for-linus-v3.10-rc6' of git://oss.sgi.com/xfs/xfs
Pull xfs fixes from Ben Myers:
- Remove noisy warnings about experimental support which spams the logs
- Add padding to align directory and attr structures correctly
- Set block number on child buffer on a root btree split
- Disable verifiers during log recovery for non-CRC filesystems
* tag 'for-linus-v3.10-rc6' of git://oss.sgi.com/xfs/xfs:
xfs: don't shutdown log recovery on validation errors
xfs: ensure btree root split sets blkno correctly
xfs: fix implicit padding in directory and attr CRC formats
xfs: don't emit v5 superblock warnings on write
Here are some small mei driver fixes for 3.10-rc6 that fix some reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge tag 'char-misc-3.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / misc fixes from Greg Kroah-Hartman:
"Here are some small mei driver fixes for 3.10-rc6 that fix some
reported problems"
* tag 'char-misc-3.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
mei: me: clear interrupts on the resume path
mei: nfc: fix nfc device freeing
mei: init: Flush scheduled work before resetting the device
Here are some small USB driver fixes that resolve some reported problems
for 3.10-rc6
Nothing major, just 3 USB serial driver fixes, and two chipidea fixes.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge tag 'usb-3.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg Kroah-Hartman:
"Here are some small USB driver fixes that resolve some reported
problems for 3.10-rc6
Nothing major, just 3 USB serial driver fixes, and two chipidea fixes"
* tag 'usb-3.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: chipidea: fix id change handling
usb: chipidea: fix no transceiver case
USB: pl2303: fix device initialisation at open
USB: spcp8x5: fix device initialisation at open
USB: f81232: fix device initialisation at open
When replaying interrupts (as a result of the interrupt occurring
while soft-disabled), in the case of the decrementer, we are exclusively
testing for a pending timer target. However we also use decrementer
interrupts to trigger the new "irq_work", which in this case would
be missed.
This changes the logic to force a replay in both cases of a timer
boundary reached and a decrementer interrupt having actually occurred
while disabled. The former test is still useful to catch cases where
a CPU having been hard-disabled for a long time completely misses the
interrupt due to a decrementer rollover.
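Reduced to its decision, the new replay condition is a logical OR of the two cases. The helper below is a hypothetical sketch; its name and parameters are not taken from the powerpc code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Replay the decrementer if either the next timer target has been
 * reached, or a decrementer interrupt actually fired while interrupts
 * were soft-disabled (which is how a pending irq_work is signalled).
 */
bool demo_need_dec_replay(uint64_t now, uint64_t next_target,
			  bool dec_fired_while_disabled)
{
	return now >= next_target || dec_fired_while_disabled;
}
```

Before the change, only the first half of the condition was tested, so an irq_work request with no timer due was missed.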
CC: <stable@vger.kernel.org> [v3.4+]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Steven Rostedt <rostedt@goodmis.org>
Normally, the kernel emulates a few instructions that are unimplemented
on some processors (e.g. the old dcba instruction), or privileged (e.g.
mfpvr). The emulation of unimplemented instructions is currently not
working on the PowerNV platform. The reason is that on these machines,
unimplemented and illegal instructions cause a hypervisor emulation
assist interrupt, rather than a program interrupt as on older CPUs.
Our vector for the emulation assist interrupt just calls
program_check_exception() directly, without setting the bit in SRR1
that indicates an illegal instruction interrupt. This fixes it by
making the emulation assist interrupt set that bit before calling
program_check_exception(). With this, old programs that use no-longer
implemented instructions such as dcba now work again.
CC: <stable@vger.kernel.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
It's possible for us to crash when running with ftrace enabled, eg:
Bad kernel stack pointer bffffd12 at c00000000000a454
cpu 0x3: Vector: 300 (Data Access) at [c00000000ffe3d40]
pc: c00000000000a454: resume_kernel+0x34/0x60
lr: c00000000000335c: performance_monitor_common+0x15c/0x180
sp: bffffd12
msr: 8000000000001032
dar: bffffd12
dsisr: 42000000
If we look at current's stack (paca->__current->stack) we see it is
equal to c0000002ecab0000. Our stack is 16K, and comparing to
paca->kstack (c0000002ecab3e30) we can see that we have overflowed our
kernel stack. This leads to us writing over our struct thread_info, and
in this case we have corrupted thread_info->flags and set
_TIF_EMULATE_STACK_STORE.
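The address arithmetic can be checked with a short sketch. The helper names are hypothetical; the stack base, the 16K size and the paca->kstack value are the ones quoted above. The stack grows down from base + 16K towards the base, where struct thread_info lives.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_THREAD_SIZE	UINT64_C(0x4000)	/* 16K kernel stack */

/* True if addr lies inside the stack region starting at base. */
bool demo_addr_in_stack(uint64_t base, uint64_t addr)
{
	return addr >= base && addr < base + DEMO_THREAD_SIZE;
}

/* Bytes consumed when the stack pointer sits at sp. */
uint64_t demo_stack_bytes_used(uint64_t base, uint64_t sp)
{
	return base + DEMO_THREAD_SIZE - sp;
}
```

With base c0000002ecab0000, paca->kstack (c0000002ecab3e30) is 0x1d0 bytes below the top of the stack. A frame located at the base address itself means all 0x4000 bytes are in use, so it sits on top of struct thread_info.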
Dumping the stack we see:
3:mon> t c0000002ecab0000
[c0000002ecab0000] c00000000002131c .performance_monitor_exception+0x5c/0x70
[c0000002ecab0080] c00000000000335c performance_monitor_common+0x15c/0x180
--- Exception: f01 (Performance Monitor) at c0000000000fb2ec .trace_hardirqs_off+0x1c/0x30
[c0000002ecab0370] c00000000016fdb0 .trace_graph_entry+0xb0/0x280 (unreliable)
[c0000002ecab0410] c00000000003d038 .prepare_ftrace_return+0x98/0x130
[c0000002ecab04b0] c00000000000a920 .ftrace_graph_caller+0x14/0x28
[c0000002ecab0520] c0000000000d6b58 .idle_cpu+0x18/0x90
[c0000002ecab05a0] c00000000000a934 .return_to_handler+0x0/0x34
[c0000002ecab0620] c00000000001e660 .timer_interrupt+0x160/0x300
[c0000002ecab06d0] c0000000000025dc decrementer_common+0x15c/0x180
--- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0
[c0000002ecab09c0] c0000000000fe044 .trace_hardirqs_on+0x14/0x30 (unreliable)
[c0000002ecab0fb0] c00000000016fe3c .trace_graph_entry+0x13c/0x280
[c0000002ecab1050] c00000000003d038 .prepare_ftrace_return+0x98/0x130
[c0000002ecab10f0] c00000000000a920 .ftrace_graph_caller+0x14/0x28
[c0000002ecab1160] c0000000000161f0 .__ppc64_runlatch_on+0x10/0x40
[c0000002ecab11d0] c00000000000a934 .return_to_handler+0x0/0x34
--- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0
... and so on
__ppc64_runlatch_on() is called from RUNLATCH_ON in the exception entry
path. At that point the irq state is not consistent, ie. interrupts are
hard disabled (by the exception entry), but the paca soft-enabled flag
may be out of sync.
This leads to the local_irq_restore() in trace_graph_entry() actually
enabling interrupts, which we do not want. Because we have not yet
reprogrammed the decrementer we immediately take another decrementer
exception, and recurse.
The fix is twofold. Firstly make sure we call DISABLE_INTS before
calling RUNLATCH_ON. The badly named DISABLE_INTS actually reconciles
the irq state in the paca with the hardware, making it safe again to
call local_irq_save/restore().
Although that should be sufficient to fix the bug, we also mark the
runlatch routines as notrace. They are called very early in the
exception entry and we are asking for trouble tracing them. They are
also fairly uninteresting and tracing them just adds unnecessary
overhead.
[ This regression was introduced by fe1952fc0a
"powerpc: Rework runlatch code" by myself --BenH
]
CC: <stable@vger.kernel.org> [v3.4+]
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
exit_notify() does exit_task_namespaces() after
forget_original_parent(). This was needed to ensure that ->nsproxy
can't be cleared prematurely: an exiting child we are going to
reparent can do do_notify_parent() and use the parent's (ours) pid_ns.
However, after 32084504 "pidns: use task_active_pid_ns in
do_notify_parent" ->nsproxy != NULL is no longer needed; we rely
on task_active_pid_ns().
Move exit_task_namespaces() from exit_notify() to do_exit(), after
exit_fs() and before exit_task_work().
This solves the problem reported by Andrey, free_ipc_ns()->shm_destroy()
does fput() which needs task_work_add().
Note: this particular problem can be fixed if we change fput(), and
that change makes sense anyway. But there is another reason to move
the callsite. The original reason for calling exit_task_namespaces() from
the middle of exit_notify() was subtle and it has already gone away;
now this looks confusing. And this allows us to simplify exit_notify():
we can avoid unlock/lock(tasklist) and we can use ->exit_state instead
of PF_EXITING in forget_original_parent().
Reported-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
fput() assumes that it can't be called after exit_task_work() but
this is not true, for example free_ipc_ns()->shm_destroy() can do
this. In this case fput() silently leaks the file.
Change it to fall back to delayed_fput_work if task_work_add() fails.
The patch looks complicated but it is not; it changes the code from
	if (PF_KTHREAD) {
		schedule_work(...);
		return;
	}
	task_work_add(...)
to
	if (!PF_KTHREAD) {
		if (!task_work_add(...))
			return;
		/* fallback */
	}
	schedule_work(...);
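The control flow of the new version can be exercised with mocks. Everything named `demo_` below is a hypothetical stand-in: the task struct, the counters and the mock task_work_add() only model success and failure, not the real kernel behaviour.

```c
#include <stdbool.h>

/* Hypothetical mock of the bits of task state that matter here. */
struct demo_task {
	bool is_kthread;
	bool passed_exit_task_work;
	int task_works;		/* works queued via the task_work path */
};

int demo_delayed_works;		/* works queued via the delayed path */

/* Mock task_work_add(): fails once the task passed exit_task_work(). */
int demo_task_work_add(struct demo_task *t)
{
	if (t->passed_exit_task_work)
		return -1;
	t->task_works++;
	return 0;
}

/* Mock schedule_work() for the delayed fput work. */
void demo_schedule_work(void)
{
	demo_delayed_works++;
}

/* New flow: try task_work_add() first and fall back if it fails. */
void demo_fput(struct demo_task *t)
{
	if (!t->is_kthread) {
		if (!demo_task_work_add(t))
			return;
		/* fallback */
	}
	demo_schedule_work();
}
```

A normal task uses the task_work path, while a kernel thread or a task that is already past exit_task_work() ends up on the delayed path instead of leaking the file.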
As for shm_destroy() in particular, we could make another fix but I
think this change makes sense anyway. There could be another similar
user, it is not safe to assume that task_work_add() can't fail.
Reported-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>