Since commit f44f7f9 (RTC: Initialize kernel state from RTC)
rtc_device_register reads the programmed alarm. As reading the alarm
needs to take the mc13xxx lock, release it before calling
rtc_device_register.
This fixes a deadlock during boot:
INFO: task swapper:1 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
swapper D c02b175c 0 1 0 0x00000000
[<c02b175c>] (schedule+0x304/0x4f4) from [<c02b25a8>] (__mutex_lock_slowpath+0x7c/0x110)
[<c02b25a8>] (__mutex_lock_slowpath+0x7c/0x110) from [<c020b4cc>] (mc13xxx_rtc_read_time+0x1c/0x118)
[<c020b4cc>] (mc13xxx_rtc_read_time+0x1c/0x118) from [<c0208f04>] (__rtc_read_time+0x58/0x5c)
[<c0208f04>] (__rtc_read_time+0x58/0x5c) from [<c0209508>] (rtc_read_time+0x30/0x48)
[<c0209508>] (rtc_read_time+0x30/0x48) from [<c0209dd4>] (__rtc_read_alarm+0x1c/0x290)
[<c0209dd4>] (__rtc_read_alarm+0x1c/0x290) from [<c0208d58>] (rtc_device_register+0x150/0x27c)
[<c0208d58>] (rtc_device_register+0x150/0x27c) from [<c02b0b74>] (mc13xxx_rtc_probe+0x128/0x17c)
[<c02b0b74>] (mc13xxx_rtc_probe+0x128/0x17c) from [<c01d5280>] (platform_drv_probe+0x1c/0x24)
[<c01d5280>] (platform_drv_probe+0x1c/0x24) from [<c01d3e58>] (driver_probe_device+0x80/0x1a8)
[<c01d3e58>] (driver_probe_device+0x80/0x1a8) from [<c01d400c>] (__driver_attach+0x8c/0x90)
[<c01d400c>] (__driver_attach+0x8c/0x90) from [<c01d3654>] (bus_for_each_dev+0x60/0x8c)
[<c01d3654>] (bus_for_each_dev+0x60/0x8c) from [<c01d2f6c>] (bus_add_driver+0x180/0x248)
[<c01d2f6c>] (bus_add_driver+0x180/0x248) from [<c01d4664>] (driver_register+0x70/0x15c)
[<c01d4664>] (driver_register+0x70/0x15c) from [<c01d5700>] (platform_driver_probe+0x18/0x98)
[<c01d5700>] (platform_driver_probe+0x18/0x98) from [<c00273a8>] (do_one_initcall+0x2c/0x168)
[<c00273a8>] (do_one_initcall+0x2c/0x168) from [<c00083ac>] (kernel_init+0xa0/0x150)
[<c00083ac>] (kernel_init+0xa0/0x150) from [<c0033ff8>] (kernel_thread_exit+0x0/0x8)
Reported-by: Vagrant Cascadian <vagrant@debian.org>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Closes: http://bugs.debian.org/625804
[Tweaked commit log -jstultz]
Signed-off-by: John Stultz <john.stultz@linaro.org>
Commit f44f7f96a2 ("RTC: Initialize kernel state from RTC") uncovered
an issue in a number of RTC drivers, where the drivers call
rtc_device_register before initializing the device or platform drvdata.
This frequently results in null pointer dereferences when the
rtc_device_register immediately makes use of the rtc device, calling
rtc_read_alarm.
The solution is to ensure the drvdata is initialized prior to registering
the rtc device.
CC: Wolfram Sang <w.sang@pengutronix.de>
CC: Alessandro Zummo <a.zummo@towertech.it>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: rtc-linux@googlegroups.com
Signed-off-by: John Stultz <john.stultz@linaro.org>
Commit f44f7f96a2 ("RTC: Initialize kernel state from RTC") uncovered
an issue in a number of RTC drivers, where the drivers call
rtc_device_register before initializing the device or platform drvdata.
This frequently results in null pointer dereferences when the
rtc_device_register immediately makes use of the rtc device, calling
rtc_read_alarm.
The solution is to ensure the drvdata is initialized prior to registering
the rtc device.
CC: Wolfram Sang <w.sang@pengutronix.de>
CC: Alessandro Zummo <a.zummo@towertech.it>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: rtc-linux@googlegroups.com
Signed-off-by: John Stultz <john.stultz@linaro.org>
Commit f44f7f96a2 ("RTC: Initialize kernel state from RTC") uncovered
an issue in a number of RTC drivers, where the drivers call
rtc_device_register before initializing the device or platform drvdata.
This frequently results in null pointer dereferences when the
rtc_device_register immediately makes use of the rtc device, calling
rtc_read_alarm.
The solution is to ensure the drvdata is initialized prior to registering
the rtc device.
CC: Wolfram Sang <w.sang@pengutronix.de>
CC: Alessandro Zummo <a.zummo@towertech.it>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: rtc-linux@googlegroups.com
Signed-off-by: John Stultz <john.stultz@linaro.org>
Commit f44f7f96a2 ("RTC: Initialize kernel state from RTC") uncovered
an issue in a number of RTC drivers, where the drivers call
rtc_device_register before initializing the device or platform drvdata.
This frequently results in null pointer dereferences when the
rtc_device_register immediately makes use of the rtc device, calling
rtc_read_alarm.
The solution is to ensure the drvdata is initialized prior to registering
the rtc device.
CC: Wolfram Sang <w.sang@pengutronix.de>
CC: Alessandro Zummo <a.zummo@towertech.it>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: rtc-linux@googlegroups.com
Signed-off-by: John Stultz <john.stultz@linaro.org>
Commit f44f7f96a2 ("RTC: Initialize kernel state from RTC") uncovered
an issue in a number of RTC drivers, where the drivers call
rtc_device_register before initializing the device or platform drvdata.
This frequently results in null pointer dereferences when the
rtc_device_register immediately makes use of the rtc device, calling
rtc_read_alarm.
The solution is to ensure the drvdata is initialized prior to registering
the rtc device.
CC: Wolfram Sang <w.sang@pengutronix.de>
CC: Alessandro Zummo <a.zummo@towertech.it>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: rtc-linux@googlegroups.com
Signed-off-by: John Stultz <john.stultz@linaro.org>
Commit f44f7f96a2 ("RTC: Initialize kernel state from RTC") uncovered
an issue in a number of RTC drivers, where the drivers call
rtc_device_register before initializing the clientdata.
This frequently results in null pointer dereferences when the
rtc_device_register immediately makes use of the rtc device, calling
rtc_read_alarm.
The solution is to ensure the clientdata is initialized prior to registering
the rtc device.
CC: Wolfram Sang <w.sang@pengutronix.de>
CC: Alessandro Zummo <a.zummo@towertech.it>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: rtc-linux@googlegroups.com
Signed-off-by: John Stultz <john.stultz@linaro.org>
Commit f44f7f96a2 ("RTC: Initialize kernel state from RTC") uncovered
an issue in a number of RTC drivers, where the drivers call
rtc_device_register before initializing the device or platform drvdata.
This frequently results in null pointer dereferences when the
rtc_device_register immediately makes use of the rtc device, calling
rtc_read_alarm.
The solution is to ensure the drvdata is initialized prior to registering
the rtc device.
CC: Wolfram Sang <w.sang@pengutronix.de>
CC: Alessandro Zummo <a.zummo@towertech.it>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: rtc-linux@googlegroups.com
Signed-off-by: John Stultz <john.stultz@linaro.org>
Commit f44f7f96a2 ("RTC: Initialize kernel state from RTC") uncovered
an issue in a number of RTC drivers, where the drivers call
rtc_device_register before initializing the device or platform drvdata.
This frequently results in null pointer dereferences when the
rtc_device_register immediately makes use of the rtc device, calling
rtc_read_alarm.
The solution is to ensure the drvdata is initialized prior to registering
the rtc device.
CC: Alessandro Zummo <a.zummo@towertech.it>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: rtc-linux@googlegroups.com
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
[Fixed up commit log -jstultz]
Signed-off-by: John Stultz <john.stultz@linaro.org>
Commit f44f7f96a2 ("RTC: Initialize kernel state from RTC") uncovered
an issue in a number of RTC drivers, where the drivers call
rtc_device_register before initializing the device or platform drvdata.
This frequently results in null pointer dereferences when the
rtc_device_register immediately makes use of the rtc device, calling
rtc_read_alarm.
The solution is to ensure the drvdata is initialized prior to registering
the rtc device.
CC: Alessandro Zummo <a.zummo@towertech.it>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: rtc-linux@googlegroups.com
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
[fixed up commit log -jstultz]
Signed-off-by: John Stultz <john.stultz@linaro.org>
Commit f44f7f96a2 ("RTC: Initialize kernel state from RTC") uncovered
an issue in a number of RTC drivers, where the drivers call
rtc_device_register before initializing the device or platform drvdata.
This frequently results in null pointer dereferences when the
rtc_device_register immediately makes use of the rtc device, calling
rtc_read_alarm.
The solution is to ensure the drvdata is initialized prior to registering
the rtc device.
CC: Alessandro Zummo <a.zummo@towertech.it>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: rtc-linux@googlegroups.com
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
[fixed up commit log -jstultz]
Signed-off-by: John Stultz <john.stultz@linaro.org>
* 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
xHCI: Clear PLC in xhci_bus_resume()
USB: fix regression in usbip by setting has_tt flag
usb/isp1760: Report correct urb status after unlink
omap:usb: add regulator support for EHCI
mfd: Fix usbhs_enable error handling
usb: musb: gadget: Fix out-of-sync runtime pm calls
usb: musb: omap2430: Fix retention idle on musb peripheral only boards
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
ceph: do not call __mark_dirty_inode under i_lock
libceph: fix ceph_osdc_alloc_request error checks
ceph: handle ceph_osdc_new_request failure in ceph_writepages_start
libceph: fix ceph_msg_new error path
ceph: use ihold() when i_lock is held
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6:
[media] ngene: Fix CI data transfer regression Fix CI data transfer regression introduced by previous cleanup.
[media] v4l: make sure drivers supply a zeroed struct v4l2_subdev
[media] Missing frontend config for LME DM04/QQBOX
[media] rc_core: avoid kernel oops when rmmod saa7134
[media] imon: add conditional locking in change_protocol
[media] rc: show RC_TYPE_OTHER in sysfs
[media] ite-cir: modular build on ppc requires delay.h include
[media] mceusb: add Dell transceiver ID
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
flex_arrays: allow zero length flex arrays
flex_array: flex_array_prealloc takes a number of elements, not an end
SELinux: pass last path component in may_create
The SLUB allocator use of the cmpxchg_double logic was wrong: it
actually needs the irq-safe one.
That happens automatically when we use the native unlocked 'cmpxchg8b'
instruction, but when compiling the kernel for older x86 CPUs that do
not support that instruction, we fall back to the generic emulation
code.
And if you don't specify that you want the irq-safe version, the generic
code ends up just open-coding the cmpxchg8b equivalent without any
protection against interrupts or preemption. Which definitely doesn't
work for SLUB.
This was reported by Werner Landgraf <w.landgraf@ru.ru>, who saw
instability with his distro-kernel that was compiled to support pretty
much everything under the sun. Most big Linux distributions tend to
compile for PPro and later, and would never have noticed this problem.
This also fixes the prototypes for the irqsafe cmpxchg_double functions
to use 'bool' like they should.
[ Btw, that whole "generic code defaults to no protection" design just
sounds stupid - if the code needs no protection, there is no reason to
use "cmpxchg_double" to begin with. So we should probably just remove
the unprotected version entirely as pointless. - Linus ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reported-and-tested-by: werner <w.landgraf@ru.ru>
Acked-and-tested-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1105041539050.3005@ionos
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The __mark_dirty_inode helper now takes i_lock as of 250df6ed. Fix the
one ceph callers that held i_lock (__ceph_mark_dirty_caps) to return the
flags value so that the callers can do it outside of i_lock.
Signed-off-by: Sage Weil <sage@newdream.net>
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/radeon/kms: fix gart setup on fusion parts (v2)
drm: Send pending vblank events before disabling vblank.
drm/radeon: fix regression on atom cards with hardcoded EDID record.
drm/radeon/kms: add some new pci ids
Out of the entire GART/VM subsystem, the hw designers changed
the location of 3 regs.
v2: airlied: add parameter for userspace to work from.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is the least-bad behaviour. It means that we signal the
vblank event before it actually happens, but since we're disabling
vblanks there's no guarantee that it will *ever* happen otherwise.
This prevents GL applications which use WaitMSC from hanging
indefinitely.
Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Since fafcf94e2b introduced an edid size, it seems to have broken this path.
This manifest as oops on T500 Lenovo laptops with dual graphics primarily.
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=33812
cc: stable@kernel.org
Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
In particular, s_freeing_list needs to be initialized early, since it is
used on some of the error paths when mounts fail. The mapping inode,
for example, would be initialized and then free'd on an error path
before s_freeing_list was initialized, but the inode drop operation
needs the s_freeing_list to be set up.
Normally you'd never see this, because not only is logfs fairly rare,
but a successful mount will never have any issues.
Reported-by: werner <w.landgraf@ru.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hi us,
When i was compiling kernel, a warning happened to me.
The warning said like following.
drivers/staging/wlan-ng/cfg80211.c:709: warning: initialization from
incompatible pointer type.
See http://s1202.photobucket.com/albums/bb364/harrywei/?action=view¤t=patched2.png
for more details.
So i patch like following.
Signed-off-by: Harry Wei <harryxiyou@gmail.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch clears PORT_PLC if xhci_bus_resume() resumes a previous suspended
port, because if a port transition from U3 to U0 state, it will report a
port link state change, and software should clear the corresponding PLC bit.
It also uses hcd->speed to check if a port is a USB2 protocol port.
The patch fixes the issue that USB keyboard can not wakeup system from
hibernation.
Signed-off-by: Andiry Xu <andiry.xu@amd.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
We should unlock the page and return -ENOMEM if ceph_osdc_new_request
failed.
Signed-off-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
Signed-off-by: Sage Weil <sage@newdream.net>
If memory allocation failed, calling ceph_msg_put() will cause GPF
since some of ceph_msg variables are not initialized first.
Fix Bug #970.
Signed-off-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
Signed-off-by: Sage Weil <sage@newdream.net>
* 'stable/bug-fixes-for-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen: mask_rw_pte mark RO all pagetable pages up to pgt_buf_top
xen/mmu: Add workaround "x86-64, mm: Put early page table high"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: wm831x-ts - move BTN_TOUCH reporting to data transfer
Input: wm831x-ts - allow IRQ flags to be specified
Input: wm831x-ts - fix races with IRQ management
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
sysctl: net: call unregister_net_sysctl_table where needed
Revert: veth: remove unneeded ifname code from veth_newlink()
smsc95xx: fix reset check
tg3: Fix failure to enable WoL by default when possible
networking: inappropriate ioctl operation should return ENOTTY
amd8111e: trivial typo spelling: Negotitate -> Negotiate
ipv4: don't spam dmesg with "Using LC-trie" messages
af_unix: Only allow recv on connected seqpacket sockets.
mii: add support of pause frames in mii_get_an
net: ftmac100: fix scheduling while atomic during PHY link status change
usbnet: Transfer of maintainership
usbnet: add support for some Huawei modems with cdc-ether ports
bnx2: cancel timer on device removal
iwl4965: fix "Received BA when not expected"
iwlagn: fix "Received BA when not expected"
dsa/mv88e6131: fix unknown multicast/broadcast forwarding on mv88e6085
usbnet: Resubmit interrupt URB if device is open
iwl4965: fix "TX Power requested while scanning"
iwlegacy: led stay solid on when no traffic
b43: trivial: update module info about ucode16_mimo firmware
...
This patch (as1460) fixes a regression in the usbip driver caused by
the new check for Transaction Translators in USB-2 hubs. The root hub
registered by vhci_hcd needs to have the has_tt flag set, because it
can connect to low- and full-speed devices as well as high-speed
devices.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-and-tested-by: Nikola Ciprich <nikola.ciprich@linuxbox.cz>
CC: <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This fixes a bug in my previous (2.6.38) patch series which caused
urb->status value to be wrong after unlink (broke usbtest 11, 12).
Signed-off-by: Arvid Brodin <arvid.brodin@enea.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ctl_table_headers registered with register_net_sysctl_table should
have been unregistered with the equivalent unregister_net_sysctl_table
Signed-off-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
84c49d8c3e ("veth: remove unneeded
ifname code from veth_newlink()") caused regression on veth
creation. This patch reverts the original one.
Reported-by: Michał Mirosław <mirqus@gmail.com>
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The reset loop check should check the MII_BMCR register value for
BMCR_RESET rather than for MII_BMCR (the register address, which also
happens to be zero).
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: David S. Miller <davem@davemloft.net>
tg3 is supposed to enable WoL by default on adapters which support
that, but it fails to do so unless the adapter's
/sys/devices/.../power/wakeup file contains 'enabled' during the
initialization of the adapter. Fix that by making tg3 use
device_set_wakeup_enable() to enable wakeup automatically whenever
WoL should be enabled by default.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
ioctl() calls against a socket with an inappropriate ioctl operation
are incorrectly returning EINVAL rather than ENOTTY:
[ENOTTY]
Inappropriate I/O control operation.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=33992
Signed-off-by: Lifeng Sun <lifongsun@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The use of base for %ebx in this file is arbitrary, *except* that we
also use it to compute the real-mode segment. Therefore, make it so
that r_base really is the true address to which %ebx points.
This resolves kernel bugzilla 33302.
Reported-and-tested-by: Alexey Zaytsev <alexey.zaytsev@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Link: http://lkml.kernel.org/n/tip-08os5wi3yq1no0y4i5m4z7he@git.kernel.org
Current implementation of ohci_set_config_rom() uses a deferred
bus reset via fw_schedule_bus_reset(). If clients add multiple
unit descriptors to the config_rom in quick succession, the
deferred bus reset may not have fired before succeeding update
requests have come in. This can lead to an incorrect partial
update of the config_rom for both addition and removal of
config_rom descriptors, as the ohci_set_config_rom() routine
will return -EBUSY if a previous pending update has not been
completed yet; the requested update just gets dropped on the floor.
This patch recognizes that the "in-flight" update can be modified
until it has been processed by the bus-reset, and the locking
in the bus_reset_tasklet ensures that the update is done atomically
with respect to modifications made by ohci_set_config_rom(). The
-EBUSY error case is simply removed.
[Stefan R: The bug always existed at least theoretically. But it
became easy to trigger since 2.6.36 commit 02d37bed18 "firewire: core:
integrate software-forced bus resets with bus management" which
introduced long mandatory delays between janitorial bus resets.]
Signed-off-by: Benjamin Buchalter <bj@mhlabs.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (trivial style changes)
Cc: <stable@kernel.org> # 2.6.36.y and newer
mask_rw_pte is currently checking if a pfn is a pagetable page if it
falls in the range pgt_buf_start - pgt_buf_end but that is incorrect
because pgt_buf_end is a moving target: pgt_buf_top is the real
boundary.
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
As a consequence of the commit:
commit 4b239f458c
Author: Yinghai Lu <yinghai@kernel.org>
Date: Fri Dec 17 16:58:28 2010 -0800
x86-64, mm: Put early page table high
it causes the Linux kernel to crash under Xen:
mapping kernel into physical memory
Xen: setup ISA identity maps
about to get started...
(XEN) mm.c:2466:d0 Bad type (saw 7400000000000001 != exp 1000000000000000) for mfn b1d89 (pfn bacf7)
(XEN) mm.c:3027:d0 Error while pinning mfn b1d89
(XEN) traps.c:481:d0 Unhandled invalid opcode fault/trap [#6] on VCPU 0 [ec=0000]
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
...
The reason is that at some point init_memory_mapping is going to reach
the pagetable pages area and map those pages too (mapping them as normal
memory that falls in the range of addresses passed to init_memory_mapping
as argument). Some of those pages are already pagetable pages (they are
in the range pgt_buf_start-pgt_buf_end) therefore they are going to be
mapped RO and everything is fine.
Some of these pages are not pagetable pages yet (they fall in the range
pgt_buf_end-pgt_buf_top; for example the page at pgt_buf_end) so they
are going to be mapped RW. When these pages become pagetable pages and
are hooked into the pagetable, xen will find that the guest has already
a RW mapping of them somewhere and fail the operation.
The reason Xen requires pagetables to be RO is that the hypervisor needs
to verify that the pagetables are valid before using them. The validation
operations are called "pinning" (more details in arch/x86/xen/mmu.c).
In order to fix the issue we mark all the pages in the entire range
pgt_buf_start-pgt_buf_top as RO, however when the pagetable allocation
is completed only the range pgt_buf_start-pgt_buf_end is reserved by
init_memory_mapping. Hence the kernel is going to crash as soon as one
of the pages in the range pgt_buf_end-pgt_buf_top is reused (b/c those
ranges are RO).
For this reason, this function is introduced which is called _after_
the init_memory_mapping has completed (in a perfect world we would
call this function from init_memory_mapping, but lets ignore that).
Because we are called _after_ init_memory_mapping the pgt_buf_[start,
end,top] have all changed to new values (b/c another init_memory_mapping
is called). Hence, the first time we enter this function, we save
away the pgt_buf_start value and update the pgt_buf_[end,top].
When we detect that the "old" pgt_buf_start through pgt_buf_end
PFNs have been reserved (so memblock_x86_reserve_range has been called),
we immediately set out to RW the "old" pgt_buf_end through pgt_buf_top.
And then we update those "old" pgt_buf_[end|top] with the new ones
so that we can redo this on the next pagetable.
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Reviewed-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
[v1: Updated with Jeremy's comments]
[v2: Added the crash output]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>