I figured out another ACPI related regression today.
randconfig testing triggered an early boot-time hang on a laptop of mine
(32-bit x86, config attached) - the screen was scrolling ACPI AML
exceptions [with no serial port and no early debugging available].
v2.6.24 works fine on that laptop with the same .config, so after a few
hours of bisection (had to restart it 3 times - other regressions
interacted), it honed in on this commit:
| 10270d4838 is first bad commit
|
| Author: Linus Torvalds <torvalds@woody.linux-foundation.org>
| Date: Wed Feb 13 09:56:14 2008 -0800
|
| acpi: fix acpi_os_read_pci_configuration() misuse of raw_pci_read()
reverting this commit ontop of -rc5 gave a correctly booting kernel.
But this commit fixes a real bug so the real question is, why did it
break the bootup?
After quite some head-scratching, the following change stood out:
- pci_id->bus = tu8;
+ pci_id->bus = val;
pci_id->bus is defined as u16:
struct acpi_pci_id {
u16 segment;
u16 bus;
...
and 'tu8' changed from u8 to u32. So previously we'd unconditionally
mask the return value of acpi_os_read_pci_configuration()
(raw_pci_read()) to 8 bits, but now we just trust whatever comes back
from the PCI access routines and only crop it to 16 bits.
But if the high 8 bits of that result contains any noise then we'll
write that into ACPI's PCI ID descriptor and confuse the heck out of the
rest of ACPI.
So lets check the PCI-BIOS code on that theory. We have this codepath
for 8-bit accesses (arch/x86/pci/pcbios.c:pci_bios_read()):
switch (len) {
case 1:
__asm__("lcall *(%%esi); cld\n\t"
"jc 1f\n\t"
"xor %%ah, %%ah\n"
"1:"
: "=c" (*value),
"=a" (result)
: "1" (PCIBIOS_READ_CONFIG_BYTE),
"b" (bx),
"D" ((long)reg),
"S" (&pci_indirect));
Aha! The "=a" output constraint puts the full 32 bits of EAX into
*value. But if the BIOS's routines set any of the high bits to nonzero,
we'll return a value with more set in it than intended.
The other, more common PCI access methods (v1 and v2 PCI reads) clear
out the high bits already, for example pci_conf1_read() does:
switch (len) {
case 1:
*value = inb(0xCFC + (reg & 3));
which explicitly converts the return byte up to 32 bits and zero-extends
it.
So zero-extending the result in the PCI-BIOS read routine fixes the
regression on my laptop. ( It might fix some other long-standing issues
we had with PCI-BIOS during the past decade ... ) Both 8-bit and 16-bit
accesses were buggy.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/pci-2.6:
PCI Hotplug: Fix small mem leak in IBM Hot Plug Controller Driver
PCI: rename DECLARE_PCI_DEVICE_TABLE to DEFINE_PCI_DEVICE_TABLE
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6:
drivers: fix dma_get_required_mask
firmware: provide stubs for the FW_LOADER=n case
nozomi: fix initialization and early flow control access
sysdev: fix problem with sysdev_class being re-registered
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
lguest: Do not append space to guests kernel command line
lguest: Revert 1ce70c4fac, fix real problem.
lguest: Sanitize the lguest clock.
lguest: fix __get_vm_area usage.
lguest: make sure cpu is initialized before accessing it
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
[WATCHDOG] make watchdog/hpwdt.c:asminline_call() static
[WATCHDOG] Remove volatiles from watchdog device structures
[WATCHDOG] replace remaining __FUNCTION__ occurrences
[WATCHDOG] hpwdt: Use dmi_walk() instead of own copy
[WATCHDOG] Fix return value warning in hpwdt
[WATCHDOG] Fix declaration of struct smbios_entry_point in hpwdt
[WATCHDOG] it8712f_wdt support for 16-bit timeout values, WDIOC_GETSTATUS
Fix kernel crash when stifb driver is used with a A1439A CRX (Rattler)
graphics card. (Reference:
http://thread.gmane.org/gmane.linux.ports.hppa/1834)
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix wrong pointer type passed into the dev_dbg() function.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Acked-by: Mike Rapoport <mike@compulab.co.il>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
iov_iter_advance() skips over zero-length iovecs, however it does not properly
terminate at the end of the iovec array. Fix this by checking against
i->count before we skip a zero-length iov.
The bug was reproduced with a test program that continually randomly creates
iovs to writev. The fix was also verified with the same program and also it
could verify that the correct data was contained in the file after each
writev.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Tested-by: "Kevin Coffman" <kwc@citi.umich.edu>
Cc: "Alexey Dobriyan" <adobriyan@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The original preemptible-RCU patch put the choice between classic and
preemptible RCU into kernel/Kconfig.preempt, which resulted in build failures
on machines not supporting CONFIG_PREEMPT. This choice was therefore moved to
init/Kconfig, which worked, but placed the choice between classic and
preemptible RCU at the top level, a very obtuse choice indeed.
This patch changes from the Kconfig "choice" mechanism to a pair of booleans,
only one of which (CONFIG_PREEMPT_RCU) is user-visible, and is located in
kernel/Kconfig.preempt, where one would expect it to be. The other
(CONFIG_CLASSIC_RCU) is in init/Kconfig so that it is available to all
architectures, hopefully avoiding build breakage. Thanks to Roman Zippel for
suggesting this approach.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: Josh Triplett <josh@freedesktop.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Return value convention of module's init functions is 0/-E. Sometimes,
e.g. during forward-porting mistakes happen and buggy module created,
where result of comparison "workqueue != NULL" is propagated all the way up
to sys_init_module. What happens is that some other module created
workqueue in question, our module created it again and module was
successfully loaded.
Or it could be some other bug.
Let's make such mistakes much more visible. In retrospective, such
messages would noticeably shorten some of my head-scratching sessions.
Note, that dump_stack() is just a way to get attention from user. Sample
message:
sys_init_module: 'foo'->init suspiciously returned 1, it should follow 0/-E convention
sys_init_module: loading module anyway...
Pid: 4223, comm: modprobe Not tainted 2.6.24-25f666300625d894ebe04bac2b4b3aadb907c861 #5
Call Trace:
[<ffffffff80254b05>] sys_init_module+0xe5/0x1d0
[<ffffffff8020b39b>] system_call_after_swapgs+0x7b/0x80
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit c9a3ba55 (module: wait for dependent modules doing init.) didn't quite
work because the waiter holds the module lock, meaning that the state of the
module it's waiting for cannot change.
Fortunately, it's fairly simple to update the state outside the lock and do
the wakeup.
Thanks to Jan Glauber for tracking this down and testing (qdio and qeth).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Free pages in the hugetlb pool are free and as such have a reference count of
zero. Regular allocations into the pool from the buddy are "freed" into the
pool which results in their page_count dropping to zero. However, surplus
pages can be directly utilized by the caller without first being freed to the
pool. Therefore, a call to put_page_testzero() is in order so that such a
page will be handed to the caller with a correct count.
This has not affected end users because the bad page count is reset before the
page is handed off. However, under CONFIG_DEBUG_VM this triggers a BUG when
the page count is validated.
Thanks go to Mel for first spotting this issue and providing an initial fix.
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The pca953x driver is an I2C driver so gpio_chip->can_sleep should be set.
This lets upper layers know they should use the gpio_*_cansleep() calls to
access values, and may not access them from nonsleeping contexts.
Signed-off-by: Arnaud Patard <arnaud.patard@rtp-net.org>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Acked-by: "eric miao" <eric.y.miao@gmail.com>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Recent patch titled
Reduce CPU wastage on idle md array with a write-intent bitmap.
would sometimes leave the array with dirty bitmap bits that stay dirty. A
subsequent write would sort things out so it isn't a big problem, but should
be fixed nonetheless.
We need to make sure that when the bitmap becomes not "allclean", the
daemon_sleep really does get set to a sensible value.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If an md array is "auto-read-only", then this appears in /proc/mdstat as
/dev/md0: active(auto-read-only)
whereas if it is truely readonly, it appears as
/dev/md0: active (read-only)
The difference being a space.
One program known to parse this file expects the space and gets badly
confused. It will be fixed, but it would be best if what the kernel generates
is more consistent too.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Address 3 known bugs in the current memory policy reference counting method.
I have a series of patches to rework the reference counting to reduce overhead
in the allocation path. However, that series will require testing in -mm once
I repost it.
1) alloc_page_vma() does not release the extra reference taken for
vma/shared mempolicy when the mode == MPOL_INTERLEAVE. This can result in
leaking mempolicy structures. This is probably occurring, but not being
noticed.
Fix: add the conditional release of the reference.
2) hugezonelist unconditionally releases a reference on the mempolicy when
mode == MPOL_INTERLEAVE. This can result in decrementing the reference
count for system default policy [should have no ill effect] or premature
freeing of task policy. If this occurred, the next allocation using task
mempolicy would use the freed structure and probably BUG out.
Fix: add the necessary check to the release.
3) The current reference counting method assumes that vma 'get_policy()'
methods automatically add an extra reference a non-NULL returned mempolicy.
This is true for shmem_get_policy() used by tmpfs mappings, including
regular page shm segments. However, SHM_HUGETLB shm's, backed by
hugetlbfs, just use the vma policy without the extra reference. This
results in freeing of the vma policy on the first allocation, with reuse of
the freed mempolicy structure on subsequent allocations.
Fix: Rather than add another condition to the conditional reference
release, which occur in the allocation path, just add a reference when
returning the vma policy in shm_get_policy() to match the assumptions.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Greg KH <greg@kroah.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: David Rientjes <rientjes@google.com>
Cc: <eric.whitney@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I have found a very small typo in Documentation/scheduler/sched-stats.txt.
See the end of this mail.
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This should improve reliability of detection of cards already in socket on
driver load.
Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Instead of assuming that host is powered on only once at card insertion, allow
for the possibility that memstick layer may need to cycle card's power to get
it out from some unhealthy states.
Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Additional input received from JMicron on MemoryStick host interfaces showed
that some assumtions in fifo handling code were incorrect. This patch also
fixes data corruption used to occure during PIO transfers.
Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bus driver may need to be informed that host is being suspended/resumed.
Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thanks to some input from kind people at JMicron it is now possible to have
more correct definitions of protocol structures and bit field semantics.
Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix memory size multiplier during detection.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove locking registers after they are unlocked during switch to/from MMIO
mode. This fixes regression on the Blade3D (Trident 9880) caused by the
previous patch (probe fixes).
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix the following section mismatches:
WARNING: drivers/built-in.o(.exit.text+0x5a): Section mismatch in reference from the function of_platform_serial_exit() to the variable .devinit.data:of_platform_serial_driver
The function __exit of_platform_serial_exit() references
a variable __devinitdata of_platform_serial_driver.
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This macro is used to define tables, not to declare them.
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In drivers/pci/hotplug/ibmphp_ebda.c::ebda_rsrc_controller(), storage is
allocated with kzalloc() and assigned to 'tmp_slot'. Then lots of
stuff, like ->flag, ->supported_speed etc is set in tmp_slot. A bit
further down there's then this test :
if (!bus_info_ptr1) {
rc = -ENODEV;
goto error;
}
At this point, tmp_slot has not been assigned to anything, so when
erroring-out we want to free it, but nothing at the 'error:' label
free's 'tmp_slot' - and we can't really free 'tmp_slot' at 'error:'
since we may jump to that label later when 'tmp_slot' *has* been used
and we do not want it freed. So, the only sane option left seems to be
to kfree(tmp_slot) just before jumping to the 'error:' label in the one
place where this is what actually makes sense. The following patch does
just that and thus kills off a tiny potential memory leak.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
a) DECLARE_PCI_DEVICE_TABLE is misnamed. It is used to *define* tables,
not to declare them. It should be called DEFINE_PCI_DEVICE_TABLE.
b) It's lame, anyway. We could implement any number of such helper
thingies, but we choose not to.
So I wouldn't go adding code which uses this thing until it has a correct
name, and until we've decided that we actually want to live with it.
From: Andrew Morton <akpm@linux-foundation.org>
Cc: Jonas Bonn <jonas@southpole.se>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There's a bug in the current implementation of dma_get_required_mask()
where it ands the returned mask with the current device mask. This
rather defeats the purpose if you're using the call to determine what
your mask should be (since you will at that time have the default
DMA_32BIT_MASK). This bug results in any driver that uses this function
*always* getting a 32 bit mask, which is wrong.
Fix by removing the and with dev->dma_mask.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
libsas has a case where it uses the firmware loader to provide services,
but doesn't want to select it all the time. This currently causes a
compile failure in libsas if FW_LOADER=n. Fix this by providing error
stubs for the firmware loader API in the FW_LOADER=n case.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Due to some flaws in the initialization and flow control
code kernel oopses could be triggered e.g. when accessing
the card too early after insertion.
See e.g. kernel.org bug #10077.
The main part of the fix is a trivial state management
making sure the card is realy ready to use before allowing
any access.
Signed-off-by: Frank Seidel <fseidel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We need to initialize the kobject for a sysdev_class as it could have
been recycled (stupid static kobjects...)
We also do the same thing in case sysdev devices are being
re-registered.
Thanks to Balaji Rao <balajirrao@gmail.com> for pointing out the
problem.
Signed-off-by: Balaji Rao <balajirrao@gmail.com>
Tested-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The lguest launcher appends a space to the kernel command line (if kernel
arguments are specified on its command line). This space is unneeded. More
importantly, this appended space will make Red Hat's nash script interpreter
(used in a Fedora style initramfs) add an empty argument to init's command
line. This empty argument will make kernel arguments like "init=/bin/bash"
fail (because the shell will try to execute a script with an empty name).
This could be considered a bug in nash, but is easily fixed in the lguest
launcher too.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Ahmed managed to crash the Host in release_pgd(), which cannot be a Guest
bug, and indeed it wasn't.
The bug was that handing a 0 as the address of the toplevel page table
being manipulated can cause the lookup code in find_pgdir() to return
an uninitialized cache entry (we shadow up to 4 top level page tables
for each Guest).
Commit 37cc8d7f96 introduced this
behaviour in the Guest, uncovering the bug.
The patch which he submitted (which removed the /4 from the index
calculation) simply ensured that these high-indexed entries hit the
early exit path of guest_set_pmd(). But you get lots of segfaults in
guest userspace as the PMDs aren't being updated.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Now the TSC code handles a zero return from calculate_cpu_khz(),
lguest can simply pass through the value it gets from the Host: if
non-zero, all the normal TSC code applies.
Otherwise (or if the Host really doesn't support TSC), the clocksource
code will fall back to the slower but reasonable lguest clock.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Robert Bragg's 5dc3318528 tightened
(ie. fixed) the checking in __get_vm_area, and it broke lguest.
lguest should pass the exact "end" it wants, not some random constant
(it was possible previously that it would actually get an address
different from SWITCHER_ADDR).
Also, Fabio Checconi pointed out that we should make sure we're not
hitting the fixmap area.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Robert Bragg <robert@sixbynine.org>
If req is LHREQ_INITIALIZE, and the guest has been initialized before
(unlikely), it will attempt to access cpu->tsk even though cpu is not yet
initialized.
Signed-off-by: Eugene Teo <eugeneteo@kernel.sg>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In some situations, ocfs2_set_nn_state might get called with sc = NULL and
valid = 0. If sc = NULL, we can't dereference it to get the o2nm_node
member. Instead, do what o2net_initialize_handshake does and use NULL when
calling o2net_reconnect_delay and o2net_idle_timeout.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>