I couldn't find any memory policy documentation in the Documentation
directory, so here is my attempt to document it.
There's lots more that could be written about the internal design--including
data structures, functions, etc. However, if you agree that this is better
than the nothing that exists now, perhaps it could be merged. This will
provide a baseline for updates to document the many policy patches that are
currently being worked on.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Acked-by: Rob Landley <rob@landley.net>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch fixes an off-by-one in a BUG_ON() spotted by the Coverity
checker.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Amy Griffis <amy.griffis@hp.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix up the maintainers info in the tpm drivers. Kylene will be out for
some time, so copying the sourceforge list is the best way to get some
attention.
Cc: Marcel Selhorst <tpm@selhorst.net>
Cc: Kylene Jo Hall <kjhall@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Booting SPARSEMEM on NUMA systems trips a BUG in page_alloc.c:
Initializing HighMem for node 0 (00038000:00100000)
Initializing HighMem for node 1 (00100000:001ffe00)
------------[ cut here ]------------
kernel BUG at /home/apw/git/linux-2.6/mm/page_alloc.c:456!
[...]
This occurs because the section to node id mapping is not being
set up correctly during init under SPARSEMEM_STATIC, leading to an
attempt to free pages from all nodes into the zones on node 0.
When the zone_table[] was removed in the following commit, a new
section to node mapping table was introduced:
commit 89689ae7f9
[PATCH] Get rid of zone_table[]
That conversion inadvertently only initialised the node mapping in
SPARSEMEM_EXTREME. Ensure we initialise the node mapping in
SPARSEMEM_STATIC.
[akpm@linux-foundation.org: make the stubs static inline]
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When ecryptfs_lookup() is called against special files, eCryptfs generates
the following errors because it tries to treat them like regular eCryptfs
files.
Error opening lower file for lower_dentry [0xffff810233a6f150], lower_mnt [0xffff810235bb4c80], and flags [0x8000]
Error opening lower_file to read header region
Error attempting to read the [user.ecryptfs] xattr from the lower file; return value = [-95]
Valid metadata not found in header region or xattr region; treating file as unencrypted
For instance, the problem can be reproduced by the steps below.
# mkdir /root/crypt /mnt/crypt
# mount -t ecryptfs /root/crypt /mnt/crypt
# mknod /mnt/crypt/c0 c 0 0
# umount /mnt/crypt
# mount -t ecryptfs /root/crypt /mnt/crypt
# ls -l /mnt/crypt
This patch fixes it by adding a check similar to directories and
symlinks.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Acked-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
[WATCHDOG] Add support for 1533 bridge to alim1535_wdt
[WATCHDOG] Add a 00-INDEX file to Documentation/watchdog/
[WATCHDOG] Eurotechwdt.c - clean-up comments
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SPARC32]: Revert f642b26380.
[SPARC64]: Need to clobber global reg vars in switch_to().
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[IRDA] irda_nl_get_mode: always results in failure
[PPP]: Fix output buffer size in ppp_decompress_frame().
[IRDA]: Avoid a label defined but not used warning in irda_init()
[IPV6]: Fix kernel panic while send SCTP data with IP fragments
[SNAP]: Check packet length before reading
[DCCP]: Allocation in atomic context
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
[S390] Change atomic_read/set to inline functions with barrier semantics.
[S390] kprobes: fix instruction length calculation
[S390] hypfs: inode corruption due to missing locking
[S390] disassembler: fix b2 opcodes like srst, bsg, and others
[S390] vmur: fix reference counting for vmur device structure
[S390] vmur: fix diag14 exceptions with addresses > 2GB.
[S390] qdio: Refresh buffer states for IQDIO Asynchronous output queue
[S390] qdio: fix EQBS handling on CCQ96
[S390] cio: change confusing message in cmf.
[S390] cio: dont forget to set last slot to NULL in ccw_uevent().
Touching vmalloc memory in the middle of a lazy mode update can generate
a kernel PDE update, which must be flushed immediately. The fix is to
leave lazy mode when doing a vmalloc sync.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After doing some tests this seems to be the best variant for s390 and
should be correct as well. With gcc 4.2.1 we get the following kernel
image sizes using the default configuration:
atomic_t type volatile, atomic_read/set defines 5311824 bytes
atomic_t type int, atomic_read/set defines 5270864 bytes
atomic_t type int, atomic_read/set inline asm 5279056 bytes
atomic_t type int, atomic_read/set inline barrier 5270864 bytes
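The "inline barrier" variant from the table above can be sketched roughly as follows. This is a userspace approximation and not the actual s390 header: the atomic_t layout is assumed, and barrier() is written as the usual GCC/Clang compiler-barrier idiom.

```c
/* Userspace sketch; names mirror the kernel's, but this is not the s390 header. */

/* Compiler barrier (GCC/Clang idiom): the compiler may not reorder or cache
 * memory accesses across this point. It emits no machine instruction. */
#define barrier() __asm__ __volatile__("" : : : "memory")

typedef struct {
	int counter;	/* plain int, not volatile */
} atomic_t;

static inline int atomic_read(const atomic_t *v)
{
	barrier();	/* force a fresh load of v->counter */
	return v->counter;
}

static inline void atomic_set(atomic_t *v, int i)
{
	v->counter = i;
	barrier();	/* keep the compiler from deferring the store */
}
```

With a plain int counter the compiler stays free to optimise ordinary accesses, while the barrier keeps each atomic_read()/atomic_set() from being merged or hoisted.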
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Placing a kprobe on "bc" instruction (s390/s390x) can cause an oops.
The instruction length is encoded into the first two bits of the s390
instruction. Kprobe is incorrectly computing the instruction length.
The instruction length is used for determining what type of "fix-up" is
needed for a conditional branch instruction. The problem can be seen by
placing a kprobe on a "bc" instruction that will not branch. The
result is that Kprobe incorrectly computes the new instruction
pointer (psw.addr) after single stepping the instruction. The problem
is corrected with this patch.
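As an illustration of the encoding described above, the length can be derived from the two most significant bits of the first opcode byte. This is a minimal standalone sketch; the function name insn_length() is chosen here for illustration and is not taken from the patch.

```c
/* Length in bytes of an s390 instruction, derived from the two most
 * significant bits of its first opcode byte:
 *   00 -> 2 bytes, 01 or 10 -> 4 bytes, 11 -> 6 bytes. */
static int insn_length(unsigned char first_byte)
{
	switch (first_byte >> 6) {
	case 0:
		return 2;
	case 1:
	case 2:
		return 4;
	default:
		return 6;
	}
}
```

For example, "bc" has opcode 0x47, so its top two bits are 01 and the instruction is 4 bytes long.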
Signed-off-by: David Wilder <dwilder@us.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
hypfs removes the whole hypfs directory tree and creates a new one, when a
process triggers an update by writing to the "update" attribute. When removing
and creating files, it is necessary to lock the inode of the parent directory
where the files live. Currently hypfs does not lock the parent inode, which
can lead to inode corruption. This patch:
* Introduces correct locking
* Fixes i_nlink reference counting for inodes, when creating directories
* Adds info printk, when hypfs filesystem has been mounted
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
The instruction table for b2 opcodes was missing an opfrag value
for the cpya instruction. All instructions specified after cpya
were not considered by the disassembler. The fix is simple and
obvious - add the opfrag field to the cpya instruction.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
When a vmur device is removed due to a detach of the device, currently the
ur device structure is freed. Unfortunately, there can still be a user
of the device structure if the character device is open
during the detach process. To fix this, reference counting for the vmur
structure is introduced.
In addition to that, the online, offline, probe and remove functions are
serialized now using a global mutex.
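The reference counting idea can be sketched in userspace as below. All names (urdev_sketch and its helpers) are hypothetical and do not come from the vmur driver; C11 atomics stand in for the kernel's atomic_t.

```c
#include <stdlib.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the device structure. */
struct urdev_sketch {
	atomic_int ref_count;
	/* device state would live here */
};

/* Allocate the structure with one reference held by the creator. */
static struct urdev_sketch *urdev_sketch_alloc(void)
{
	struct urdev_sketch *urd = calloc(1, sizeof(*urd));

	if (urd)
		atomic_init(&urd->ref_count, 1);
	return urd;
}

/* Take an additional reference, e.g. when the character device is opened. */
static void urdev_sketch_get(struct urdev_sketch *urd)
{
	atomic_fetch_add(&urd->ref_count, 1);
}

/* Drop a reference. Returns 1 if this was the last one and the
 * structure has been freed, 0 otherwise. */
static int urdev_sketch_put(struct urdev_sketch *urd)
{
	if (atomic_fetch_sub(&urd->ref_count, 1) == 1) {
		free(urd);
		return 1;
	}
	return 0;
}
```

With this scheme a detach only drops its own reference, so the structure stays valid until the last open file releases it.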
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
There are several s390 diagnose calls, which must be executed below the
2GB memory boundary. In order to enforce this, those diagnoses must be
compiled into the kernel. Currently diag 14 can be called within the
vmur kernel module from addresses above 2GB. This leads to specification
exceptions. This patch moves diag10, diag14 and diag210 into the new
diag.c file.
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
The Hipersocket Multicast queue works asynchronously. When sending buffers,
the buffer state change may be delayed. The tasklet for checking
changes in the outbound queue excluded IQDIO async queues from this
process. This created a hang situation either when the queue ran full
or at interface close time.
The tasklet processing is changed to include IQDIO async queues when
requesting buffer state refresh.
Signed-off-by: Klaus D. Wacker <kdwacker@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
After the EQBS instruction returned CCQ=96, QDIO returned in any case,
regardless of whether buffer states for at least one buffer had been
extracted.
This caused FCP devices to hang when running under z/VM with
QIOASSASIST=ON and high I/O rates.
To fix this, the qdio return code processing of the EQBS instruction
after CCQ=96 is changed so that extracted buffers are returned and, if no
buffers were extracted, the instruction is repeated at once.
Signed-off-by: Klaus D. Wacker <kdwacker@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
cmf currently prints a message that more than 4096 channels are not
allowed in basic mode - however, this could only be enforced if cmf were
a module (which is no longer possible). It makes much more sense to
not check the specified number of channels and just print a message if
the block for basic mode could not be allocated (which may happen for
any number of specified channels).
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
It seems an extraneous trailing ';' has slipped in to the error handling for a
name registration failure causing the error path to trigger unconditionally.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch addresses the issue with "osize too small" errors in mppe
encryption. The patch fixes the issue with wrong output buffer size
being passed to ppp decompression routine.
--------------------
As pointed out by Suresh Mahalingam, the issue addressed by
ppp-fix-osize-too-small-errors-when-decoding patch is not fully resolved yet.
The size of the allocated output buffer is correct, however the size passed to
ppp->rcomp->decompress in ppp_generic.c is wrong. The patch fixes that.
--------------------
Signed-off-by: Konstantin Sharlaimov <konstantin.sharlaimov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Easily avoidable compiler warnings bug me.
Building irmod without CONFIG_SYSCTL currently results in :
net/irda/irmod.c:132: warning: label 'out_err_2' defined but not used
But that can easily be avoided by simply moving the label inside
the existing "#ifdef CONFIG_SYSCTL" one line above it.
This patch moves the label and buys us one less warning with no
ill effects.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The snap_rcv code reads 5 bytes so we should make sure that
we have 5 bytes in the head before proceeding.
Based on diagnosis and fix by Evgeniy Polyakov, reported by
Alan J. Wylie.
Patch also kills the skb->sk assignment before kfree_skb
since it's redundant.
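The length check described above can be illustrated with a userspace sketch. The helper snap_read_desc() is hypothetical; in the kernel a check of this kind is typically done on the skb (for instance with pskb_may_pull()) rather than on a raw buffer.

```c
#include <stddef.h>
#include <string.h>

#define SNAP_HDR_LEN 5	/* 3-byte OUI + 2-byte protocol ID */

/* Copies the 5-byte SNAP protocol descriptor out of buf.
 * Returns 0 on success, -1 if the buffer is too short to hold it. */
static int snap_read_desc(const unsigned char *buf, size_t len,
			  unsigned char desc[SNAP_HDR_LEN])
{
	if (len < SNAP_HDR_LEN)
		return -1;	/* check the length before reading */
	memcpy(desc, buf, SNAP_HDR_LEN);
	return 0;
}
```

A truncated packet is rejected up front instead of being read past its end.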
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gerd Hoffmann pointed out that my patch from yesterday can lead
to a null pointer dereference if the kernel is booted with no
console, and no earlyprintk defined. This fixes that issue.
Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The initial user manuals for MPC8544/8533 had some issues with properly
documenting the device IDs for MPC8544/8533. These processors are almost
identical and both show up on the reference boards.
Fix up the quirks for PCIe support to handle MPC8533/E.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
There are special PHY settings available on Yukon EC-U chip that
should not get cleared. This should solve mysterious errors on some
motherboards (like Gigabyte DS-3).
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This fixes handling of USB ISO completion error -EXDEV and includes
several other changes to current CVS version at isdn4linux.de (changes
in debug flags, style of code remarks, etc)
Signed-off-by: Martin Bachem <info@colognechip.com>
Acked-by: Karsten Keil <kkeil@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I did some testing and found quite a lot of problems (doesn't
boot at all on non NUMA and misassigns cores on Opteron systems).
Mark it as experimental and warn against its use for now.
It's still default y for SUMMIT/NUMAQ because it'll presumably
work on these systems.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This causes boot failures for some people.
It looks like some SILO-provided ramdisk images
should in fact not be KERNBASE normalized.
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 196705c9bb. It was
reported to cause a regression by Daniel Exner, and Arjan van de Ven
points out that we actually already have infrastructure in place for
setting limits on acceptable DMA latency that would be the much more
correct fix for the problem with some Broadcom EHCI controllers.
Fixed up trivial conflicts due to the changes to support big-endian host
controller descriptors in drivers/usb/host/{ehci-sched.c,ehci.h}.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch uses kzalloc to zero all of struct dio rather than manually
trying to track which fields we rely on being zero. It passed aio+dio
stress testing and some bug regression testing on ext3.
This patch was introduced by Linus in the conversation that led up to
Badari's minimal fix to manually zero .map_bh.b_state in commit:
6a648fa721
It makes the code a bit smaller. Maybe a couple fewer cachelines to
load, if we're lucky:
text data bss dec hex filename
3285925 568506 1304616 5159047 4eb887 vmlinux
3285797 568506 1304616 5158919 4eb807 vmlinux.patched
I was unable to measure a stable difference in the number of cpu cycles
spent in blockdev_direct_IO() when pushing aio+dio 256K reads at
~340MB/s.
So the resulting intent of the patch isn't a performance gain but to
avoid exposing ourselves to the risk of finding another field like
.map_bh.b_state where we rely on zeroing but don't enforce it in the
code.
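The difference between the two approaches can be sketched in userspace, with calloc() playing the role of kzalloc(). The structure dio_sketch and its fields are hypothetical placeholders, not the real struct dio.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a large state structure such as struct dio. */
struct dio_sketch {
	int flags;
	unsigned long b_state;	/* a field the code relies on being zero */
	size_t result;
};

/* Fragile: malloc() leaves the memory uninitialised, so every field that
 * must start at zero has to be remembered and cleared by hand. */
static struct dio_sketch *dio_alloc_manual(void)
{
	struct dio_sketch *dio = malloc(sizeof(*dio));

	if (!dio)
		return NULL;
	dio->flags = 0;
	dio->b_state = 0;
	dio->result = 0;
	return dio;
}

/* Robust: calloc() zeroes the whole structure, so a field added later
 * cannot be forgotten. */
static struct dio_sketch *dio_alloc_zeroed(void)
{
	return calloc(1, sizeof(struct dio_sketch));
}
```

The zeroing variant trades a few extra stores for not having to audit which fields depend on an initial value of zero.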
Signed-off-by: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb:
V4L/DVB (6028): Turn an unnecessary mdelay() into msleep().
V4L/DVB (6027): Get rid of an ill-behaved msleep in i2c write
V4L/DVB (6026): Avoid powering up the camera on resume
V4L/DVB (6016): get_dvb_firmware: update script for new location of tda10046 firmware
V4L/DVB (5991): dvb-pll: Set minimum and maximum frequency properly
V4L/DVB (5969): ivtv: report ivtv version in status log
V4L/DVB (5967): ivtv: fix VIDIOC_S_FBUF:new OSD values where never set
V4L/DVB (5968): videodev2.h: remove superfluous FBUF GLOBAL_INV_ALPHA support
Commit a491486a20 introduced a locking
problem in JFFS2 -- we up() the alloc_sem when we weren't previously
holding it. This leads to all kinds of fun behaviour later.
There was a _reason_ for the
if (1 /* alternative path needs testing */ ||
which the above-mentioned commit removed :)
Discovered and debugged by Giulio Fedel <giulio.fedel@andorsystems.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a followup to the cleanups for earlyprintk patch from Gerd Hoffmann
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=69331af79cf29e26d1231152a172a1a10c2df511
This ensures that a bootconsole is unregistered if it is not replaced.
The current implementation spews garbage out the bootconsole in this case,
since the bootconsole structure is normally in the init section, and is
freed, but still used.
Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Cc: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This disk reports total number of sectors instead of maximum sector address
in response to READ_NATIVE_MAX_ADDRESS command and also happily accepts
SET_MAX_ADDRESS command with the bogus value. This results in +1 sector
capacity being used and errors on attempts to use the last sector.
...
hdd: Host Protected Area detected.
    current capacity is 78165360 sectors (40020 MB)
    native  capacity is 78165361 sectors (40020 MB)
hdd: Host Protected Area disabled.
...
hdd: reading: block=78165360, sectors=1, buffer=0xc1e63000
hdd: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdd: dma_intr: error=0x10 { SectorIdNotFound }, LBAsect=78165360, sector=78165360
...
Add hpa_list[] table and work around the issue in idedisk_check_hpa().
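The shape of such a workaround can be sketched as below. This is a userspace illustration only: the table contents, the model string and the helper names are hypothetical and are not taken from the patch.

```c
#include <string.h>

/* Hypothetical quirk table; the model string is a placeholder,
 * not a real drive. */
static const char *const hpa_list_sketch[] = {
	"EXAMPLE MODEL 40GB",
	NULL,	/* list termination */
};

static int in_hpa_list(const char *model)
{
	const char *const *p;

	for (p = hpa_list_sketch; *p; p++)
		if (strcmp(*p, model) == 0)
			return 1;
	return 0;
}

/* A drive should report its highest addressable sector. Listed drives
 * report the total sector count instead, which is one too many. */
static unsigned long long native_max_address(const char *model,
					     unsigned long long reported)
{
	if (in_hpa_list(model) && reported > 0)
		reported--;
	return reported;
}
```

With the numbers from the log above, a listed drive that reports 78165360 is corrected to a maximum address of 78165359, which gives a capacity of 78165360 sectors.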
v2:
* Add missing export and improve patch description a bit.
v3:
* Add list termination. (From Mikko)
Fixes kernel bugzilla bug #8816.
Thanks to Mikko for investigating the issue and helping with this patch.
Cc: Mikko Rapeli <mikko.rapeli@iki.fi>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Programming DMA mode may destroy current PIO mode setting so if
CONFIG_HPT34X_AUTODMA=n (the default case) make ide_tune_dma() fail
early by disabling all host DMA masks and re-tune PIO mode.
This fix doesn't help with the driver being broken but is needed
for some other changes.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
If ->dma_base is not set (== PCI BAR4 cannot be reserved) then DMA hooks
shouldn't be initialized or bad things will happen.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
If ->dma_base is not set (== PCI BAR4 cannot be reserved) then DMA hooks
shouldn't be initialized or bad things will happen.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
If ->dma_base is not set (== PCI BAR4 cannot be reserved) then DMA hooks
shouldn't be initialized or bad things will happen.
Also this host driver requires valid PCI BAR4 for normal operation so
check it in ->init_chipset and fail initialization if not set.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
If ->dma_base is not set (== PCI BAR4 cannot be reserved) then DMA hooks
shouldn't be initialized or bad things will happen.
Also this host driver requires valid PCI BAR4 for normal operation so
check it in ->init_chipset and fail initialization if not set.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Use ->OUTBSYNC instead of ->OUTB when writing command register
(needed for scc_pata and pmac host drivers).
* Don't check DRDY bit of the status register on ATAPI devices
(ATAPI devices are free to ignore DRDY bit).
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Kou Ishizaki <kou.ishizaki@toshiba.co.jp>
Cc: Akira Iguchi <akira2.iguchi@toshiba.co.jp>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Move ide_in_drive_list() from ide-dma.c to ide-iops.c.
* Add ivb_list[] table for listing early UDMA66 devices which don't conform
to ATA4 standard wrt cable detection (bit14 is zero, only bit13 is valid)
and use only device side cable detection for them since host side cable
detection may be unreliable.
* Add model "QUANTUM FIREBALLlct10 05" with firmware "A03.0900" to the list
(from Craig's bugreport).
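The device side check described in the list above might look roughly like this. It is a sketch under assumptions: the macro and function names are hypothetical, and treating bit 14 of the hardware configuration word as a validity flag follows the wording of this description rather than the driver source.

```c
/* Bits of the IDENTIFY hardware configuration word, as described above. */
#define HWCFG_WORD_VALID 0x4000	/* bit 14: word contents are valid */
#define HWCFG_CABLE_80   0x2000	/* bit 13: device detected an 80-wire cable */

/* ivb is non-zero for drives on the quirk list, which leave bit 14 at
 * zero even though bit 13 is meaningful. */
static int device_sees_80wire(unsigned short hw_config, int ivb)
{
	if (ivb)
		return (hw_config & HWCFG_CABLE_80) != 0;
	return (hw_config & (HWCFG_WORD_VALID | HWCFG_CABLE_80)) ==
	       (HWCFG_WORD_VALID | HWCFG_CABLE_80);
}
```

For a listed drive only bit 13 decides the result, while a conforming drive must set both bits.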
v2:
* Improve kernel message based on a suggestion from Sergei.
v3:
* Don't print kernel message when no device side cable detection is done,
plus some minor fixes. (Noticed by Sergei)
Thanks to Craig for testing this patch.
Cc: Craig Block <chblock3@yahoo.com>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
pmac_ide_tune_chipset() doesn't set drive->init_speed.
Fix it by setting drive->{current,init}_speed in pmac_ide_do_setfeature()
and clean up pmac_ide_{tune_chipset,mdma_enable,udma_enable}().
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add DMA blacklist checking (->ide_dma_on check probably can go now).
* Add ->atapi_dma flag checking and remove no longer needed
ns87415_ide_dma_check() from ns87415 host driver.
* Remove now needless __ide_dma_check() wrapper and symbol export.
* Check drive->autodma instead of hwif->autodma (there should be no changes in
behavior as all users of config_drive_for_dma() set both ->autodma flags).
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>