argv_split() is a helper function which takes a string, splits it at
whitespace, and returns a NULL-terminated argv vector. This is
deliberately simple - it does no quote processing of any kind.
[ Seems to me that this is something which is already being done in
the kernel, but I couldn't find any other implementations, either to
steal or replace. Keep an eye out. ]
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Add CRC7 routines, used for example in MMC over SPI communication.
Kerneldoc updates
[akpm@linux-foundation.org: fix funny mix of const and non-const]
Signed-off-by: Jan Nikitenko <jan.nikitenko@gmail.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
kmalloc_node() and kmem_cache_alloc_node() were not available in a zeroing
variant in the past. But with __GFP_ZERO it is possible now to do zeroing
while allocating.
Use __GFP_ZERO to remove the explicit clearing of memory via memset whereever
we can.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This should avoid build problems on architectures without a "readb()",
that got bitten by check_signature() being uninlined.
Noted by Heiko Carstens.
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Optimize integer-to-string conversion in vsprintf.c for base 10. This is
by far the most used conversion, and in some use cases it impacts
performance. For example, top reads /proc/$PID/stat for every process, and
with 4000 processes decimal conversion alone takes noticeable time.
Using code from
http://www.cs.uiowa.edu/~jones/bcd/decimal.html
(with permission from the author, Douglas W. Jones)
binary-to-decimal-string conversion is done in groups of five digits at
once, using only additions/subtractions/shifts (with -O2; -Os throws in
some multiply instructions).
On i386 arch gcc 4.1.2 -O2 generates ~500 bytes of code.
This patch is run tested. Userspace benchmark/test is also attached.
I tested it on PIII and AMD64 and new code is generally ~2.5 times
faster. On AMD64:
# ./vsprintf_verify-O2
Original decimal conv: .......... 151 ns per iteration
Patched decimal conv: .......... 62 ns per iteration
Testing correctness
12895992590592 ok... [Ctrl-C]
# ./vsprintf_verify-O2
Original decimal conv: .......... 151 ns per iteration
Patched decimal conv: .......... 62 ns per iteration
Testing correctness
26025406464 ok... [Ctrl-C]
More realistic test: top from busybox project was modified to
report how many us it took to scan /proc (this does not account
any processing done after that, like sorting process list),
and then I test it with 4000 processes:
#!/bin/sh
i=4000
while test $i != 0; do
sleep 30 &
let i--
done
busybox top -b -n3 >/dev/null
on unpatched kernel:
top: 4120 processes took 102864 microseconds to scan
top: 4120 processes took 91757 microseconds to scan
top: 4120 processes took 92517 microseconds to scan
top: 4120 processes took 92581 microseconds to scan
on patched kernel:
top: 4120 processes took 75460 microseconds to scan
top: 4120 processes took 66451 microseconds to scan
top: 4120 processes took 67267 microseconds to scan
top: 4120 processes took 67618 microseconds to scan
The speedup comes from much faster generation of /proc/PID/stat
by sprintf() calls inside the kernel.
Signed-off-by: Douglas W Jones <jones@cs.uiowa.edu>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* There is no point in having full "0...9a...z" constant vector,
if we use only "0...9a...f" (and "x" for "0x").
* Post-decrement usually needs a few more instructions, so use
pre decrement instead where makes sense:
- while (i < precision--) {
+ while (i <= --precision) {
* if base != 10 (=> base 8 or 16), we can avoid using division
in a loop and use mask/shift, obtaining much faster conversion.
(More complex optimization for base 10 case is in the second patch).
Overall, size vsprintf.o shows ~80 bytes smaller text section
with this patch applied.
Signed-off-by: Douglas W Jones <jones@cs.uiowa.edu>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The current generic bug implementation has a call to dump_stack() in case a
WARN_ON(whatever) gets hit. Since report_bug(), which calls dump_stack(),
gets called from an exception handler we can do better: just pass the
pt_regs structure to report_bug() and pass it to show_regs() in case of a
warning. This will give more debug informations like register contents,
etc... In addition this avoids some pointless lines that dump_stack()
emits, since it includes a stack backtrace of the exception handler which
is of no interest in case of a warning. E.g. on s390 the following lines
are currently always present in a stack backtrace if dump_stack() gets
called from report_bug():
[<000000000001517a>] show_trace+0x92/0xe8)
[<0000000000015270>] show_stack+0xa0/0xd0
[<00000000000152ce>] dump_stack+0x2e/0x3c
[<0000000000195450>] report_bug+0x98/0xf8
[<0000000000016cc8>] illegal_op+0x1fc/0x21c
[<00000000000227d6>] sysc_return+0x0/0x10
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Kyle McMartin <kyle@parisc-linux.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is a rather bizarre thing to have inlined in io.h. Stick it in lib/
instead.
While we're there, despaghetti it a bit, and fix its off-by-one behaviour when
passed a zero length.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that we have implemented hotunplug-time counter spilling,
percpu_counter_sum() only needs to look at online CPUs.
Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
per-cpu counters presently must iterate over all possible CPUs in the
exhaustive percpu_counter_sum().
But it can be much better to only iterate over the presently-online CPUs. To
do this, we must arrange for an offlined CPU's count to be spilled into the
counter's central count.
We can do this for all percpu_counters in the machine by linking them into a
single global list and walking that list at CPU_DEAD time.
(I hope. Might have race windows in which the percpu_counter_sum() count is
inaccurate?)
Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a new configuration variable
CONFIG_SLUB_DEBUG_ON
If set then the kernel will be booted by default with slab debugging
switched on. Similar to CONFIG_SLAB_DEBUG. By default slab debugging
is available but must be enabled by specifying "slub_debug" as a
kernel parameter.
Also add support to switch off slab debugging for a kernel that was
built with CONFIG_SLUB_DEBUG_ON. This works by specifying
slub_debug=-
as a kernel parameter.
Dave Jones wanted this feature.
http://marc.info/?l=linux-kernel&m=118072189913045&w=2
[akpm@linux-foundation.org: clean up switch statement]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove all ids from the given idr tree. idr_destroy() only frees up
unused, cached idp_layers, but this function will remove all id mappings
and leave all idp_layers unused.
A typical clean-up sequence for objects stored in an idr tree, will use
idr_for_each() to free all objects, if necessay, then idr_remove_all() to
remove all ids, and idr_destroy() to free up the cached idr_layers.
Signed-off-by: Kristian Hoegsberg <krh@redhat.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds an iterator function for the idr data structure. Compared
to just iterating through the idr with an integer and idr_find, this
iterator is (almost, but not quite) linear in the number of elements, as
opposed to the number of integers in the range covered by the idr. This
makes a difference for sparse idrs, but more importantly, it's a nicer way
to iterate through the elements.
The drm subsystem is moving to idr for tracking contexts and drawables, and
with this change, we can use the idr exclusively for tracking these
resources.
[akpm@linux-foundation.org: fix comment]
Signed-off-by: Kristian Hoegsberg <krh@redhat.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6: (37 commits)
[XFS] Fix lockdep annotations for xfs_lock_inodes
[LIB]: export radix_tree_preload()
[XFS] Fix XFS_IOC_FSBULKSTAT{,_SINGLE} & XFS_IOC_FSINUMBERS in compat mode
[XFS] Compat ioctl handler for handle operations
[XFS] Compat ioctl handler for XFS_IOC_FSGEOMETRY_V1.
[XFS] Clean up function name handling in tracing code
[XFS] Quota inode has no parent.
[XFS] Concurrent Multi-File Data Streams
[XFS] Use uninitialized_var macro to stop warning about rtx
[XFS] XFS should not be looking at filp reference counts
[XFS] Use is_power_of_2 instead of open coding checks
[XFS] Reduce shouting by removing unnecessary macros from dir2 code.
[XFS] Simplify XFS min/max macros.
[XFS] Kill off xfs_count_bits
[XFS] Cancel transactions on xfs_itruncate_start error.
[XFS] Use do_div() on 64 bit types.
[XFS] Fix remount,readonly path to flush everything correctly.
[XFS] Cleanup inode extent size hint extraction
[XFS] Prevent ENOSPC from aborting transactions that need to succeed
[XFS] Prevent deadlock when flushing inodes on unmount
...
XFS filestreams functionality uses radix trees and the preload
functions. XFS can be built as a module and hence we need
radix_tree_preload() exported. radix_tree_preload_end() is a
static inline, so it doesn't need exporting.
Signed-Off-By: Dave Chinner <dgc@sgi.com>
Signed-Off-By: Tim Shimmin <tes@sgi.com>
As kobj sysfs dentries and inodes are gonna be made reclaimable,
dentry can't be used as naming token for sysfs file/directory, replace
kobj->dentry with kobj->sd. The only external interface change is
shadow directory handling. All other changes are contained in kobj
and sysfs.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Implement idr based id allocator. ida is used the same way idr is
used but lacks id -> ptr translation and thus consumes much less
memory. struct ida_bitmap is attached as leaf nodes to idr tree which
is managed by the idr code. Each ida_bitmap is 128bytes long and
contains slightly less than a thousand slots.
ida is more aggressive with releasing extra resources acquired using
ida_pre_get(). After every successful id allocation, ida frees one
reserved idr_layer if possible. Reserved ida_bitmap is not freed
automatically but only one ida_bitmap is reserved and it's almost
always used right away. Under most circumstances, ida won't hold on
to memory for too long which isn't actively used.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Separate out idr_mark_full() from sub_alloc() and make marking the
allocated slot full the responsibility of idr_get_new_above_int().
Allocation part of idr_get_new_above_int() is renamed to
idr_get_empty_slot(). New idr_get_new_above_int() allocates a slot
using the function, install the user pointer and marks it full using
idr_mark_full().
This change doesn't introduce any behavior change. This will be
used by ida.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In sub_alloc(), when bitmap search fails, it goes up one level to
continue search. This is done by updating the id cursor and searching
the upper level again. If the cursor was at the end of the upper
level, we need to go further than that.
This wasn't implemented and when that happens the part of the cursor
which indexes into the upper level wraps and sub_alloc() ends up
searching the wrong bitmap. It allocates id which doesn't match the
actual slot.
This patch fixes this by restarting from the top if the search needs
to go higher than one level.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We get uevents for a bus/class going away, but not one registering.
Add the missing uevent in kset_register(), which will send an
event for a new bus/class. Suppress all unwanted uevents for bus
subdirectories like /bus/*/devices/, /bus/*/drivers/.
Now we get for module usbcore:
add /module/usbcore (module)
add /bus/usb (bus)
add /class/usb_host (class)
add /bus/usb/drivers/hub (drivers)
add /bus/usb/drivers/usb (drivers)
remove /bus/usb/drivers/usb (drivers)
remove /bus/usb/drivers/hub (drivers)
remove /class/usb_host (class)
remove /bus/usb (bus)
remove /module/usbcore (module)
instead of:
add /module/usbcore (module)
add /bus/usb/drivers/hub (drivers)
add /bus/usb/drivers/usb (drivers)
remove /bus/usb/drivers/usb (drivers)
remove /bus/usb/drivers/hub (drivers)
remove /class/usb_host (class)
remove /bus/usb/drivers (bus)
remove /bus/usb/devices (bus)
remove /bus/usb (bus)
remove /module/usbcore (module)
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This is a hybrid version of the patch to add the LZO1X compression
algorithm to the kernel. Nitin and myself have merged the best parts of
the various patches to form this version which we're both happy with (and
are jointly signing off).
The performance of this version is equivalent to the original minilzo code
it was based on. Bytecode comparisons have also been made on ARM, i386 and
x86_64 with favourable results.
There are several users of LZO lined up including jffs2, crypto and reiser4
since its much faster than zlib.
Signed-off-by: Nitin Gupta <nitingupta910@gmail.com>
Signed-off-by: Richard Purdie <rpurdie@openedhand.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a prefix string parameter. Callers are responsible for any string
length/alignment that they want to see in the output. I.e., callers should
pad strings to achieve alignment if they want that.
Add rowsize parameter. This is the number of raw data bytes to be printed
per line. Must be 16 or 32.
Add a groupsize parameter. This allows callers to dump values as 1-byte,
2-byte, 4-byte, or 8-byte numbers. Default is 1-byte numbers. If the
total length is not an even multiple of groupsize, 1-byte numbers are
printed.
Add an "ascii" output parameter. This causes ASCII data output following
the hex data output.
Clean up some doc examples.
Align the ASCII output on all lines that are produced by one call.
Add a new interface, print_hex_dump_bytes(), that is a shortcut to
print_hex_dump(), using default parameter values to print 16 bytes in
byte-size chunks of hex + ASCII output, using printk level KERN_DEBUG.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thanks to Jean Delvare <khali@linux-fr.org> for pointing it out to me.
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Make timer-stats have almost zero overhead when enabled in the config but
not used. (this way distros can enable it more easily)
Also update the documentation about overhead of timer_stats - it was
written for the first version which had a global lock and a linear list
walk based lookup ;-)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There have been a number of instances where people have accidentally compiled
rcutorture into the kernel (CONFIG_RCU_TORTURE_TEST=y), which has never been
useful, and has often resulted in great frustration.
The attached patch prohibits rcutorture from being compiled into the
kernel. It may be excluded altogether or compiled as a module. People
wishing to have rcutorture hammer their machine immediately upon boot
are free to hand-edit lib/Kconfig.debug to remove the "depends on m"
line.
Thanks to Randy Dunlap for the trick that makes this work.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Josh Triplett <josh@kernel.org>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
First thing mm.h does is including sched.h solely for can_do_mlock() inline
function which has "current" dereference inside. By dealing with can_do_mlock()
mm.h can be detached from sched.h which is good. See below, why.
This patch
a) removes unconditional inclusion of sched.h from mm.h
b) makes can_do_mlock() normal function in mm/mlock.c
c) exports can_do_mlock() to not break compilation
d) adds sched.h inclusions back to files that were getting it indirectly.
e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were
getting them indirectly
Net result is:
a) mm.h users would get less code to open, read, preprocess, parse, ... if
they don't need sched.h
b) sched.h stops being dependency for significant number of files:
on x86_64 allmodconfig touching sched.h results in recompile of 4083 files,
after patch it's only 3744 (-8.3%).
Cross-compile tested on
all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
alpha alpha-up
arm
i386 i386-up i386-defconfig i386-allnoconfig
ia64 ia64-up
m68k
mips
parisc parisc-up
powerpc powerpc-up
s390 s390-up
sparc sparc-up
sparc64 sparc64-up
um-x86_64
x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig
as well as my two usual configs.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Disable stacktrace filter support for x86-64 for now. Will be enable when we
can get the dwarf2 unwinder back.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'audit.b38' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
[PATCH] Abnormal End of Processes
[PATCH] match audit name data
[PATCH] complete message queue auditing
[PATCH] audit inode for all xattr syscalls
[PATCH] initialize name osid
[PATCH] audit signal recipients
[PATCH] add SIGNAL syscall class (v3)
[PATCH] auditing ptrace
Based on ace_dump_mem() from Grant Likely for the Xilinx SystemACE
CompactFlash interface.
Add print_hex_dump() & hex_dumper() to lib/hexdump.c and linux/kernel.h.
This patch adds the functions print_hex_dump() & hex_dumper().
print_hex_dump() can be used to perform a hex + ASCII dump of data to
syslog, in an easily viewable format, thus providing a common text hex dump
format.
hex_dumper() provides a dump-to-memory function. It converts one "line" of
output (16 bytes of input) at a time.
Example usages:
print_hex_dump(KERN_DEBUG, DUMP_PREFIX_ADDRESS, frame->data, frame->len);
hex_dumper(frame->data, frame->len, linebuf, sizeof(linebuf));
Example output using %DUMP_PREFIX_OFFSET:
0009ab42: 40414243 44454647 48494a4b 4c4d4e4f-@ABCDEFG HIJKLMNO
Example output using %DUMP_PREFIX_ADDRESS:
ffffffff88089af0: 70717273 74757677 78797a7b 7c7d7e7f-pqrstuvw xyz{|}~.
[akpm@linux-foundation.org: cleanups, add export]
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When auditing syscalls that send signals, log the pid and security
context for each target process. Optimize the data collection by
adding a counter for signal-related rules, and avoiding allocating an
aux struct unless we have more than one target process. For process
groups, collect pid/context data in blocks of 16. Move the
audit_signal_info() hook up in check_kill_permission() so we audit
attempts where permission is denied.
Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 'juju' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6: (138 commits)
firewire: Convert OHCI driver to use standard goto unwinding for error handling.
firewire: Always use parens with sizeof.
firewire: Drop single buffer request support.
firewire: Add a comment to describe why we split the sg list.
firewire: Return SCSI_MLQUEUE_HOST_BUSY for out of memory cases in queuecommand.
firewire: Handle the last few DMA mapping error cases.
firewire: Allocate scsi_host up front and allocate the sbp2_device as hostdata.
firewire: Provide module aliase for backwards compatibility.
firewire: Add to fw-core-y instead of assigning fw-core-objs in Makefile.
firewire: Break out shared IEEE1394 constant to separate header file.
firewire: Use linux/*.h instead of asm/*.h header files.
firewire: Uppercase most macro names.
firewire: Coding style cleanup: no spaces after function names.
firewire: Convert card_rwsem to a regular mutex.
firewire: Clean up comment style.
firewire: Use lib/ implementation of CRC ITU-T.
CRC ITU-T V.41
firewire: Rename fw-device-cdev.c to fw-cdev.c and move header to include/linux.
firewire: Future proof the iso ioctls by adding a handle for the iso context.
firewire: Add read/write and size annotations to IOC numbers.
...
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This will add the CRC calculation according
to the CRC ITU-T V.41 to the kernel lib/ folder.
This code has been derived from the rt2x00 driver,
currently found only in the wireless-dev tree, but
this library is generic and could be used by more
drivers who currently use their own implementation.
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Also useful for the new firewire stack.
Signed-off-by: Kristian Hoegsberg <krh@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
* git://git.infradead.org/mtd-2.6: (21 commits)
[MTD] [CHIPS] Remove MTD_OBSOLETE_CHIPS (jedec, amd_flash, sharp)
[MTD] Delete allegedly obsolete "bank_size" field of mtd_info.
[MTD] Remove unnecessary user space check from mtd.h.
[MTD] [MAPS] Remove flash maps for no longer supported 405LP boards
[MTD] [MAPS] Fix missing printk() parameter in physmap_of.c MTD driver
[MTD] [NAND] platform NAND driver: add driver
[MTD] [NAND] platform NAND driver: update header
[JFFS2] Simplify and clean up jffs2_add_tn_to_tree() some more.
[JFFS2] Remove another bogus optimisation in jffs2_add_tn_to_tree()
[JFFS2] Remove broken insert_point optimisation in jffs2_add_tn_to_tree()
[JFFS2] Remember to calculate overlap on nodes which replace older nodes
[JFFS2] Don't advance c->wbuf_ofs to next eraseblock after wbuf flush
[MTD] [NAND] at91_nand.c: CMDLINE_PARTS support
[MTD] [NAND] Tidy up handling of page number in nand_block_bad()
[MTD] block2mtd_paramline[] mustn't be __initdata
[MTD] [NAND] Support multiple chips in CAFÉ driver
[MTD] [NAND] Rename cafe.c to cafe_nand.c and remove the multi-obj magic
[MTD] [NAND] Use rslib for CAFÉ ECC
[RSLIB] Support non-canonical GF representations
[JFFS2] Remove dead file histo_mips.h
...
Since nonboot CPUs are now disabled after tasks and devices have been
frozen and the CPU hotplug infrastructure is used for this purpose, we need
special CPU hotplug notifications that will help the CPU-hotplug-aware
subsystems distinguish normal CPU hotplug events from CPU hotplug events
related to a system-wide suspend or resume operation in progress. This
patch introduces such notifications and causes them to be used during
suspend and resume transitions. It also changes all of the
CPU-hotplug-aware subsystems to take these notifications into consideration
(for now they are handled in the same way as the corresponding "normal"
ones).
[oleg@tv-sign.ru: cleanups]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Several people have observed that perhaps LOG_BUF_SHIFT should be in a more
obvious place than under DEBUG_KERNEL. Under some circumstances (such as the
PARISC architecture), DEBUG_KERNEL can increase kernel size, which is an
undesirable trade off for something as trivial as increasing the kernel log
buffer size.
Instead, move LOG_BUF_SHIFT into "General Setup", so that people are more
likely to be able to change it such a circumstance that the default buffer
size is insufficient.
Signed-off-by: Alistair John Strachan <s0348365@sms.ed.ac.uk>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I was playing with some code that sometimes got a string where a %n
match should have been done where the input string ended, for example
like this:
sscanf("abc123", "abc%d%n", &a, &n); /* doesn't work */
sscanf("abc123a", "abc%d%n", &a, &n); /* works */
However, the scanf function in the kernel doesn't convert the %n in that
case because it has already matched the complete input after %d and just
completely stops matching then. This patch fixes that.
[akpm@linux-foundation.org: cleanups]
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the Kconfig selection of semaphore debugging from the ALPHA and FRV
Kconfig files, and centralize it in lib/Kconfig.debug.
There doesn't seem to be much point in letting individual architectures
independently define the same Kconfig option when it can just as easily be
put in a single Kconfig file and made dependent on a subset of
architectures. that way, at least the option shows up in the same relative
location in the menu each time.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
kbuild spits outs following warning on a
defconfig x86_64 build:
WARNING: swiotlb.o - Section mismatch: reference to .init.text:swiotlb_init from __ksymtab between '__ksymtab_swiotlb_init' (at offset 0xa0) and '__ksymtab_swiotlb_free_coherent'
This warning happens because the function swiotlb_init is marked __init and
EXPORT_SYMBOL(). A 'git grep swiotlb_init' showed no users in drivers/ so
remove the EXPORT_SYMBOL.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Andi Kleen <ak@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The last zlib_inflate update broke certain corner cases for ppp_deflate
decompression handling. This patch fixes some logic to make things work
properly again. Users other than ppp_deflate (the only Z_PACKET_FLUSH
user) should be unaffected.
Fixes bug 8405 (confirmed by Stefan)
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Cc: Stefan Wenk <stefan.wenk@gmx.at>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds support for the Analog Devices Blackfin processor architecture, and
currently supports the BF533, BF532, BF531, BF537, BF536, BF534, and BF561
(Dual Core) devices, with a variety of development platforms including those
avaliable from Analog Devices (BF533-EZKit, BF533-STAMP, BF537-STAMP,
BF561-EZKIT), and Bluetechnix! Tinyboards.
The Blackfin architecture was jointly developed by Intel and Analog Devices
Inc. (ADI) as the Micro Signal Architecture (MSA) core and introduced it in
December of 2000. Since then ADI has put this core into its Blackfin
processor family of devices. The Blackfin core has the advantages of a clean,
orthogonal,RISC-like microprocessor instruction set. It combines a dual-MAC
(Multiply/Accumulate), state-of-the-art signal processing engine and
single-instruction, multiple-data (SIMD) multimedia capabilities into a single
instruction-set architecture.
The Blackfin architecture, including the instruction set, is described by the
ADSP-BF53x/BF56x Blackfin Processor Programming Reference
http://blackfin.uclinux.org/gf/download/frsrelease/29/2549/Blackfin_PRM.pdf
The Blackfin processor is already supported by major releases of gcc, and
there are binary and source rpms/tarballs for many architectures at:
http://blackfin.uclinux.org/gf/project/toolchain/frs There is complete
documentation, including "getting started" guides available at:
http://docs.blackfin.uclinux.org/ which provides links to the sources and
patches you will need in order to set up a cross-compiling environment for
bfin-linux-uclibc
This patch, as well as the other patches (toolchain, distribution,
uClibc) are actively supported by Analog Devices Inc, at:
http://blackfin.uclinux.org/
We have tested this on LTP, and our test plan (including pass/fails) can
be found at:
http://docs.blackfin.uclinux.org/doku.php?id=testing_the_linux_kernel
[m.kozlowski@tuxland.pl: balance parenthesis in blackfin header files]
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Signed-off-by: Aubrey Li <aubrey.li@analog.com>
Signed-off-by: Jie Zhang <jie.zhang@analog.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Architectures that don't support DMA can say so by adding a config NO_DMA
to their Kconfig file. This will prevent compilation of some dma specific
driver code. Also dma-mapping-broken.h isn't needed anymore on at least
s390. This avoids compilation and linking of otherwise dead/broken code.
Other architectures that include dma-mapping-broken.h are arm26, h8300,
m68k, m68knommu and v850. If these could be converted as well we could get
rid of the header file.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
"John W. Linville" <linville@tuxdriver.com>
Cc: Kyle McMartin <kyle@parisc-linux.org>
Cc: <James.Bottomley@SteelEye.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: <geert@linux-m68k.org>
Cc: <zippel@linux-m68k.org>
Cc: <spyro@f2s.com>
Cc: <uclinux-v850@lsi.nec.co.jp>
Cc: <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The nr_cpu_ids value is currently only calculated in smp_init. However, it
may be needed before (SLUB needs it on kmem_cache_init!) and other kernel
components may also want to allocate dynamically sized per cpu array before
smp_init. So move the determination of possible cpus into sched_init()
where we already loop over all possible cpus early in boot.
Also initialize both nr_node_ids and nr_cpu_ids with the highest value they
could take. If we have accidental users before these values are determined
then the current valud of 0 may cause too small per cpu and per node arrays
to be allocated. If it is set to the maximum possible then we only waste
some memory for early boot users.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild: (38 commits)
kconfig: fix mconf segmentation fault
kbuild: enable use of code from a different dir
kconfig: error out if recursive dependencies are found
kbuild: scripts/basic/fixdep segfault on pathological string-o-death
kconfig: correct minor typo in Kconfig warning message.
kconfig: fix path to modules.txt in Kconfig help
usr/Kconfig: fix typo
kernel-doc: alphabetically-sorted entries in index.html of 'htmldocs'
kbuild: be more explicit on missing .config file
kbuild: clarify the creation of the LOCALVERSION_AUTO string.
kbuild: propagate errors from find in scripts/gen_initramfs_list.sh
kconfig: refer to qt3 if we cannot find qt libraries
kbuild: handle compressed cpio initramfs-es
kbuild: ignore section mismatch warning for references from .paravirtprobe to .init.text
kbuild: remove stale comment in modpost.c
kbuild/mkuboot.sh: allow spaces in CROSS_COMPILE
kbuild: fix make mrproper for Documentation/DocBook/man
kbuild: remove kconfig binaries during make mrproper
kconfig/menuconfig: do not hardcode '.config'
kbuild: override build timestamp & version
...
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (231 commits)
[PATCH] i386: Don't delete cpu_devs data to identify different x86 types in late_initcall
[PATCH] i386: type may be unused
[PATCH] i386: Some additional chipset register values validation.
[PATCH] i386: Add missing !X86_PAE dependincy to the 2G/2G split.
[PATCH] x86-64: Don't exclude asm-offsets.c in Documentation/dontdiff
[PATCH] i386: avoid redundant preempt_disable in __unlazy_fpu
[PATCH] i386: white space fixes in i387.h
[PATCH] i386: Drop noisy e820 debugging printks
[PATCH] x86-64: Fix allnoconfig error in genapic_flat.c
[PATCH] x86-64: Shut up warnings for vfat compat ioctls on other file systems
[PATCH] x86-64: Share identical video.S between i386 and x86-64
[PATCH] x86-64: Remove CONFIG_REORDER
[PATCH] x86-64: Print type and size correctly for unknown compat ioctls
[PATCH] i386: Remove copy_*_user BUG_ONs for (size < 0)
[PATCH] i386: Little cleanups in smpboot.c
[PATCH] x86-64: Don't enable NUMA for a single node in K8 NUMA scanning
[PATCH] x86: Use RDTSCP for synchronous get_cycles if possible
[PATCH] i386: Add X86_FEATURE_RDTSCP
[PATCH] i386: Implement X86_FEATURE_SYNC_RDTSC on i386
[PATCH] i386: Implement alternative_io for i386
...
Fix up trivial conflict in include/linux/highmem.h manually.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We used to BUG_ON() for a badly mapped IO port, which is certainly
correct, but actually made it harder to debug the case where the ATA
drivers had incorrectly mapped a nonconnected ATA port.
So make badly mapped ports trigger a WARN_ON(), and throw the IO away
instead (and return all ones for reads). For things like broken driver
initialization - which is the most likely cause anyway - that should
mean that the machine comes up and is usable (at least that was the case
for the ATA breakage that triggered this patch).
It tends to be a whole lot easier to do a "dmesg" on a working machine
than to try to capture logs off a dead one.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (49 commits)
[SCTP]: Set assoc_id correctly during INIT collision.
[SCTP]: Re-order SCTP initializations to avoid race with sctp_rcv()
[SCTP]: Fix the SO_REUSEADDR handling to be similar to TCP.
[SCTP]: Verify all destination ports in sctp_connectx.
[XFRM] SPD info TLV aggregation
[XFRM] SAD info TLV aggregationx
[AF_RXRPC]: Sort out MTU handling.
[AF_IUCV/IUCV] : Add missing section annotations
[AF_IUCV]: Implementation of a skb backlog queue
[NETLINK]: Remove bogus BUG_ON
[IPV6]: Some cleanups in include/net/ipv6.h
[TCP]: zero out rx_opt in tcp_disconnect()
[BNX2]: Fix TSO problem with small MSS.
[NET]: Rework dev_base via list_head (v3)
[TCP] Highspeed: Limited slow-start is nowadays in tcp_slow_start
[BNX2]: Update version and reldate.
[BNX2]: Print bus information for PCIE devices.
[BNX2]: Add 1-shot MSI handler for 5709.
[BNX2]: Restructure PHY event handling.
[BNX2]: Add indirect spinlock.
...
Make the match_*() functions take a const pointer to the options table
and make strings pointers in the options table const too.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to work on cleaning up the relationship between kobjects, ksets and
ktypes. The removal of 'struct subsystem' is the first step of this,
especially as it is not really needed at all.
Thanks to Kay for fixing the bugs in this patch.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The following patch adds some extra clarification to the CONFIG_DEBUG_INFO
Kconfig help text. The current text is mostly a recursive definition and
doesn't really say much of anything. When I first read this I thought it
was going to enable extra verbosity in debug messages or something, but it
is only enabling the "gcc -g" compile option in the Makefile.
Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
inflate_dynamic() has piggy stack usage too, so heap allocate it too.
I'm not sure it actually gets used, but it shows up large in "make
checkstack".
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
inflate_fixed and huft_build together use around 2.7k of stack. When
using 4k stacks, I saw stack overflows from interrupts arriving while
unpacking the root initrd:
do_IRQ: stack overflow: 384
[<c0106b64>] show_trace_log_lvl+0x1a/0x30
[<c01075e6>] show_trace+0x12/0x14
[<c010763f>] dump_stack+0x16/0x18
[<c0107ca4>] do_IRQ+0x6d/0xd9
[<c010202b>] xen_evtchn_do_upcall+0x6e/0xa2
[<c0106781>] xen_hypervisor_callback+0x25/0x2c
[<c010116c>] xen_restore_fl+0x27/0x29
[<c0330f63>] _spin_unlock_irqrestore+0x4a/0x50
[<c0117aab>] change_page_attr+0x577/0x584
[<c0117b45>] kernel_map_pages+0x8d/0xb4
[<c016a314>] cache_alloc_refill+0x53f/0x632
[<c016a6c2>] __kmalloc+0xc1/0x10d
[<c0463d34>] malloc+0x10/0x12
[<c04641c1>] huft_build+0x2a7/0x5fa
[<c04645a5>] inflate_fixed+0x91/0x136
[<c04657e2>] unpack_to_rootfs+0x5f2/0x8c1
[<c0465acf>] populate_rootfs+0x1e/0xe4
(This was under Xen, but there's no reason it couldn't happen on bare
hardware.)
This patch mallocs the local variables, thereby reducing the stack
usage to sane levels.
Also, up the heap size for the kernel decompressor to deal with the
extra allocation.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Tim Yamin <plasmaroo@gentoo.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
For the CAFÉ NAND controller, we need to support non-canonical
representations of the Galois field. Allow the caller to provide its own
function for generating the field, and CAFÉ can use rslib instead of its
own implementation.
Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Add a kvasprintf() function to complement kasprintf().
No in-tree users yet, but I have some coming up.
[akpm@linux-foundation.org: EXPORT it]
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Keir Fraser <keir@xensource.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Implement pcim_iounmap_regions() - the opposite of
pcim_iomap_regions().
Signed-off-by: Tejun heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This patch contains the overdue removal of the mount/umount uevents.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This dots some i's and crosses some t's after left over from when
kobject_kset_add_dir was built from kobject_add_dir.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
It isn't used at all by the driver core anymore, and the few usages of
it within the kernel have now all been fixed as most of them were using
it incorrectly. So remove it.
Now the whole struct subsys can be removed from the system, but that's
for a later patch...
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We leak a reference if we attempt to add a kobject with no name.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Collapses a do..while() loop within an if() to a simple while() loop for
simplicity and readability.
Signed-off-by: John Anthony Kazos Jr. <jakj@j-a-k-j.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Provide rename event for when we rename network devices.
Signed-off-by: Jean Tourrilhes <jt@hpl.hp.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
some atomic operations are only atomic, not ordered. Thus a CPU is allowed
to reorder memory references to an object to before the reference is
obtained. This fixes it.
Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
- correct function name in comments
- parrent assignment does metter only inside "if" block,
so move it inside this block.
Signed-off-by: Monakhov Dmitriy <dmonakhov@openvz.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
- uses a kset in "struct class" to keep track of all directories
belonging to this class
- merges with the /sys/devices/virtual logic.
- removes the namespace-dir if the last member of that class
leaves the directory.
There may be locking or refcounting fixes left, I stopped when it seemed
to work with network and sound modules. :)
From: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: (67 commits)
[SCSI] SUNESP: Complete driver rewrite to version 2.0
[SPARC64]: Convert PCI over to generic struct iommu/strbuf.
[SPARC]: device_node name constification fallout
[SPARC64]: Convert SBUS over to generic iommu/strbuf structs.
[SPARC64]: Add generic iommu and strbuf structs to iommu.h
[SPARC64]: Consolidate {sbus,pci}_iommu_arena.
[SPARC]: Make device_node name and type const
[SPARC64]: constify some paramaters of OF routines
[TIGON3]: of_get_property() returns const.
[SPARC64]: Fix PCI rework to adhere to of_get_property() const return.
[SPARC64]: Document and fix calculation of pages_avail.
[SPARC64]: Make sure pbm->prom_node is setup easly enough in psycho.c
[SPARC64]: Use bootmem_bootmap_pages() in choose_bootmap_pfn().
[SPARC64]: Add proper header file extern for cmdline_memory_size.
[SPARC64]: Kill sparc_ultra_dump_{i,d}tlb()
[SPARC64]: Use DECLARE_BITMAP and BITS_TO_LONGS in mm/init.c
[SPARC64]: Give move verbose show_mem() output just like i386.
[SPARC64]: Mark show_mem() printk's with KERN_INFO.
[SPARC64]: Kill kvaddr_to_phys() and friends.
[SPARC64]: Privatize sun4u_get_pte() and fix name.
...
Stacktrace support on MIPS doesn't use frame pointers. Since this option
considerably increases the size of the kernel code, force lockdep to not
use it.
Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Switch cb_lock to mutex and allow netlink kernel users to override it
with a subsystem specific mutex for consistent locking in dump callbacks.
All netlink_dump_start users have been audited not to rely on any
side-effects of the previously used spinlock.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow s390 to properly override the generic
__div64_32() implementation by:
1) Using obj-y for div64.o in s390's makefile instead
of lib-y
2) Adding the weak attribute to the generic implementation.
Signed-off-by: David S. Miller <davem@davemloft.net>
Minor optimization of div64_64. do_div() already does optimization
for the case of 32 by 32 divide, so no need to do it here.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Here is the current version of the 64 bit divide common code.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
If error happen we jump to "out" label, in this case new_device not yet
became the parent but it wasn't putted.
Signed-off-by: Monakhov Dmitriy <dmonakhov@openvz.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
lib/genalloc.c: In function 'gen_pool_alloc':
lib/genalloc.c:151: warning: passing argument 2 of '__set_bit' from incompatible pointer type
lib/genalloc.c: In function 'gen_pool_free':
lib/genalloc.c:190: warning: passing argument 2 of '__clear_bit' from incompatible pointer type
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is no prompt for CONFIG_STACKTRACE, so FAULT_INJECTION cannot be
selected without LOCKDEP enabled. (found by Paolo 'Blaisorblade'
Giarrusso)
In order to fix such broken Kconfig dependency, this patch splits up the
stacktrace filter support for fault injection by new Kconfig option, which
enables to use fault injection on the architecture which doesn't have
general stacktrace support.
Cc: "Paolo 'Blaisorblade' Giarrusso" <blaisorblade@yahoo.it>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We frequently need the maximum number of possible processors in order to
allocate arrays for all processors. So far this was done using
highest_possible_processor_id(). However, we do need the number of
processors not the highest id. Moreover the number was so far dynamically
calculated on each invokation. The number of possible processors does not
change when the system is running. We can therefore calculate that number
once.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Frederik Deweerdt <frederik.deweerdt@gmail.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (25 commits)
Documentation/kernel-docs.txt update.
arch/cris: typo in KERN_INFO
Storage class should be before const qualifier
kernel/printk.c: comment fix
update I/O sched Kconfig help texts - CFQ is now default, not AS.
Remove duplicate listing of Cris arch from README
kbuild: more doc. cleanups
doc: make doc. for maxcpus= more visible
drivers/net/eexpress.c: remove duplicate comment
add a help text for BLK_DEV_GENERIC
correct a dead URL in the IP_MULTICAST help text
fix the BAYCOM_SER_HDX help text
fix SCSI_SCAN_ASYNC help text
trivial documentation patch for platform.txt
Fix typos concerning hierarchy
Fix comment typo "spin_lock_irqrestore".
Fix misspellings of "agressive".
drivers/scsi/a100u2w.c: trivial typo patch
Correct trivial typo in log2.h.
Remove useless FIND_FIRST_BIT() macro from cardbus.c.
...
Correct mis-spellings of "algorithm", "appear", "consistent" and
(shame, shame) "kernel".
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
The function 'kobject_add' tries to verify the name of
a new kobject instance is properly set before continuing.
if (!kobj->k_name)
kobj->k_name = kobj->name;
if (!kobj->k_name) {
pr_debug("kobject attempted to be registered with no name!\n");
WARN_ON(1);
return -EINVAL;
}
The statement:
if (!kobj->k_name) {
pr_debug("kobject attempted to be registered with no name!\n");
WARN_ON(1);
return -EINVAL;
}
is useless the way it is right now, because it can never be true. I
think the
code was intended to be:
if (!kobj->k_name)
kobj->k_name = kobj->name;
if (!*kobj->k_name) {
pr_debug("kobject attempted to be registered with no name!\n");
WARN_ON(1);
return -EINVAL;
}
because this would make sure the kobj->name buffer has something in it.
So the missing '*' is just a typo. Although, I would much prefer
expression like:
if (*kobj->k_name == '\0') {
pr_debug("kobject attempted to be registered with no name!\n");
WARN_ON(1);
return -EINVAL;
}
because this would've made the intention clear, in this patch I just restore
the missing '*' without changing the coding style of the function.
Signed-off-by: Martin Stoilov <mstoilov@odesys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
It appears that the pcim_iomap_regions() function doesn't get the error
handling right. It BUGs early at boot with a backtrace along the lines of:
ahci_init
pci_register_driver
driver_register
[...]
ahci_init_one
pcim_iomap_region
pcim_iounmap
The following patch allows me to boot. Only the if(mask..) continue;
part fixes the problem actually, the gotos where changed so that we
don't try to unmap something we couldn't map anyway.
Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Drivers registering IRQ handlers with SA_SHIRQ really ought to be able to
handle an interrupt happening before request_irq() returns. They also
ought to be able to handle an interrupt happening during the start of their
call to free_irq(). Let's test that hypothesis....
[bunk@stusta.de: Kconfig fixes]
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The return value of scnprintf() never exceeds @size.
Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Split the implementation-agnostic stuff in separate files.
* Make sure that targets using non-default request_irq() pull
kernel/irq/devres.o
* Introduce new symbols (HAS_IOPORT and HAS_IOMEM) defaulting to positive;
allow architectures to turn them off (we needed these symbols anyway for
dependencies of quite a few drivers).
* protect the ioport-related parts of lib/devres.o with CONFIG_HAS_IOPORT.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the few references to the obsolete kernel config option
DEBUG_RWSEMS.
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Part of long forgotten patch
http://groups.google.com/group/fa.linux.kernel/msg/e98e941ce1cf29f6?dmode=source
Since then, m32r grabbed two copies.
Leave s390 copy because of important absence of CONFIG_VT, but remove
references to non-existent timerlist_lock. ia64 also loses timerlist_lock.
Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A variety of (mostly) innocuous fixes to the embedded kernel-doc content in
source files, including:
* make multi-line initial descriptions single line
* denote some function names, constants and structs as such
* change erroneous opening '/*' to '/**' in a few places
* reword some text for clarity
Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
devres change moved iomap.o from obj-$(CONFIG_GENERIC_IOMAP) to lib-y
making it not linked if no in-kernel driver uses it. Fix it.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Implement pcim_iomap_regions(). This function takes mask of BARs to
request and iomap. No BAR should have length of zero. BARs are
iomapped using pcim_iomap_table().
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Implement device resource management, in short, devres. A device
driver can allocate arbirary size of devres data which is associated
with a release function. On driver detach, release function is
invoked on the devres data, then, devres data is freed.
devreses are typed by associated release functions. Some devreses are
better represented by single instance of the type while others need
multiple instances sharing the same release function. Both usages are
supported.
devreses can be grouped using devres group such that a device driver
can easily release acquired resources halfway through initialization
or selectively release resources (e.g. resources for port 1 out of 4
ports).
This patch adds devres core including documentation and the following
managed interfaces.
* alloc/free : devm_kzalloc(), devm_kzfree()
* IO region : devm_request_region(), devm_release_region()
* IRQ : devm_request_irq(), devm_free_irq()
* DMA : dmam_alloc_coherent(), dmam_free_coherent(),
dmam_declare_coherent_memory(), dmam_pool_create(),
dmam_pool_destroy()
* PCI : pcim_enable_device(), pcim_pin_device(), pci_is_managed()
* iomap : devm_ioport_map(), devm_ioport_unmap(), devm_ioremap(),
devm_ioremap_nocache(), devm_iounmap(), pcim_iomap_table(),
pcim_iomap(), pcim_iounmap()
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The problem. When implementing a network namespace I need to be able
to have multiple network devices with the same name. Currently this
is a problem for /sys/class/net/*.
What I want is a separate /sys/class/net directory in sysfs for each
network namespace, and I want to name each of them /sys/class/net.
I looked and the VFS actually allows that. All that is needed is
for /sys/class/net to implement a follow link method to redirect
lookups to the real directory you want.
Implementing a follow link method that is sensitive to the current
network namespace turns out to be 3 lines of code so it looks like a
clean approach. Modifying sysfs so it doesn't get in my was is a bit
trickier.
I am calling the concept of multiple directories all at the same path
in the filesystem shadow directories. With the directory entry really
at that location the shadow master.
The following patch modifies sysfs so it can handle a directory
structure slightly different from the kobject tree so I can implement
the shadow directories for handling /sys/class/net/.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If we allow NULL as the new parent in device_move(), we need to make sure
that the device is placed into the same place as it would if it was
newly registered:
- Consider the device virtual tree. In order to be able to reuse code,
setup_parent() has been tweaked a bit.
- kobject_move() can fall back to the kset's kobject.
- sysfs_move_dir() uses the sysfs root dir as fallback.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
It should be ok to pass in NULL for some kobject functions, so add error
checking for all exported kobject functions to be more robust.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add abstraction so that the file can be used by environments other than IA64
and EM64T, namely for Xen.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
- add proper __init decoration to swiotlb's init code (and the code calling
it, where not already the case)
- replace uses of 'unsigned long' with dma_addr_t where appropriate
- do miscellaneous simplicfication and cleanup
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Convert all phys_to_virt/virt_to_phys uses to bus_to_virt/virt_to_bus, as is
what is meant and what is needed in (at least) some virtualized environments
like Xen.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch fixes
- marking I-cache clean of pages DMAed to now only done for IA64
- broken multiple inclusion in include/asm-x86_64/swiotlb.h
- missing call to mark_clean in swiotlb_sync_sg()
- a (perhaps only theoretical) issue in swiotlb_dma_supported() when
io_tlb_end is exactly at the end of memory
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Since kobject_uevent() function does not return an integer value to
indicate if its operation was completed with success or not, it is worth
changing it in order to report a proper status (success or error) instead
of returning void.
[randy.dunlap@oracle.com: Fix inline kobject functions]
Cc: Mauricio Lin <mauriciolin@gmail.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
With WARN_ON addition to kobject_init()
[ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19/2.6.19-mm1/dont-use/broken-out/gregkh-driver-kobject-warn.patch ]
I started seeing following WARNING on CPU offline followed by online on my
x86_64 system.
WARNING at lib/kobject.c:172 kobject_init()
Call Trace:
[<ffffffff8020ab45>] dump_trace+0xaa/0x3ef
[<ffffffff8020aec4>] show_trace+0x3a/0x50
[<ffffffff8020b0f6>] dump_stack+0x15/0x17
[<ffffffff80350abc>] kobject_init+0x3f/0x8a
[<ffffffff80350be1>] kobject_register+0x1a/0x3e
[<ffffffff803bbd89>] sysdev_register+0x5b/0xf9
[<ffffffff80211d0b>] mce_create_device+0x77/0xf4
[<ffffffff80211dc2>] mce_cpu_callback+0x3a/0xe5
[<ffffffff805632fd>] notifier_call_chain+0x26/0x3b
[<ffffffff8023f6f3>] raw_notifier_call_chain+0x9/0xb
[<ffffffff802519bf>] _cpu_up+0xb4/0xdc
[<ffffffff80251a12>] cpu_up+0x2b/0x42
[<ffffffff803bef00>] store_online+0x4a/0x72
[<ffffffff803bb6ce>] sysdev_store+0x24/0x26
[<ffffffff802baaa2>] sysfs_write_file+0xcf/0xfc
[<ffffffff8027fc6f>] vfs_write+0xae/0x154
[<ffffffff80280418>] sys_write+0x47/0x6f
[<ffffffff8020963e>] system_call+0x7e/0x83
DWARF2 unwinder stuck at system_call+0x7e/0x83
Leftover inexact backtrace:
This is a false positive as mce.c is unregistering/registering sysfs
interfaces cleanly on hotplug.
kref_put() and conditional decrement of refcnt seems to be the root cause
for this and the patch below resolves the issue for me.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
It has caused more problems than it ever really solved, and is
apparently not getting cleaned up and fixed. We can put it back when
it's stable and isn't likely to make warning or bug events worse.
In the meantime, enable frame pointers for more readable stack traces.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove useless includes of linux/io.h, don't even try to build iomap_copy
on uml (it doesn't have readb() et.al., so...)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When some objects are allocated by one CPU but freed by another CPU we can
consume lot of cycles doing divides in obj_to_index().
(Typical load on a dual processor machine where network interrupts are
handled by one particular CPU (allocating skbufs), and the other CPU is
running the application (consuming and freeing skbufs))
Here on one production server (dual-core AMD Opteron 285), I noticed this
divide took 1.20 % of CPU_CLK_UNHALTED events in kernel. But Opteron are
quite modern cpus and the divide is much more expensive on oldest
architectures :
On a 200 MHz sparcv9 machine, the division takes 64 cycles instead of 1
cycle for a multiply.
Doing some math, we can use a reciprocal multiplication instead of a divide.
If we want to compute V = (A / B) (A and B being u32 quantities)
we can instead use :
V = ((u64)A * RECIPROCAL(B)) >> 32 ;
where RECIPROCAL(B) is precalculated to ((1LL << 32) + (B - 1)) / B
Note :
I wrote pure C code for clarity. gcc output for i386 is not optimal but
acceptable :
mull 0x14(%ebx)
mov %edx,%eax // part of the >> 32
xor %edx,%edx // useless
mov %eax,(%esp) // could be avoided
mov %edx,0x4(%esp) // useless
mov (%esp),%ebx
[akpm@osdl.org: small cleanups]
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add MODULE_* attributes to the new bit reversal library. Most notably
MODULE_LICENSE which prevents superfluous kernel tainting.
Signed-off-by: Cal Peake <cp@absolutedigital.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Refactor Kconfig content to maximize nesting of menus by menuconfig and
xconfig.
Tested by simultaneously running `make xconfig` with and without
patch, and comparing displays.
Signed-off-by: Don Mullis <dwm@meer.net>
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Trivial optimization and simplification of should_fail().
Do cheaper disqualification tests first (performance gain not quantified).
Simplify logic; eliminate goto.
Signed-off-by: Don Mullis <dwm@meer.net>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Clamp /debug/fail*/stacktrace-depth to MAX_STACK_TRACE_DEPTH. Ensures that a
read of /debug/fail*/stacktrace-depth always returns a truthful answer.
Signed-off-by: Don Mullis <dwm@meer.net>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use bool-true-false throughout.
Signed-off-by: Don Mullis <dwm@meer.net>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
`select' doesn't work very well. With alpha `make allmodconfig' we end up
with CONFIG_STACKTRACE enabled, so we end up with undefined save_stacktrace()
at link time.
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Don Mullis <dwm@meer.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- Fix some spelling and grammatical errors
- Make the Kconfig menu more conventional. First you select
fault-injection, then under that you select particular clients of it.
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Don Mullis <dwm@meer.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch provides stacktrace filtering feature.
The stacktrace filter allows failing only for the caller you are
interested in.
For example someone may want to inject kmalloc() failures into
only e100 module. they want to inject not only direct kmalloc() call,
but also indirect allocation, too.
- e100_poll --> netif_receive_skb --> packet_rcv_spkt --> skb_clone
--> kmem_cache_alloc
This patch enables to detect function calls like this by stacktrace
and inject failures. The script Documentaion/fault-injection/failmodule.sh
helps it.
The range of text section of loaded e100 is expected to be
[/sys/module/e100/sections/.text, /sys/module/e100/sections/.exit.text)
So failmodule.sh stores these values into /debug/failslab/address-start
and /debug/failslab/address-end. The maximum stacktrace depth is specified
by /debug/failslab/stacktrace-depth.
Please see the example that demonstrates how to inject slab allocation
failures only for a specific module
in Documentation/fault-injection/fault-injection.txt
[dwm@meer.net: reject failure if any caller lies within specified range]
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Don Mullis <dwm@meer.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch provides process filtering feature.
The process filter allows failing only permitted processes
by /proc/<pid>/make-it-fail
Please see the example that demostrates how to inject slab allocation
failures into module init/cleanup code
in Documentation/fault-injection/fault-injection.txt
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch provides fault-injection capability for disk IO.
Boot option:
fail_make_request=<probability>,<interval>,<space>,<times>
<interval> -- specifies the interval of failures.
<probability> -- specifies how often it should fail in percent.
<space> -- specifies the size of free space where disk IO can be issued
safely in bytes.
<times> -- specifies how many times failures may happen at most.
Debugfs:
/debug/fail_make_request/interval
/debug/fail_make_request/probability
/debug/fail_make_request/specifies
/debug/fail_make_request/times
Example:
fail_make_request=10,100,0,-1
echo 1 > /sys/blocks/hda/hda1/make-it-fail
generic_make_request() on /dev/hda1 fails once per 10 times.
Cc: Jens Axboe <axboe@suse.de>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch provides fault-injection capability for alloc_pages()
Boot option:
fail_page_alloc=<interval>,<probability>,<space>,<times>
<interval> -- specifies the interval of failures.
<probability> -- specifies how often it should fail in percent.
<space> -- specifies the size of free space where memory can be
allocated safely in pages.
<times> -- specifies how many times failures may happen at most.
Debugfs:
/debug/fail_page_alloc/interval
/debug/fail_page_alloc/probability
/debug/fail_page_alloc/specifies
/debug/fail_page_alloc/times
/debug/fail_page_alloc/ignore-gfp-highmem
/debug/fail_page_alloc/ignore-gfp-wait
Example:
fail_page_alloc=10,100,0,-1
The page allocation (alloc_pages(), ...) fails once per 10 times.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch provides fault-injection capability for kmalloc.
Boot option:
failslab=<interval>,<probability>,<space>,<times>
<interval> -- specifies the interval of failures.
<probability> -- specifies how often it should fail in percent.
<space> -- specifies the size of free space where memory can be
allocated safely in bytes.
<times> -- specifies how many times failures may happen at most.
Debugfs:
/debug/failslab/interval
/debug/failslab/probability
/debug/failslab/specifies
/debug/failslab/times
/debug/failslab/ignore-gfp-highmem
/debug/failslab/ignore-gfp-wait
Example:
failslab=10,100,0,-1
slab allocation (kmalloc(), kmem_cache_alloc(),..) fails once per 10 times.
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch provides base functions implement to fault-injection
capabilities.
- The function should_fail() is taken from failmalloc-1.0
(http://www.nongnu.org/failmalloc/)
[akpm@osdl.org: cleanups, comments, add __init]
Cc: <okuji@enbug.org>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Don Mullis <dwm@meer.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch replaces bitreverse() by bitrev32. The only users of bitreverse()
are crc32 itself and via-velocity.
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch provides two bit reverse functions and bit reverse table.
- reverse the order of bits in a u32 value
u8 bitrev8(u8 x);
- reverse the order of bits in a u32 value
u32 bitrev32(u32 x);
- byte reverse table
const u8 byte_rev_table[256];
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This makes i386 use the generic BUG machinery. There are no functional
changes from the old i386 implementation.
The main advantage in using the generic BUG machinery for i386 is that the
inlined overhead of BUG is just the ud2a instruction; the file+line(+function)
information are no longer inlined into the instruction stream. This reduces
cache pollution, and makes disassembly work properly.
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Andi Kleen <ak@muc.de>
Cc: Hugh Dickens <hugh@veritas.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch adds common handling for kernel BUGs, for use by architectures as
they wish. The code is derived from arch/powerpc.
The advantages of having common BUG handling are:
- consistent BUG reporting across architectures
- shared implementation of out-of-line file/line data
- implement CONFIG_DEBUG_BUGVERBOSE consistently
This means that in inline impact of BUG is just the illegal instruction
itself, which is an improvement for i386 and x86-64.
A BUG is represented in the instruction stream as an illegal instruction,
which has file/line information associated with it. This extra information is
stored in the __bug_table section in the ELF file.
When the kernel gets an illegal instruction, it first confirms it might
possibly be from a BUG (ie, in kernel mode, the right illegal instruction).
It then calls report_bug(). This searches __bug_table for a matching
instruction pointer, and if found, prints the corresponding file/line
information. If report_bug() determines that it wasn't a BUG which caused the
trap, it returns BUG_TRAP_TYPE_NONE.
Some architectures (powerpc) implement WARN using the same mechanism; if the
illegal instruction was the result of a WARN, then report_bug(Q) returns
CONFIG_DEBUG_BUGVERBOSE; otherwise it returns BUG_TRAP_TYPE_BUG.
lib/bug.c keeps a list of loaded modules which can be searched for __bug_table
entries. The architecture must call
module_bug_finalize()/module_bug_cleanup() from its corresponding
module_finalize/cleanup functions.
Unsetting CONFIG_DEBUG_BUGVERBOSE will reduce the kernel size by some amount.
At the very least, filename and line information will not be recorded for each
but, but architectures may decide to store no extra information per BUG at
all.
Unfortunately, gcc doesn't have a general way to mark an asm() as noreturn, so
architectures will generally have to include an infinite loop (or similar) in
the BUG code, so that gcc knows execution won't continue beyond that point.
gcc does have a __builtin_trap() operator which may be useful to achieve the
same effect, unfortunately it cannot be used to actually implement the BUG
itself, because there's no way to get the instruction's address for use in
generating the __bug_table entry.
[randy.dunlap@oracle.com: Handle BUG=n, GENERIC_BUG=n to prevent build errors]
[bunk@stusta.de: include/linux/bug.h must always #include <linux/module.h]
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Andi Kleen <ak@muc.de>
Cc: Hugh Dickens <hugh@veritas.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make the locking self-test failures (of 'FAILURE' type) easier to debug by
printing more information.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Always build hweight8/16/32/64() functions into the kernel so that loadable
modules may use them.
I didn't remove GENERIC_HWEIGHT since ALPHA_EV67, ia64, and some variants
of UltraSparc(64) provide their own hweight functions.
Fixes config/build problems with NTFS=m and JOYSTICK_ANALOG=m.
Kernel: arch/x86_64/boot/bzImage is ready (#19)
Building modules, stage 2.
MODPOST 94 modules
WARNING: "hweight32" [fs/ntfs/ntfs.ko] undefined!
WARNING: "hweight16" [drivers/input/joystick/analog.ko] undefined!
WARNING: "hweight8" [drivers/input/joystick/analog.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There was lots of #ifdef noise in the kernel due to hotcpu_notifier(fn,
prio) not correctly marking 'fn' as used in the !HOTPLUG_CPU case, and thus
generating compiler warnings of unused symbols, hence forcing people to add
#ifdefs.
the compiler can skip truly unused functions just fine:
text data bss dec hex filename
1624412 728710 3674856 6027978 5bfaca vmlinux.before
1624412 728710 3674856 6027978 5bfaca vmlinux.after
[akpm@osdl.org: topology.c fix]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This allows a hyphenated range of positive numbers in the string passed
to command line helper function, get_options.
Currently the command line option "isolcpus=" takes as its argument a
list of cpus.
Format: <cpu number>,...,<cpu number>
Valid values of <cpu_number> include all cpus, 0 to "number of CPUs in
system - 1". This can get extremely long when isolating the majority of
cpus on a large system. The kernel isolcpus code would not need any
changing to use this feature. To use it, the change would be in the
command line format for 'isolcpus='
Format:
<cpu number>,...,<cpu number>
or
<cpu number>-<cpu number> (must be a positive range in ascending
order.)
or a mixture
<cpu number>,...,<cpu number>-<cpu number>
Signed-off-by: Derek Fults <dfults@sgi.com>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Print the other (hopefully) known good pointer when list_head debugging
too, which may yield additional clues.
Also fix for 80-columns to win akpm brownie points.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make PRINTK_TIME depend on PRINTK. Only display/offer it if PRINTK is
enabled.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make radix tree lookups safe to be performed without locks. Readers are
protected against nodes being deleted by using RCU based freeing. Readers
are protected against new node insertion by using memory barriers to ensure
the node itself will be properly written before it is visible in the radix
tree.
Each radix tree node keeps a record of their height (above leaf nodes).
This height does not change after insertion -- when the radix tree is
extended, higher nodes are only inserted in the top. So a lookup can take
the pointer to what is *now* the root node, and traverse down it even if
the tree is concurrently extended and this node becomes a subtree of a new
root.
"Direct" pointers (tree height of 0, where root->rnode points directly to
the data item) are handled by using the low bit of the pointer to signal
whether rnode is a direct pointer or a pointer to a radix tree node.
When a reader wants to traverse the next branch, they will take a copy of
the pointer. This pointer will be either NULL (and the branch is empty) or
non-NULL (and will point to a valid node).
[akpm@osdl.org: cleanups]
[Lee.Schermerhorn@hp.com: bugfixes, comments, simplifications]
[clameter@sgi.com: build fix]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Replace all uses of kmem_cache_t with struct kmem_cache.
The patch was generated using the following script:
#!/bin/sh
#
# Replace one string by another in all the kernel sources.
#
set -e
for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
quilt add $file
sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
mv /tmp/$$ $file
quilt refresh
done
The script was run like this
sh replace kmem_cache_t "struct kmem_cache"
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When a spinlock lockup occurs, arrange for the NMI code to emit an all-cpu
backtrace, so we get to see which CPU is holding the lock, and where.
Cc: Andi Kleen <ak@muc.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
* master.kernel.org:/pub/scm/linux/kernel/git/paulus/powerpc: (194 commits)
[POWERPC] Add missing EXPORTS for mpc52xx support
[POWERPC] Remove obsolete PPC_52xx and update CLASSIC32 comment
[POWERPC] ps3: add a default zImage target
[POWERPC] Add of_platform_bus support to mpc52xx psc uart driver
[POWERPC] typo fix and whitespace cleanup on mpc52xx-uart driver
[POWERPC] Fix debug printks for 32-bit resources in the PCI code
[POWERPC] Replace kmalloc+memset with kzalloc
[POWERPC] Linkstation / kurobox support
[POWERPC] Add the e300c3 core to the CPU table.
[POWERPC] ppc: m48t35 add missing bracket
[POWERPC] iSeries: don't build head_64.o unnecessarily
[POWERPC] iSeries: stop dt_mod.o being rebuilt unnecessarily
[POWERPC] Fix cputable.h for combined build
[POWERPC] Allow CONFIG_BOOTX_TEXT on iSeries
[POWERPC] Allow xmon to build on legacy iSeries
[POWERPC] Change ppc64_defconfig to use AUTOFS_V4 not V3
[POWERPC] Tell firmware we can handle POWER6 compatible mode
[POWERPC] Clean images in arch/powerpc/boot
[POWERPC] Fix OF pci flags parsing
[POWERPC] defconfig for lite5200 board
...
Allow architectures to provide their own implementation of the big endian MMIO
accessors and "repeat" MMIO accessors for use by the generic iomap.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
More-or-less-tested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: (36 commits)
Driver core: show drivers in /sys/module/
Documentation/driver-model/platform.txt update/rewrite
Driver core: platform_driver_probe(), can save codespace
driver core: Use klist_remove() in device_move()
driver core: Introduce device_move(): move a device to a new parent.
Driver core: make drivers/base/core.c:setup_parent() static
driver core: Introduce device_find_child().
sysfs: sysfs_write_file() writes zero terminated data
cpu topology: consider sysfs_create_group return value
Driver core: Call platform_notify_remove later
ACPI: Change ACPI to use dev_archdata instead of firmware_data
Driver core: add dev_archdata to struct device
Driver core: convert sound core to use struct device
Driver core: change mem class_devices to be real devices
Driver core: convert fb code to use struct device
Driver core: convert firmware code to use struct device
Driver core: convert mmc code to use struct device
Driver core: convert ppdev code to use struct device
Driver core: convert PPP code to use struct device
Driver core: convert cpuid code to use struct device
...
Provide a function device_move() to move a device to a new parent device. Add
auxilliary functions kobject_move() and sysfs_move_dir().
kobject_move() generates a new uevent of type KOBJ_MOVE, containing the
previous path (DEVPATH_OLD) in addition to the usual values. For this, a new
interface kobject_uevent_env() is created that allows to add further
environmental data to the uevent at the kobject layer.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Changes persistant -> persistent. www.dictionary.com does not know
persistant (with an A), but should it be one of those things you can
spell in more than one correct way, let me know.
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 3914/1: [Jornada7xx] - Typo Fix in cpu-sa1110.c (b != B)
[ARM] 3913/1: n2100: fix IRQ routing for second ethernet port
[ARM] Add KBUILD_IMAGE target support
[ARM] Fix suspend oops caused by PXA2xx PCMCIA driver
[ARM] Fix i2c-pxa slave mode support
[ARM] 3900/1: Fix VFP Division by Zero exception handling.
[ARM] 3899/1: Fix the normalization of the denormal double precision number.
[ARM] 3909/1: Disable UWIND_INFO for ARM (again)
[ARM] Add __must_check to uaccess functions
[ARM] Add realview SMP default configuration
[ARM] Fix SMP irqflags support
strstrip() does not remove the last blank from strings which only consist
of blanks.
Example:
char string[] = " ";
strstrip(string);
results in " ", but should produce an empty string!
The following patch solves this problem:
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by Joern Engel <joern@wh.fh-wedel.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
According to Daniel Jacobowitz, UNWIND_INFO is not useful on ARM, and
in fact doesn't even compile.
This patch disables the option for ARM.
Signed-off-by: Kevin Hilman <khilman@mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Qooting Adrian:
- net/sunrpc/svc.c uses highest_possible_node_id()
- include/linux/nodemask.h says highest_possible_node_id() is
out-of-line #if MAX_NUMNODES > 1
- the out-of-line highest_possible_node_id() is in lib/cpumask.c
- lib/Makefile: lib-$(CONFIG_SMP) += cpumask.o
CONFIG_ARCH_DISCONTIGMEM_ENABLE=y, CONFIG_SMP=n, CONFIG_SUNRPC=y
-> highest_possible_node_id() is used in net/sunrpc/svc.c
CONFIG_NODES_SHIFT defined and > 0
-> include/linux/numa.h: MAX_NUMNODES > 1
-> compile error
The bug is not present on architectures where ARCH_DISCONTIGMEM_ENABLE
depends on NUMA (but m32r isn't the only affected architecture).
So move the function into page_alloc.c
Cc: Adrian Bunk <bunk@stusta.de>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This library function should be in obj-y and not in lib-y. But when we do
that it clashes unpleasantly with the assembly-language implementation in the
ia64 architecture.
Instead of trying to fix it all up, just remove the generic carta_random32 in
the expectation that the recently-made-generic random32() will suffice.
If/when perfmon is migrated to random32, ia64's private carta_random32
implementation can also be removed.
Cc: Stephane Eranian <eranian@hpl.hp.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make net_random() more widely available by calling it random32
akpm: hopefully this will permit the removal of carta_random32. That needs
confirmation from Stephane - this code looks somewhat more computationally
expensive, and has a different (ie: callee-stateful) interface.
[akpm@osdl.org: lots of build fixes, cleanups]
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Stephane Eranian <eranian@hpl.hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
lib/bitmap.c:bitmap_parse() is a library function that received as input a
user buffer. This seemed to have originated from the way the write_proc
function of the /proc filesystem operates.
This has been reworked to not use kmalloc and eliminates a lot of
get_user() overhead by performing one access_ok before using __get_user().
We need to test if we are in kernel or user space (is_user) and access the
buffer differently. We cannot use __get_user() to access kernel addresses
in all cases, for example in architectures with separate address space for
kernel and user.
This function will be useful for other uses as well; for example, taking
input for /sysfs instead of /proc, so it was changed to accept kernel
buffers. We have this use for the Linux UWB project, as part as the
upcoming bandwidth allocator code.
Only a few routines used this function and they were changed too.
Signed-off-by: Reinette Chatre <reinette.chatre@linux.intel.com>
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Joe Korty <joe.korty@ccur.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is a follow-up patch based on the review for perfmon2. This patch
adds the carta_random32() library routine + carta_random32.h header file.
This is fast, simple, and efficient pseudo number generator algorithm. We
use it in perfmon2 to randomize the sampling periods. In this context, we
do not need any fancy randomizer.
Signed-off-by: stephane eranian <eranian@hpl.hp.com>
Cc: David Mosberger <david.mosberger@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In order to encourage people to notice when they break the exported
headers, add a config option which automatically runs the sanity checks
when building vmlinux. That way, those who use allyesconfig will notice
failures.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We got several false bug reports because of enabled
CONFIG_DETECT_SOFTLOCKUP. Disable soft lockup detection on s390, since it
doesn't work on a virtualized architecture.
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This annotation makes it possible to assign a subclass on lock init. This
annotation is meant to reduce the _nested() annotations by assigning a
default subclass.
One could do without this annotation and rely on lockdep_set_class()
exclusively, but that would require a manual stack of struct lock_class_key
objects.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Maintain a per-CPU global "struct pt_regs *" variable which can be used instead
of passing regs around manually through all ~1800 interrupt handlers in the
Linux kernel.
The regs pointer is used in few places, but it potentially costs both stack
space and code to pass it around. On the FRV arch, removing the regs parameter
from all the genirq function results in a 20% speed up of the IRQ exit path
(ie: from leaving timer_interrupt() to leaving do_IRQ()).
Where appropriate, an arch may override the generic storage facility and do
something different with the variable. On FRV, for instance, the address is
maintained in GR28 at all times inside the kernel as part of general exception
handling.
Having looked over the code, it appears that the parameter may be handed down
through up to twenty or so layers of functions. Consider a USB character
device attached to a USB hub, attached to a USB controller that posts its
interrupts through a cascaded auxiliary interrupt controller. A character
device driver may want to pass regs to the sysrq handler through the input
layer which adds another few layers of parameter passing.
I've build this code with allyesconfig for x86_64 and i386. I've runtested the
main part of the code on FRV and i386, though I can't test most of the drivers.
I've also done partial conversion for powerpc and MIPS - these at least compile
with minimal configurations.
This will affect all archs. Mostly the changes should be relatively easy.
Take do_IRQ(), store the regs pointer at the beginning, saving the old one:
struct pt_regs *old_regs = set_irq_regs(regs);
And put the old one back at the end:
set_irq_regs(old_regs);
Don't pass regs through to generic_handle_irq() or __do_IRQ().
In timer_interrupt(), this sort of change will be necessary:
- update_process_times(user_mode(regs));
- profile_tick(CPU_PROFILING, regs);
+ update_process_times(user_mode(get_irq_regs()));
+ profile_tick(CPU_PROFILING);
I'd like to move update_process_times()'s use of get_irq_regs() into itself,
except that i386, alone of the archs, uses something other than user_mode().
Some notes on the interrupt handling in the drivers:
(*) input_dev() is now gone entirely. The regs pointer is no longer stored in
the input_dev struct.
(*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking. It does
something different depending on whether it's been supplied with a regs
pointer or not.
(*) Various IRQ handler function pointers have been moved to type
irq_handler_t.
Signed-Off-By: David Howells <dhowells@redhat.com>
(cherry picked from 1b16e7ac850969f38b375e511e3fa2f474a33867 commit)
Many files include the filename at the beginning, serveral used a wrong one.
Signed-off-by: Uwe Zeisberger <Uwe_Zeisberger@digi.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
It is a non-standard heap-sort algorithm implementation because the index
of child node is wrong . The sort function still outputs right result, but
the performance is O( n * ( log(n) + 1 ) ) , about 10% ~ 20% worse than
standard algorithm.
Signed-off-by: keios <keios.cn@gmail.com>
Acked-by: Matt Mackall <mpm@selenic.com>
Acked-by: Zou Nan hai <nanhai.zou@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The last in-kernel user of errno is gone, so we should remove the definition
and everything referring to it. This also removes the now-unused lib/execve.c
file that was introduced earlier.
Also remove every trace of __KERNEL_SYSCALLS__ that still remained in the
kernel.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andi Kleen <ak@muc.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Richard Curnow <rc@rc0.org.uk>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Cc: Chris Zankel <chris@zankel.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The use of execve() in the kernel is dubious, since it relies on the
__KERNEL_SYSCALLS__ mechanism that stores the result in a global errno
variable. As a first step of getting rid of this, change all users to a
global kernel_execve function that returns a proper error code.
This function is a terrible hack, and a later patch removes it again after the
kernel syscalls are gone.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andi Kleen <ak@muc.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Richard Curnow <rc@rc0.org.uk>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Cc: Chris Zankel <chris@zankel.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A simple module to test Linux Kernel Dump mechanism. This module uses
jprobes to install/activate pre-defined crash points. At different crash
points, various types of crashing scenarios are created like a BUG(),
panic(), exception, recursive loop and stack overflow. The user can
activate a crash point with specific type by providing parameters at the
time of module insertion. Please see the file header for usage
information. The module is based on the Linux Kernel Dump Test Tool by
Fernando <http://lkdtt.sourceforge.net>.
This module could be merged with mainline. Jprobes is used here so that the
context in which crash point is hit, could be maintained. This implements
all the crash points as done by LKDTT except the one in the middle of
tasklet_action().
Signed-off-by: Ankita Garg <ankita@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The exported kernel interfaces of genpool allocator need to adhere to
the requirements of kernel-doc.
Signed-off-by: Dean Nelson <dcn@sgi.com>
Cc: Steve Wise <swise@opengridcomputing.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Modules using the genpool allocator need to be able to destroy the data
structure when unloading.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Dean Nelson <dcn@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The existing implementation of ioremap_page_range(), which was taken
from i386, does this:
flush_cache_all();
/* modify page tables */
flush_tlb_all();
I think this is a bit defensive, so this patch changes the generic
implementation to do:
/* modify page tables */
flush_cache_vmap(start, end);
instead, which is similar to what vmalloc() does. This should still
be correct because we never modify existing PTEs. According to
James Bottomley:
The problem the flush_tlb_all() is trying to solve is to avoid stale tlb
entries in the ioremap area. We're just being conservative by flushing
on both map and unmap. Technically what vmalloc/vfree does (only flush
the tlb on unmap) is just fine because it means that the only tlb
entries in the remap area must belong to in-use mappings.
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Andi Kleen <ak@suse.de>
Cc: <linux-m32r@ml.linux-m32r.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@parisc-linux.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch adds a generic implementation of ioremap_page_range() in
lib/ioremap.c based on the i386 implementation. It differs from the
i386 version in the following ways:
* The PTE flags are passed as a pgprot_t argument and must be
determined up front by the arch-specific code. No additional
PTE flags are added.
* Uses set_pte_at() instead of set_pte()
[bunk@stusta.de: warning fix]
]dhowells@redhat.com: nommu build fix]
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Andi Kleen <ak@suse.de>
Cc: <linux-m32r@ml.linux-m32r.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@parisc-linux.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
These two BUG_ON()s are redundant and undesired: we're checking for this
condition further on in the function, only better.
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Constify two structs.
Correct some typos.
Compile-tested and run-tested (module inserted) on 2.6.18-rc4-mm3.
Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Un-inlining rwsem_down_failed_common() (two callsites) reduced lib/rwsem.o
on my Athlon, gcc 4.1.2 from 5935 to 5480 Bytes (455 Bytes saved).
I thus guess that reduced icache footprint (and better function caching) is
worth more than any function call overhead.
Signed-off-by: Andreas Mohr <andi@lisas.de>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In spinlock_debug.c, the spinloops call __delay() on every iteration.
Because that is an external function, (jiffies_per_loop * HZ), the loop's
iteration limit, gets recomputed every time. Caching it explicitly
prevents that.
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A list_del() debugging check. Has been in -mm for years. Dave moved
list_del() out-of-line in the debug case, so this is now suitable for
mainline.
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (225 commits)
[PATCH] Don't set calgary iommu as default y
[PATCH] i386/x86-64: New Intel feature flags
[PATCH] x86: Add a cumulative thermal throttle event counter.
[PATCH] i386: Make the jiffies compares use the 64bit safe macros.
[PATCH] x86: Refactor thermal throttle processing
[PATCH] Add 64bit jiffies compares (for use with get_jiffies_64)
[PATCH] Fix unwinder warning in traps.c
[PATCH] x86: Allow disabling early pci scans with pci=noearly or disallowing conf1
[PATCH] x86: Move direct PCI scanning functions out of line
[PATCH] i386/x86-64: Make all early PCI scans dependent on CONFIG_PCI
[PATCH] Don't leak NT bit into next task
[PATCH] i386/x86-64: Work around gcc bug with noreturn functions in unwinder
[PATCH] Fix some broken white space in ia32_signal.c
[PATCH] Initialize argument registers for 32bit signal handlers.
[PATCH] Remove all traces of signal number conversion
[PATCH] Don't synchronize time reading on single core AMD systems
[PATCH] Remove outdated comment in x86-64 mmconfig code
[PATCH] Use string instructions for Core2 copy/clear
[PATCH] x86: - restore i8259A eoi status on resume
[PATCH] i386: Split multi-line printk in oops output.
...
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: (47 commits)
Driver core: Don't call put methods while holding a spinlock
Driver core: Remove unneeded routines from driver core
Driver core: Fix potential deadlock in driver core
PCI: enable driver multi-threaded probe
Driver Core: add ability for drivers to do a threaded probe
sysfs: add proper sysfs_init() prototype
drivers/base: check errors
drivers/base: Platform notify needs to occur before drivers attach to the device
v4l-dev2: handle __must_check
add CONFIG_ENABLE_MUST_CHECK
add __must_check to device management code
Driver core: fixed add_bind_files() definition
Driver core: fix comments in drivers/base/power/resume.c
sysfs_remove_bin_file: no return value, dump_stack on error
kobject: must_check fixes
Driver core: add ability for devices to create and remove bin files
Class: add support for class interfaces for devices
Driver core: create devices/virtual/ tree
Driver core: add device_rename function
Driver core: add ability for classes to handle devices properly
...
This adds support for the Atmel AVR32 architecture as well as the AT32AP7000
CPU and the AT32STK1000 development board.
AVR32 is a new high-performance 32-bit RISC microprocessor core, designed for
cost-sensitive embedded applications, with particular emphasis on low power
consumption and high code density. The AVR32 architecture is not binary
compatible with earlier 8-bit AVR architectures.
The AVR32 architecture, including the instruction set, is described by the
AVR32 Architecture Manual, available from
http://www.atmel.com/dyn/resources/prod_documents/doc32000.pdf
The Atmel AT32AP7000 is the first CPU implementing the AVR32 architecture. It
features a 7-stage pipeline, 16KB instruction and data caches and a full
Memory Management Unit. It also comes with a large set of integrated
peripherals, many of which are shared with the AT91 ARM-based controllers from
Atmel.
Full data sheet is available from
http://www.atmel.com/dyn/resources/prod_documents/doc32003.pdf
while the CPU core implementation including caches and MMU is documented by
the AVR32 AP Technical Reference, available from
http://www.atmel.com/dyn/resources/prod_documents/doc32001.pdf
Information about the AT32STK1000 development board can be found at
http://www.atmel.com/dyn/products/tools_card.asp?tool_id=3918
including a BSP CD image with an earlier version of this patch, development
tools (binaries and source/patches) and a root filesystem image suitable for
booting from SD card.
Alternatively, there's a preliminary "getting started" guide available at
http://avr32linux.org/twiki/bin/view/Main/GettingStarted which provides links
to the sources and patches you will need in order to set up a cross-compiling
environment for avr32-linux.
This patch, as well as the other patches included with the BSP and the
toolchain patches, is actively supported by Atmel Corporation.
[dmccr@us.ibm.com: Fix more pxx_page macro locations]
[bunk@stusta.de: fix `make defconfig']
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Dave McCracken <dmccr@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Based on patch from David Rientjes <rientjes@google.com>, but
changed by AK.
Optimizes the 64-bit hamming weight for x86_64 processors assuming they
have fast multiplication. Uses five fewer bitops than the generic
hweight64. Benchmark on one EMT64 showed ~25% speedup with 2^24
consecutive calls.
Define a new ARCH_HAS_FAST_MULTIPLIER that can be set by other
architectures that can also multiply fast.
Signed-off-by: Andi Kleen <ak@suse.de>
The klist utility routines currently call _put methods while holding a
spinlock. This is of course illegal; a put routine could try to
unregister a device and hence need to sleep.
No problems have arisen until now because in many cases klist removals
were done synchronously, so the _put methods were never actually used.
In other cases we may simply have been lucky.
This patch (as784) reworks the klist routines so that _put methods are
called only _after_ the klist's spinlock has been released.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Those 1500 warnings can be a bit of a pain. Add a config option to shut them
up.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
several targets have no ....at() family and m32r calls its only chown variant
chown32(), with __NR_chown being undefined. creat(2) is also absent in some
targets.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Take default arch/*/kernel/audit.c to lib/, have those with special
needs (== biarch) define AUDIT_ARCH in their Kconfig.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The pattern is set after trying to compute the prefix table, which tries
to use it. Initialize it before calling compute_prefix_tbl, make
compute_prefix_tbl consistently use only the data from struct ts_bm
and remove the now unnecessary arguments.
Signed-off-by: Michael Rash <mbr@cipherdyne.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
We've confirmed that the debug version of write_lock() can get stuck for long
enough to cause NMI watchdog timeouts and hence a crash.
We don't know why, yet. Disable it for now.
Also disable the similar read_lock() code. Just in case.
Thanks to Dave Olson <olson@unixfolk.com> for reporting and testing.
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove uevent dock notifications. There are no consumers
of these events at present, and uevents are likely not the
correct way to send this type of event anyway.
Until I get some kind of idea if anyone in userspace cares
about dock events, I will just not send any.
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The recent zlib update (commit 4f3865fb57)
broke ppc32 zImage decompression as it tries to decompress to address zero
and the updated zlib_inflate checks that strm->next_out isn't a null
pointer.
This little patch fixes it.
[rpurdie@rpsys.net: add comment]
Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk>
Acked-by: Tom Rini <trini@kernel.crashing.org>
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The lockdep options should depend on DEBUG_KERNEL since:
- they are kernel debugging options and
- they do otherwise break the DEBUG_KERNEL menu structure
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Currently, the code in lib/idr.c uses a bare spin_lock(&idp->lock) to do
internal locking. This is a nasty trap for code that might call idr
functions from different contexts; for example, it seems perfectly
reasonable to call idr_get_new() from process context and idr_remove() from
interrupt context -- but with the current locking this would lead to a
potential deadlock.
The simplest fix for this is to just convert the idr locking to use
spin_lock_irqsave().
In particular, this fixes a very complicated locking issue detected by
lockdep, involving the ib_ipoib driver's priv->lock and dev->_xmit_lock,
which get involved with the ib_sa module's query_idr.lock.
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Zach Brown <zach.brown@oracle.com>,
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Offer the following lock validation options:
CONFIG_PROVE_LOCKING
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use the lock validator framework to prove spinlock and rwlock locking
correctness.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use the lock validator framework to prove rwsem locking correctness.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
From: Ingo Molnar <mingo@elte.hu>
lockdep so far only allowed read-recursion for the same lock instance.
This is enough in the overwhelming majority of cases, but a hostap case
triggered and reported by Miles Lane relies on same-class
different-instance recursion. So we relax the restriction on read-lock
recursion.
(This change does not allow rwsem read-recursion, which is still
forbidden.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Do 'make oldconfig' and accept all the defaults for new config options -
reboot into the kernel and if everything goes well it should boot up fine and
you should have /proc/lockdep and /proc/lockdep_stats files.
Typically if the lock validator finds some problem it will print out
voluminous debug output that begins with "BUG: ..." and which syslog output
can be used by kernel developers to figure out the precise locking scenario.
What does the lock validator do? It "observes" and maps all locking rules as
they occur dynamically (as triggered by the kernel's natural use of spinlocks,
rwlocks, mutexes and rwsems). Whenever the lock validator subsystem detects a
new locking scenario, it validates this new rule against the existing set of
rules. If this new rule is consistent with the existing set of rules then the
new rule is added transparently and the kernel continues as normal. If the
new rule could create a deadlock scenario then this condition is printed out.
When determining validity of locking, all possible "deadlock scenarios" are
considered: assuming arbitrary number of CPUs, arbitrary irq context and task
context constellations, running arbitrary combinations of all the existing
locking scenarios. In a typical system this means millions of separate
scenarios. This is why we call it a "locking correctness" validator - for all
rules that are observed the lock validator proves it with mathematical
certainty that a deadlock could not occur (assuming that the lock validator
implementation itself is correct and its internal data structures are not
corrupted by some other kernel subsystem). [see more details and conditionals
of this statement in include/linux/lockdep.h and
Documentation/lockdep-design.txt]
Furthermore, this "all possible scenarios" property of the validator also
enables the finding of complex, highly unlikely multi-CPU multi-context races
via single single-context rules, increasing the likelyhood of finding bugs
drastically. In practical terms: the lock validator already found a bug in
the upstream kernel that could only occur on systems with 3 or more CPUs, and
which needed 3 very unlikely code sequences to occur at once on the 3 CPUs.
That bug was found and reported on a single-CPU system (!). So in essence a
race will be found "piecemail-wise", triggering all the necessary components
for the race, without having to reproduce the race scenario itself! In its
short existence the lock validator found and reported many bugs before they
actually caused a real deadlock.
To further increase the efficiency of the validator, the mapping is not per
"lock instance", but per "lock-class". For example, all struct inode objects
in the kernel have inode->inotify_mutex. If there are 10,000 inodes cached,
then there are 10,000 lock objects. But ->inotify_mutex is a single "lock
type", and all locking activities that occur against ->inotify_mutex are
"unified" into this single lock-class. The advantage of the lock-class
approach is that all historical ->inotify_mutex uses are mapped into a single
(and as narrow as possible) set of locking rules - regardless of how many
different tasks or inode structures it took to build this set of rules. The
set of rules persist during the lifetime of the kernel.
To see the rough magnitude of checking that the lock validator does, here's a
portion of /proc/lockdep_stats, fresh after bootup:
lock-classes: 694 [max: 2048]
direct dependencies: 1598 [max: 8192]
indirect dependencies: 17896
all direct dependencies: 16206
dependency chains: 1910 [max: 8192]
in-hardirq chains: 17
in-softirq chains: 105
in-process chains: 1065
stack-trace entries: 38761 [max: 131072]
combined max dependencies: 2033928
hardirq-safe locks: 24
hardirq-unsafe locks: 176
softirq-safe locks: 53
softirq-unsafe locks: 137
irq-safe locks: 59
irq-unsafe locks: 176
The lock validator has observed 1598 actual single-thread locking patterns,
and has validated all possible 2033928 distinct locking scenarios.
More details about the design of the lock validator can be found in
Documentation/lockdep-design.txt, which can also found at:
http://redhat.com/~mingo/lockdep-patches/lockdep-design.txt
[bunk@stusta.de: cleanups]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Introduce DEBUG_LOCKING_API_SELFTESTS, which uses the generic lock debugging
code's silent-failure feature to run a matrix of testcases. There are 210
testcases currently:
+-----------------------
| Locking API testsuite:
+------------------------------+------+------+------+------+------+------+
| spin |wlock |rlock |mutex | wsem | rsem |
-------------------------------+------+------+------+------+------+------+
A-A deadlock: ok | ok | ok | ok | ok | ok |
A-B-B-A deadlock: ok | ok | ok | ok | ok | ok |
A-B-B-C-C-A deadlock: ok | ok | ok | ok | ok | ok |
A-B-C-A-B-C deadlock: ok | ok | ok | ok | ok | ok |
A-B-B-C-C-D-D-A deadlock: ok | ok | ok | ok | ok | ok |
A-B-C-D-B-D-D-A deadlock: ok | ok | ok | ok | ok | ok |
A-B-C-D-B-C-D-A deadlock: ok | ok | ok | ok | ok | ok |
double unlock: ok | ok | ok | ok | ok | ok |
bad unlock order: ok | ok | ok | ok | ok | ok |
--------------------------------------+------+------+------+------+------+
recursive read-lock: | ok | | ok |
--------------------------------------+------+------+------+------+------+
non-nested unlock: ok | ok | ok | ok |
--------------------------------------+------+------+------+
hard-irqs-on + irq-safe-A/12: ok | ok | ok |
soft-irqs-on + irq-safe-A/12: ok | ok | ok |
hard-irqs-on + irq-safe-A/21: ok | ok | ok |
soft-irqs-on + irq-safe-A/21: ok | ok | ok |
sirq-safe-A => hirqs-on/12: ok | ok | ok |
sirq-safe-A => hirqs-on/21: ok | ok | ok |
hard-safe-A + irqs-on/12: ok | ok | ok |
soft-safe-A + irqs-on/12: ok | ok | ok |
hard-safe-A + irqs-on/21: ok | ok | ok |
soft-safe-A + irqs-on/21: ok | ok | ok |
hard-safe-A + unsafe-B #1/123: ok | ok | ok |
soft-safe-A + unsafe-B #1/123: ok | ok | ok |
hard-safe-A + unsafe-B #1/132: ok | ok | ok |
soft-safe-A + unsafe-B #1/132: ok | ok | ok |
hard-safe-A + unsafe-B #1/213: ok | ok | ok |
soft-safe-A + unsafe-B #1/213: ok | ok | ok |
hard-safe-A + unsafe-B #1/231: ok | ok | ok |
soft-safe-A + unsafe-B #1/231: ok | ok | ok |
hard-safe-A + unsafe-B #1/312: ok | ok | ok |
soft-safe-A + unsafe-B #1/312: ok | ok | ok |
hard-safe-A + unsafe-B #1/321: ok | ok | ok |
soft-safe-A + unsafe-B #1/321: ok | ok | ok |
hard-safe-A + unsafe-B #2/123: ok | ok | ok |
soft-safe-A + unsafe-B #2/123: ok | ok | ok |
hard-safe-A + unsafe-B #2/132: ok | ok | ok |
soft-safe-A + unsafe-B #2/132: ok | ok | ok |
hard-safe-A + unsafe-B #2/213: ok | ok | ok |
soft-safe-A + unsafe-B #2/213: ok | ok | ok |
hard-safe-A + unsafe-B #2/231: ok | ok | ok |
soft-safe-A + unsafe-B #2/231: ok | ok | ok |
hard-safe-A + unsafe-B #2/312: ok | ok | ok |
soft-safe-A + unsafe-B #2/312: ok | ok | ok |
hard-safe-A + unsafe-B #2/321: ok | ok | ok |
soft-safe-A + unsafe-B #2/321: ok | ok | ok |
hard-irq lock-inversion/123: ok | ok | ok |
soft-irq lock-inversion/123: ok | ok | ok |
hard-irq lock-inversion/132: ok | ok | ok |
soft-irq lock-inversion/132: ok | ok | ok |
hard-irq lock-inversion/213: ok | ok | ok |
soft-irq lock-inversion/213: ok | ok | ok |
hard-irq lock-inversion/231: ok | ok | ok |
soft-irq lock-inversion/231: ok | ok | ok |
hard-irq lock-inversion/312: ok | ok | ok |
soft-irq lock-inversion/312: ok | ok | ok |
hard-irq lock-inversion/321: ok | ok | ok |
soft-irq lock-inversion/321: ok | ok | ok |
hard-irq read-recursion/123: ok |
soft-irq read-recursion/123: ok |
hard-irq read-recursion/132: ok |
soft-irq read-recursion/132: ok |
hard-irq read-recursion/213: ok |
soft-irq read-recursion/213: ok |
hard-irq read-recursion/231: ok |
soft-irq read-recursion/231: ok |
hard-irq read-recursion/312: ok |
soft-irq read-recursion/312: ok |
hard-irq read-recursion/321: ok |
soft-irq read-recursion/321: ok |
--------------------------------+-----+----------------
Good, all 210 testcases passed! |
--------------------------------+
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
CONFIG_FRAME_POINTER support for s390.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Framework to generate and save stacktraces quickly, without printing anything
to the console.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Generic lock debugging:
- generalized lock debugging framework. For example, a bug in one lock
subsystem turns off debugging in all lock subsystems.
- got rid of the caller address passing (__IP__/__IP_DECL__/etc.) from
the mutex/rtmutex debugging code: it caused way too much prototype
hackery, and lockdep will give the same information anyway.
- ability to do silent tests
- check lock freeing in vfree too.
- more finegrained debugging options, to allow distributions to
turn off more expensive debugging features.
There's no separate 'held mutexes' list anymore - but there's a 'held locks'
stack within lockdep, which unifies deadlock detection across all lock
classes. (this is independent of the lockdep validation stuff - lockdep first
checks whether we are holding a lock already)
Here are the current debugging options:
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_LOCK_ALLOC=y
which do:
config DEBUG_MUTEXES
bool "Mutex debugging, basic checks"
config DEBUG_LOCK_ALLOC
bool "Detect incorrect freeing of live mutexes"
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With the lock validator we detect mutex deadlocks (and more), the mutex
deadlock checking code is both redundant and slower. So remove it.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The recent vsnprintf() fix introduced an off-by-one, and it's now
possible to overrun the target buffer by one byte.
The "end" pointer points to past the end of the buffer, so if we
have to truncate the result, it needs to be done though "end[-1]".
[ This is just an alternate and simpler patch to one proposed by Andrew
and Jeremy, who actually noticed the problem ]
Acked-by: Andrew Morton <akpm@osdl.org>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Temporarily add EXPORT_UNUSED_SYMBOL and EXPORT_UNUSED_SYMBOL_GPL. These
will be used as a transition measure for symbols that aren't used in the
kernel and are on the way out. When a module uses such a symbol, a warning
is printk'd at modprobe time.
The main reason for removing unused exports is size: eacho export takes
roughly between 100 and 150 bytes of kernel space in the binary. This
patch gives users the option to immediately get this size gain via a config
option.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
RT-mutex tester: scriptable tester for rt mutexes, which allows userspace
scripting of mutex unit-tests (and dynamic tests as well), using the actual
rt-mutex implementation of the kernel.
[akpm@osdl.org: fixlet]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Runtime debugging functionality for rt-mutexes.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add the priority-sorted list (plist) implementation.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix function definitions to be ANSI-compliant:
lib/zlib_inflate/inffast.c:68:1: warning: non-ANSI definition of function 'inflate_fast'
lib/zlib_inflate/inftrees.c:33:1: warning: non-ANSI definition of function 'zlib_inflate_table'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
so that userspace can be notified of dock and undock events.
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Len Brown <len.brown@intel.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial:
typo fixes
Clean up 'inline is not at beginning' warnings for usb storage
Storage class should be first
i386: Trivial typo fixes
ixj: make ixj_set_tone_off() static
spelling fixes
fix paniced->panicked typos
Spelling fixes for Documentation/atomic_ops.txt
move acknowledgment for Mark Adler to CREDITS
remove the bouncing email address of David Campbell
These are the i386-specific pieces to enable reliable stack traces. This is
going to be even more useful once CFI annotations get added to he assembly
code, namely to entry.S.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
These are the x86_64-specific pieces to enable reliable stack traces. The
only restriction with this is that it currently cannot unwind across the
interrupt->normal stack boundary, as that transition is lacking proper
annotation.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
These are the generic bits needed to enable reliable stack traces based
on Dwarf2-like (.eh_frame) unwind information. Subsequent patches will
enable x86-64 and i386 to make use of this.
Thanks to Andi Kleen and Ingo Molnar, who pointed out several possibilities
for improvement.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch adds idr_replace() to replace an existing pointer in a single
operation.
Device-mapper will use this to update the pointer it stored against a given
id.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The place in the documentation of the Linux kernel to acknowledge
contributions is the CREDITS file.
Give Mark Adler an entry there instead of including a string in the
kernel image.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Implement kasprintf, a kernel version of asprintf. This allocates the
memory required for the formatted string, including the trailing '\0'.
Returns NULL on allocation failure.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This change allows callers to use a 0-byte buffer and a NULL buffer pointer
with vsnprintf, so it can be used to determine how large the resulting
formatted string will be.
Previously the code effectively treated a size of 0 as a size of 4G (on
32-bit systems), with other checks preventing it from actually trying to
emit the string - but the terminal \0 would still be written, which would
crash if the buffer is NULL.
This change changes the boundary check so that 'end' points to the putative
location of the terminal '\0', which is only written if size > 0.
vsnprintf still allows the buffer size to be set very large, to allow
unbounded buffer sizes (to implement sprintf, etc).
[akpm@osdl.org: fix long-vs-longlong confusion]
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make kernel-doc corrections & additions to lib/crc*.c. Add crc functions to
kernel-api.tmpl in DocBook.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make corrections/fixes to kernel-doc in lib/bitmap.c and include it in DocBook
template.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In radix_tree_tag_get(), return normalized value of 0/1, as indicated
by its comment.
Signed-off-by: Wu Fengguang <wfg@mail.ustc.edu.cn>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add a new strstrip() function to lib/string.c for removing leading and
trailing whitespace from a string.
Cc: Michael Holzheu <holzheu@de.ibm.com>
Acked-by: Ingo Oeser <ioe-lkml@rameria.de>
Acked-by: Joern Engel <joern@wohnheim.fh-wedel.de>
Cc: Corey Minyard <minyard@acm.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Michael Holzheu <HOLZHEU@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The percpu counter data type are changed in this set of patches to support
more users like ext3 who need more than 32 bit to store the free blocks
total in the filesystem.
- Generic perpcu counters data type changes. The size of the global counter
and local counter were explictly specified using s64 and s32. The global
counter is changed from long to s64, while the local counter is changed from
long to s32, so we could avoid doing 64 bit update in most cases.
- Users of the percpu counters are updated to make use of the new
percpu_counter_init() routine now taking an additional parameter to allow
users to pass the initial value of the global counter.
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The comment states: 'Setting a tag on a not-present item is a BUG.' Hence
if 'index' is larger than the maxindex; the item _cannot_ be presen; it
should also be a BUG.
Also, this allows the following statement (assume a fresh tree):
radix_tree_tag_set(root, 16, 1);
to fail silently, but when preceded by:
radix_tree_insert(root, 32, item);
it would BUG, because the height has been extended by the insert.
In neither case was 16 present.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Reduce radix tree node memory usage by about a factor of 4 for small files
(< 64K). There are pointer traversal and memory usage costs for large
files with dense pagecache.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The ability to have height 0 radix trees (a direct pointer to the data item
rather than going through a full node->slot) quietly disappeared with
old-2.6-bkcvs commit ffee171812d51652f9ba284302d9e5c5cc14bdfd. On 64-bit
machines this causes nearly 600 bytes to be used for every <= 4K file in
pagecache.
Re-introduce this feature, root tags stored in spare ->gfp_mask bits.
Simplify radix_tree_delete's complex tag clearing arrangement (which would
become even more complex) by just falling back to tag clearing functions
(the pagecache radix-tree never uses this path anyway, so the icache
savings will mean it's actually a speedup).
On my 4GB G5, this saves 8MB RAM per kernel kernel source+object tree in
pagecache.
Pagecache lookup, insertion, and removal speed for small files will also be
improved.
This makes RCU radix tree harder, but it's worth it.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Modify the gen_pool allocator (lib/genalloc.c) to utilize a bitmap scheme
instead of the buddy scheme. The purpose of this change is to eliminate
the touching of the actual memory being allocated.
Since the change modifies the interface, a change to the uncached allocator
(arch/ia64/kernel/uncached.c) is also required.
Both Andrey Volkov and Jes Sorenson have expressed a desire that the
gen_pool allocator not write to the memory being managed. See the
following:
http://marc.theaimsgroup.com/?l=linux-kernel&m=113518602713125&w=2http://marc.theaimsgroup.com/?l=linux-kernel&m=113533568827916&w=2
Signed-off-by: Dean Nelson <dcn@sgi.com>
Cc: Andrey Volkov <avolkov@varma-el.com>
Acked-by: Jes Sorensen <jes@trained-monkey.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Upgrade the zlib_inflate implementation in the kernel from a patched
version 1.1.3/4 to a patched 1.2.3.
The code in the kernel is about seven years old and I noticed that the
external zlib library's inflate performance was significantly faster (~50%)
than the code in the kernel on ARM (and faster again on x86_32).
For comparison the newer deflate code is 20% slower on ARM and 50% slower
on x86_32 but gives an approx 1% compression ratio improvement. I don't
consider this to be an improvement for kernel use so have no plans to
change the zlib_deflate code.
Various changes have been made to the zlib code in the kernel, the most
significant being the extra functions/flush option used by ppp_deflate.
This update reimplements the features PPP needs to ensure it continues to
work.
This code has been tested on ARM under both JFFS2 (with zlib compression
enabled) and ppp_deflate and on x86_32. JFFS2 sees an approx. 10% real
world file read speed improvement.
This patch also removes ZLIB_VERSION as it no longer has a correct value.
We don't need version checks anyway as the kernel's module handling will
take care of that for us. This removal is also more in keeping with the
zlib author's wishes (http://www.zlib.net/zlib_faq.html#faq24) and I've
added something to the zlib.h header to note its a modified version.
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Acked-by: Joern Engel <joern@wh.fh-wedel.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Introduce __iowrite64_copy. It will be used by the Myri-10G Ethernet
driver to post requests to the NIC. This driver will be submitted soon.
__iowrite64_copy copies to I/O memory in units of 64 bits when possible (on
64 bit architectures). It reverts to __iowrite32_copy on 32 bit
architectures.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* git://git.infradead.org/~dwmw2/rbtree-2.6:
[RBTREE] Switch rb_colour() et al to en_US spelling of 'color' for consistency
Update UML kernel/physmem.c to use rb_parent() accessor macro
[RBTREE] Update hrtimers to use rb_parent() accessor macro.
[RBTREE] Add explicit alignment to sizeof(long) for struct rb_node.
[RBTREE] Merge colour and parent fields of struct rb_node.
[RBTREE] Remove dead code in rb_erase()
[RBTREE] Update JFFS2 to use rb_parent() accessor macro.
[RBTREE] Update eventpoll.c to use rb_parent() accessor macro.
[RBTREE] Update key.c to use rb_parent() accessor macro.
[RBTREE] Update ext3 to use rb_parent() accessor macro.
[RBTREE] Change rbtree off-tree marking in I/O schedulers.
[RBTREE] Add accessor macros for colour and parent fields of rb_node
Since rb_insert_color() is part of the _public_ API, while the others are
purely internal, switch to be consistent with that.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
People don't like released kernels yelling at them, no matter how real the
error might be. So only report it if CONFIG_KOBJECT_DEBUG is enabled.
Sent on request of Andrew Morton.
(akpm: should bring this back post-2.6.17)
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
For sparc32 we need R_SPARC_UA32 relocation support, for
sparc64 we need the handle R_SPARC_DISP32 relocations.
Based upon reports and initial patch by Martin Habets.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch contains the following possible cleanups:
- #if 0 the following unused global function:
- subsys_remove_file()
- remove the following unused EXPORT_SYMBOL's:
- kset_find_obj
- subsystem_init
- remove the following unused EXPORT_SYMBOL_GPL:
- kobject_add_dir
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This fixes a build error for various odd combinations of CONFIG_HOTPLUG
and CONFIG_NET.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Cc: Nigel Cunningham <ncunningham@cyclades.com>
Cc: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We only used a single bit for colour information, so having a whole
machine word of space allocated for it was a bit wasteful. Instead,
store it in the lowest bit of the 'parent' pointer, since that was
always going to be aligned anyway.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Observe rb_erase(), when the victim node 'old' has two children so
neither of the simple cases at the beginning are taken.
Observe that it effectively does an 'rb_next()' operation to find the
next (by value) node in the tree. That is; we go to the victim's
right-hand child and then follow left-hand pointers all the way
down the tree as far as we can until we find the next node 'node'. We
end up with 'node' being either the same immediate right-hand child of
'old', or one of its descendants on the far left-hand side.
For a start, we _know_ that 'node' has a parent. We can drop that check.
We also know that if 'node's parent is 'old', then 'node' is the
right-hand child of its parent. And that if 'node's parent is _not_
'old', then 'node' is the left-hand child of its parent.
So instead of checking for 'node->rb_parent == old' in one place and
also checking 'node's heritage separately when we're trying to change
its link from its parent, we can shuffle things around a bit and do
it like this...
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
DEBUG_MUTEX flag is on by default in current kernel configuration.
During performance testing, we saw mutex debug functions like
mutex_debug_check_no_locks_freed (called by kfree()) is expensive as it
goes through a global list of memory areas with mutex lock and do the
checking. For benchmarks such as Volanomark and Hackbench, we have seen
more than 40% drop in performance on some platforms. We suggest to set
DEBUG_MUTEX off by default. Or at least do that later when we feel that
the mutex changes in the current code have stabilized.
Signed-off-by: Tim Chen <tim.c.chen@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It works like this:
Open the file
Read all the contents.
Call poll requesting POLLERR or POLLPRI (so select/exceptfds works)
When poll returns,
close the file and go to top of loop.
or lseek to start of file and go back to the 'read'.
Events are signaled by an object manager calling
sysfs_notify(kobj, dir, attr);
If the dir is non-NULL, it is used to find a subdirectory which
contains the attribute (presumably created by sysfs_create_group).
This has a cost of one int per attribute, one wait_queuehead per kobject,
one int per open file.
The name "sysfs_notify" may be confused with the inotify
functionality. Maybe it would be nice to support inotify for sysfs
attributes as well?
This patch also uses sysfs_notify to allow /sys/block/md*/md/sync_action
to be pollable
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Some string functions were safely overrideable in lib/string.c, but their
corresponding declarations in linux/string.h were not. Correct this, and
make strcspn overrideable.
Odds of someone wanting to do optimized assembly of these are small, but
for the sake of cleanliness, might as well bring them into line with the
rest of the file.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
While cleaning up parisc_ksyms.c earlier, I noticed that strpbrk wasn't
being exported from lib/string.c. Investigating further, I noticed a
changeset that removed its export and added it to _ksyms.c on a few more
architectures. The justification was that "other arches do it."
I think this is wrong, since no architecture currently defines
__HAVE_ARCH_STRPBRK, there's no reason for any of them to be exporting it
themselves. Therefore, consolidate the export to lib/string.c.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove CONFIG_DEBUG_IOREMAP, it's now obsolete and won't work anyway.
Remove it from lib/KConfig since it was only available on parisc.
Signed-off-by: Helge Deller <deller@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
<linux@horizon.com> wrote:
This is an extremely well-known technique. You can see a similar version that
uses a multiply for the last few steps at
http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel whch
refers to "Software Optimization Guide for AMD Athlon 64 and Opteron
Processors"
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25112.PDF
It's section 8.6, "Efficient Implementation of Population-Count Function in
32-bit Mode", pages 179-180.
It uses the name that I am more familiar with, "popcount" (population count),
although "Hamming weight" also makes sense.
Anyway, the proof of correctness proceeds as follows:
b = a - ((a >> 1) & 0x55555555);
c = (b & 0x33333333) + ((b >> 2) & 0x33333333);
d = (c + (c >> 4)) & 0x0f0f0f0f;
#if SLOW_MULTIPLY
e = d + (d >> 8)
f = e + (e >> 16);
return f & 63;
#else
/* Useful if multiply takes at most 4 cycles */
return (d * 0x01010101) >> 24;
#endif
The input value a can be thought of as 32 1-bit fields each holding their own
hamming weight. Now look at it as 16 2-bit fields. Each 2-bit field a1..a0
has the value 2*a1 + a0. This can be converted into the hamming weight of the
2-bit field a1+a0 by subtracting a1.
That's what the (a >> 1) & mask subtraction does. Since there can be no
borrows, you can just do it all at once.
Enumerating the 4 possible cases:
0b00 = 0 -> 0 - 0 = 0
0b01 = 1 -> 1 - 0 = 1
0b10 = 2 -> 2 - 1 = 1
0b11 = 3 -> 3 - 1 = 2
The next step consists of breaking up b (made of 16 2-bir fields) into
even and odd halves and adding them into 4-bit fields. Since the largest
possible sum is 2+2 = 4, which will not fit into a 4-bit field, the 2-bit
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
"which will not fit into a 2-bit field"
fields have to be masked before they are added.
After this point, the masking can be delayed. Each 4-bit field holds a
population count from 0..4, taking at most 3 bits. These numbers can be added
without overflowing a 4-bit field, so we can compute c + (c >> 4), and only
then mask off the unwanted bits.
This produces d, a number of 4 8-bit fields, each in the range 0..8. From
this point, we can shift and add d multiple times without overflowing an 8-bit
field, and only do a final mask at the end.
The number to mask with has to be at least 63 (so that 32 on't be truncated),
but can also be 128 or 255. The x86 has a special encoding for signed
immediate byte values -128..127, so the value of 255 is slower. On other
processors, a special "sign extend byte" instruction might be faster.
On a processor with fast integer multiplies (Athlon but not P4), you can
reduce the final few serially dependent instructions to a single integer
multiply. Consider d to be 3 8-bit values d3, d2, d1 and d0, each in the
range 0..8. The multiply forms the partial products:
d3 d2 d1 d0
d3 d2 d1 d0
d3 d2 d1 d0
+ d3 d2 d1 d0
----------------------
e3 e2 e1 e0
Where e3 = d3 + d2 + d1 + d0. e2, e1 and e0 obviously cannot generate
any carries.
Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
By defining generic hweight*() routines
- hweight64() will be defined on all architectures
- hweight_long() will use architecture optimized hweight32() or hweight64()
I found two possible cleanups by these reasons.
Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch introduces the C-language equivalents of the functions below:
int ext2_set_bit(int nr, volatile unsigned long *addr);
int ext2_clear_bit(int nr, volatile unsigned long *addr);
int ext2_test_bit(int nr, const volatile unsigned long *addr);
unsigned long ext2_find_first_zero_bit(const unsigned long *addr,
unsigned long size);
unsinged long ext2_find_next_zero_bit(const unsigned long *addr,
unsigned long size);
In include/asm-generic/bitops/ext2-non-atomic.h
This code largely copied from:
include/asm-powerpc/bitops.h
include/asm-parisc/bitops.h
Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch introduces the C-language equivalents of the functions below:
unsigned int hweight32(unsigned int w);
unsigned int hweight16(unsigned int w);
unsigned int hweight8(unsigned int w);
unsigned long hweight64(__u64 w);
In include/asm-generic/bitops/hweight.h
This code largely copied from: include/linux/bitops.h
Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch introduces the C-language equivalents of the functions below:
unsigned logn find_next_bit(const unsigned long *addr, unsigned long size,
unsigned long offset);
unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
unsigned long offset);
unsigned long find_first_zero_bit(const unsigned long *addr,
unsigned long size);
unsigned long find_first_bit(const unsigned long *addr, unsigned long size);
In include/asm-generic/bitops/find.h
This code largely copied from: arch/powerpc/lib/bitops.c
Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
DEBUG_KERNEL is often enabled just for sysrq, but this doesn't
mean the user wants more heavyweight debugging information.
Cc: jbeulich@novell.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (21 commits)
BUG_ON() Conversion in drivers/video/
BUG_ON() Conversion in drivers/parisc/
BUG_ON() Conversion in drivers/block/
BUG_ON() Conversion in sound/sparc/cs4231.c
BUG_ON() Conversion in drivers/s390/block/dasd.c
BUG_ON() Conversion in lib/swiotlb.c
BUG_ON() Conversion in kernel/cpu.c
BUG_ON() Conversion in ipc/msg.c
BUG_ON() Conversion in block/elevator.c
BUG_ON() Conversion in fs/coda/
BUG_ON() Conversion in fs/binfmt_elf_fdpic.c
BUG_ON() Conversion in input/serio/hil_mlc.c
BUG_ON() Conversion in md/dm-hw-handler.c
BUG_ON() Conversion in md/bitmap.c
The comment describing how MS_ASYNC works in msync.c is confusing
rcu: undeclared variable used in documentation
fix typos "wich" -> "which"
typo patch for fs/ufs/super.c
Fix simple typos
tabify drivers/char/Makefile
...
text data bss dec hex filename
before: 3605597 1363528 363328 5332453 515de5 vmlinux
after: 3605295 1363612 363200 5332107 515c8b vmlinux
218 bytes saved.
Also, optimise any_online_cpu() out of existence on CONFIG_SMP=n.
This function seems inefficient. Can't we simply AND the two masks, then use
find_first_bit()?
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Shrinks the only caller (net/bridge/netfilter/ebtables.c) by 174 bytes.
Also, optimise highest_possible_processor_id() out of existence on
CONFIG_SMP=n.
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Documentation changes to help radix tree users avoid overrunning the tags
array. RADIX_TREE_TAGS moves to linux/radix-tree.h and is now known as
RADIX_TREE_MAX_TAGS (Nick Piggin's idea). Tag parameters are changed to
unsigned, and some comments are updated.
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The Kconfig text for CONFIG_DEBUG_SLAB and CONFIG_DEBUG_PAGEALLOC have always
seemed a bit confusing. Change them to:
CONFIG_DEBUG_SLAB: "Debug slab memory allocations"
CONFIG_DEBUG_PAGEALLOC: "Debug page memory allocations"
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
this changes if() BUG(); constructs to BUG_ON() which is
cleaner, contains unlikely() and can better optimized away.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
As a foundation for reliable stack unwinding, this adds a config option
(available to all architectures except IA64 and those where the module
loader might have problems with the resulting relocations) to enable the
generation of frame unwind information.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Paul Mundt <lethal@linux-sh.org>,
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Restructure the bitmap_*_region() operations, to avoid code duplication.
Also reduces binary text size by about 100 bytes (ia64 arch). The original
Bottomley bitmap_*_region patch added about 1000 bytes of compiled kernel text
(ia64). The Mundt multiword extension added another 600 bytes, and this
restructuring patch gets back about 100 bytes.
But the real motivation was the reduced amount of duplicated code.
Tested by Paul Mundt using <= BITS_PER_LONG as well as power of
2 aligned multiword spanning allocations.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add support to the lib/bitmap.c bitmap_*_region() routines
For bitmap regions larger than one word (nbits > BITS_PER_LONG). This removes
a BUG_ON() in lib bitmap.
I have an updated store queue API for SH that is currently using this with
relative success, and at first glance, it seems like this could be useful for
x86 (arch/i386/kernel/pci-dma.c) as well. Particularly for anything using
dma_declare_coherent_memory() on large areas and that attempts to allocate
large buffers from that space.
Paul Jackson also did some cleanup to this patch.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Paul Mundt <lethal@linux-sh.org> says:
This patch set implements a number of patches to clean up and restructure the
bitmap region code, in addition to extending the interface to support
multiword spanning allocations.
The current implementation (before this patch set) is limited by only being
able to allocate pages <= BITS_PER_LONG, as noted by the strategically
positioned BUG_ON() at lib/bitmap.c:752:
/* We don't do regions of pages > BITS_PER_LONG. The
* algorithm would be a simple look for multiple zeros in the
* array, but there's no driver today that needs this. If you
* trip this BUG(), you get to code it... */
BUG_ON(pages > BITS_PER_LONG);
As I seem to have been the first person to trigger this, the result ends up
being the following patch set with the help of Paul Jackson.
The final patch in the series eliminates quite a bit of code duplication, so
the bitmap code size ends up being smaller than the current implementation as
an added bonus.
After these are applied, it should already be possible to do multiword
allocations with dma_alloc_coherent() out of ranges established by
dma_declare_coherent_memory() on x86 without having to change any of the code,
and the SH store queue API will follow up on this as the other user that needs
support for this.
This patch:
Some code cleanup on the lib/bitmap.c bitmap_*_region() routines:
* spacing
* variable names
* comments
Has no change to code function.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Semaphore to mutex conversion.
The conversion was generated via scripts, and the result was validated
automatically via a script as well.
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Sam's tree includes a new check, which found that we're exporting strpbrk()
multiple times.
It seems that the convention is that this is exported from the arch files, so
reove the lib/string.c export.
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: David Howells <dhowells@redhat.com>
Cc: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Adding kobject_add_dir() function which creates a subdirectory
for a given kobject.
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Now that kobject_add() is used more than kobject_register() the kernel
wasn't always letting people know that they were doing something wrong.
This change fixes this.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Avoid an atomic operation in kref_put() when the last reference is
dropped. On most platforms, atomic_read() is a plan read of the counter
and involves no atomic at all.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Moving uevent_seqnum and uevent_helper to kobject_uevent.c
because they are used even if CONFIG_SYSFS=n
while kernel/ksysfs.c is built only if CONFIG_SYSFS=y,
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This change reverts the 033b96fd30 commit
from Kay Sievers that removed the mount/umount uevents from the kernel.
Some older versions of HAL still depend on these events to detect when a
new device has been mounted. These events are not correctly emitted,
and are broken by design, and so, should not be relied upon by any
future program. Instead, the /proc/mounts file should be polled to
properly detect this kind of event.
A feature-removal-schedule.txt entry has been added, noting when this
interface will be removed from the kernel.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If a tag is set for a node being deleted from a radix_tree, then that
tag gets cleared from the parent of the node, even if it is set for some
siblings of the node begin deleted.
This patch changes the logic to include a test for any_tag_set similar
to the logic a little futher down. Care is taken to ensure that
'nr_cleared_tags' remains equals to the number of entries in the 'tags'
array which are set to '0' (which means that this tag is not set in the
tree below pathp->node, and should be cleared at pathp->node and
possibly above.
[ Nick says: "Linus FYI, I was able to modify the radix tree test
harness to catch the bug and can no longer trigger it after the fix.
Resulting code passes all other harness tests as well of course." ]
Signed-off-by: Neil Brown <neilb@suse.de>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch removes all self references and fixes references to files
in the now defunct arch/ppc64 tree. I think this accomplises
everything wanted, though there might be a few references I missed.
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The spinlock-debug wait-loop was using loops_per_jiffy to detect too long
spinlock waits - but on fast CPUs this led to a way too fast timeout and false
messages.
The fix is to include a __delay(1) call in the loop, to correctly approximate
the intended delay timeout of 1 second. The code assumes that every
architecture implements __delay(1) to last around 1/(loops_per_jiffy*HZ)
seconds.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The buffer used for kobject uevent is too small for some of the events generated
by the input layer. Bump it to 2k.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
kobject_get_path() will oops if one of the component names is
NULL. Fix that by returning NULL instead of oopsing.
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
So we might as well check to verify this, and let the user know that
something is wrong if they didn't do it correctly, instead of oopsing
later on in kobject_get_name() or somewhere else.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The implementation of int_sqrt() assumes that longs have 32 bits. On
systems that have 64 bit longs this will result in gross errors when the
argument to the function is greater than 2^32 - 1 on such systems. I doubt
whether any such use is currently made of int_sqrt() but the attached patch
fixes the problem anyway.
Signed-off-by: Peter Williams <pwil3058@bigpond.com.au>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The current logic does not calculate correctly the good shift array:
Let x be the pattern that is being searched. Let y be the block of data.
The good shift array aligns the segment:
x[i+1 ... m-1] = y[i+j+1 ... j+m-1]
with its rightmost occurrence in x that fulfils x[i] neq y[i+j].
In previous version, the good shift array for the pattern ANPANMAN is:
[1, 8, 3, 8, 8, 8, 8, 8]
and should be:
[1, 8, 3, 6, 6, 6, 6, 6]
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This arch-independent routine copies data to a memory-mapped I/O region,
using 32-bit accesses. The naming is double-underscored to make it clear
that it does not guarantee write ordering, nor does it perform a memory
barrier afterwards; the kernel doc also explicitly states this. This style
of access is required by some devices.
This change also introduces include/linux/io.h, at Andrew's suggestion. It
only has one occupant at the moment, but is a logical destination for
oft-replicated contents of include/asm-*/{io,iomap}.h to migrate to.
Signed-off-by: Bryan O'Sullivan <bos@pathscale.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If optimizing for size (CONFIG_CC_OPTIMIZE_FOR_SIZE), allow gcc4 compilers
to decide what to inline and what not - instead of the kernel forcing gcc
to inline all the time. This requires several places that require to be
inlined to be marked as such, previous patches in this series do that.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
AK: I hacked Muli's original patch a lot and there were a lot
of changes - all bugs are probably to blame on me now.
There were also some changes in the fall back behaviour
for swiotlb - in particular it doesn't try to use GFP_DMA
now anymore. Also all DMA mapping operations use the
same core dma_alloc_coherent code with proper fallbacks now.
And various other changes and cleanups.
Known problems: iommu=force swiotlb=force together breaks
needs more testing.
This patch cleans up x86_64's DMA mapping dispatching code. Right now
we have three possible IOMMU types: AGP GART, swiotlb and nommu, and
in the future we will also have Xen's x86_64 swiotlb and other HW
IOMMUs for x86_64. In order to support all of them cleanly, this
patch:
- introduces a struct dma_mapping_ops with function pointers for each
of the DMA mapping operations of gart (AMD HW IOMMU), swiotlb
(software IOMMU) and nommu (no IOMMU).
- gets rid of:
if (swiotlb)
return swiotlb_xxx();
- PCI_DMA_BUS_IS_PHYS is now checked against the dma_ops being set
This makes swiotlb faster by avoiding double copying in some cases.
Signed-Off-By: Muli Ben-Yehuda <mulix@mulix.org>
Signed-Off-By: Jon D. Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I know several people using MAGIC_SYSRQ not for kernel debugging but for
trying to do a halfway normal shutdown in case of problems.
Since there's no technical reason why MAGIC_SYSRQ would have to depend on
DEBUG_KERNEL, I'm therefore suggesting to drop this dependency.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Convert atomic_dec_and_lock to use new atomic primitives.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix the default behaviour for the remap operators in bitmap, cpumask and
nodemask.
As previously submitted, the pair of masks <A, B> defined a map of the
positions of the set bits in A to the corresponding bits in B. This is still
true.
The issue is how to map the other positions, corresponding to the unset (0)
bits in A. As previously submitted, they were all mapped to the first set bit
position in B, a constant map.
When I tried to code per-vma mempolicy rebinding using these remap operators,
I realized this was wrong.
This patch changes the default to map all the unset bit positions in A to the
same positions in B, the identity map.
For example, if A has bits 4-7 set, and B has bits 9-12 set, then the map
defined by the pair <A, B> maps each bit position in the first 32 bits as
follows:
0 ==> 0
...
3 ==> 3
4 ==> 9
...
7 ==> 12
8 ==> 8
9 ==> 9
...
31 ==> 31
This now corresponds to the typical behaviour desired when migrating pages and
policies from one cpuset to another.
The pages on nodes within the original cpuset, and the references in memory
policies to nodes within the original cpuset, are migrated to the
corresponding cpuset-relative nodes in the destination cpuset. Other pages
and node references are left untouched.
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Shrink the height of a radix tree when it is partially truncated - we only do
shrinkage of full truncation at present.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Correctly determine the tags to be cleared in radix_tree_delete() so we
don't keep moving up the tree clearing tags that we don't need to. For
example, if a tag is simply not set in the deleted item, nor anywhere up
the tree, radix_tree_delete() would attempt to clear it up the entire
height of the tree.
Also, tag_set() was made conditional so as not to dirty too many cachelines
high up in the radix tree. Instead, put this logic into
radix_tree_tag_set().
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Introduce helper any_tag_set() rather than repeat the same code sequence 4
times.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Export a number of features required to build all the modules. It also
implements the following simple features:
(*) csum_partial_copy_from_user() for MMU as well as no-MMU.
(*) __ucmpdi2().
so that they can be exported too.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Sanitize some s390 Kconfig options. We have ARCH_S390, ARCH_S390X,
ARCH_S390_31, 64BIT, S390_SUPPORT and COMPAT. Replace these 6 options by
S390, 64BIT and COMPAT.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch cleans up the alloc_bootmem fix for swiotlb. Patch removes
alloc_bootmem_*_limit api and fixes alloc_boot_*low api to do the right
thing -- allocate from low32 memory.
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
bad_range is supposed to be a temporary check. It would be a pity to throw it
out. Make it depend on CONFIG_DEBUG_VM instead.
CONFIG_HOLES_IN_ZONE systems were relying on this to check pfn_valid in the
page allocator. Add that to page_is_buddy instead.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
lib/lib.a(kobject_uevent.o)(.text+0x25f): In function `kobject_uevent':
: undefined reference to `__alloc_skb'
lib/lib.a(kobject_uevent.o)(.text+0x2a1): In function `kobject_uevent':
: undefined reference to `skb_over_panic'
lib/lib.a(kobject_uevent.o)(.text+0x31d): In function `kobject_uevent':
: undefined reference to `skb_over_panic'
lib/lib.a(kobject_uevent.o)(.text+0x356): In function `kobject_uevent':
: undefined reference to `netlink_broadcast'
lib/lib.a(kobject_uevent.o)(.init.text+0x9): In function `kobject_uevent_init':
: undefined reference to `netlink_kernel_create'
make: *** [.tmp_vmlinux1] Error 1
Netlink is unconditionally enabled if CONFIG_NET, so that's OK.
kobject_uevent.o is compiled even if !CONFIG_HOTPLUG, which is lazy.
Let's compound the sin.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The klist reference counting in the find functions that use
klist_iter_init_node is broken. If the function (for example
driver_find_device) is called with a NULL start object then everything is
fine, the first call to next_device()/klist_next increases the ref-count of
the first node on the list and does nothing for the start object which is
NULL.
If they are called with a valid start object then klist_next will decrement
the ref-count for the start object but nobody has incremented it. Logical
place to fix this would be klist_iter_init_node because the function puts a
reference of the object into the klist_iter struct.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Frank Pavlic <pavlic@de.ibm.com>
Cc: Patrick Mochel <mochel@digitalimplant.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Leave the overloaded "hotplug" word to susbsystems which are handling
real devices. The driver core does not "plug" anything, it just exports
the state to userspace and generates events.
Signed-off-by: Kay Sievers <kay.sievers@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The distinction between hotplug and uevent does not make sense these
days, netlink events are the default.
udev depends entirely on netlink uevents. Only during early boot and
in initramfs, /sbin/hotplug is needed. So merge the two functions and
provide only one interface without all the options.
The netlink layer got a nice generic interface with named slots
recently, which is probably a better facility to plug events for
subsystem specific events.
Also the new poll() interface to /proc/mounts is a nicer way to
notify about changes than sending events through the core.
The uevents should only be used for driver core related requests to
userspace now.
Signed-off-by: Kay Sievers <kay.sievers@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The names of these events have been confusing from the beginning
on, as they have been more like claim/release events. We needed these
events for noticing HAL if storage devices have been mounted.
Thanks to Al, we have the proper solution now and can poll()
/proc/mounts instead to get notfied about mount tree changes.
Signed-off-by: Kay Sievers <kay.sievers@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
It makes zero sense to have hotplug, but not the netlink
events enabled today. Remove this option and merge the
kobject_uevent.h header into the kobject.h header file.
Signed-off-by: Kay Sievers <kay.sievers@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When a spinlock debugging check hits, we print the CPU number as an
informational thing - but there is no guarantee that preemption is off
at that point - hence we should use raw_smp_processor_id(). Otherwise
DEBUG_PREEMPT will print a warning.
With this fix the warning goes away and only the spinlock-debugging info
is printed.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The overflow checking condition in lib/swiotlb.c was wrong.
It would first run a NULL pointer through virt_to_phys before
testing it. Since pci_map_sg overflow is not that uncommon
and causes data corruption (including broken file systems) when not
properly detected I think it's better to fix it in 2.6.15.
This affects x86-64 and IA64.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
genalloc improperly stores the sizes of freed chunks, allocates overlapping
memory regions, and oopses after its in-band data is overwritten.
Signed-off-by: Chris Humbert <mahadri-kernel@drigon.com>
Cc: Jes Sorensen <jes@trained-monkey.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Reiser4 uses radix trees to solve a trouble reiser4_readdir has serving nfs
requests.
Unfortunately, radix tree api lacks an operation suitable for modifying
existing entry. This patch adds radix_tree_lookup_slot which returns pointer
to found item within the tree. That location can be then updated.
Both Nick and Christoph Lameter have patches which need this as well.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I recently picked up my older work to remove unnecessary #includes of
sched.h, starting from a patch by Dave Jones to not include sched.h
from module.h. This reduces the number of indirect includes of sched.h
by ~300. Another ~400 pointless direct includes can be removed after
this disentangling (patch to follow later).
However, quite a few indirect includes need to be fixed up for this.
In order to feed the patches through -mm with as little disturbance as
possible, I've split out the fixes I accumulated up to now (complete for
i386 and x86_64, more archs to follow later) and post them before the real
patch. This way this large part of the patch is kept simple with only
adding #includes, and all hunks are independent of each other. So if any
hunk rejects or gets in the way of other patches, just drop it. My scripts
will pick it up again in the next round.
Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A couple of (char *) casts removed in a previous cleanup patch in
lib/string.c:memmove() were actually useful, as they suppressed a couple of
warnings:
assignment discards qualifiers from pointer target type
Fix by declaring the local variable const in the first place, so casts
aren't needed to strip the const qualifier.
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch is a rewrite of the one submitted on October 1st, using modules
(http://marc.theaimsgroup.com/?l=linux-kernel&m=112819093522998&w=2).
This rewrite adds a tristate CONFIG_RCU_TORTURE_TEST, which enables an
intense torture test of the RCU infratructure. This is needed due to the
continued changes to the RCU infrastructure to accommodate dynamic ticks,
CPU hotplug, realtime, and so on. Most of the code is in a separate file
that is compiled only if the CONFIG variable is set. Documentation on how
to run the test and interpret the output is also included.
This code has been tested on i386 and ppc64, and an earlier version of the
code has received extensive testing on a number of architectures as part of
the PREEMPT_RT patchset.
Signed-off-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
They aren't used anywhere in that file.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix-up the CONFIG_FRAME_POINTER help text language a bit.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In the forthcoming task migration support, a key calculation will be
mapping cpu and node numbers from the old set to the new set while
preserving cpuset-relative offset.
For example, if a task and its pages on nodes 8-11 are being migrated to
nodes 24-27, then pages on node 9 (the 2nd node in the old set) should be
moved to node 25 (the 2nd node in the new set.)
As with other bitmap operations, the proper way to code this is to provide
the underlying calculation in lib/bitmap.c, and then to provide the usual
cpumask and nodemask wrappers.
This patch provides that. These operations are termed 'remap' operations.
Both remapping a single bit and a set of bits is supported.
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The first two hunks of the patch really belongs in patch 1, but I missed
them on the first pass and instead of redoing all 3 patches I stuck them in
this one.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Removes a few pointless register keywords. register is merely a compiler
hint that access to the variable should be optimized, but gcc (3.3.6 in my
case) generates the exact same code with and without the keyword, and even
if gcc did something different with register present I think it is doubtful
we would want to optimize access to these variables - especially since this
is generic library code and there are supposed to be optimized versions in
asm/ for anything that really matters speed wise.
(akpm: iirc, keyword register is a gcc no-op unless using -O0)
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Removes some blank lines, removes some trailing whitespace, adds spaces
after commas and a few similar changes.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add CONFIG_X86_32 for i386. This allows selecting options that only apply
to 32-bit systems.
(X86 && !X86_64) becomes X86_32
(X86 || X86_64) becomes X86
Signed-off-by: Brian Gerst <bgerst@didntduck.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There is no need to include module.h in inflate.c
Signed-off-by: Olaf Hering <olh@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Fix a bug which was reported and diagnosed by
Stefan Jones <stefan.jones@churchillrandoms.co.uk>
IDR trees include a cache of idr_layer objects. There's no way to destroy
this cache, so when we discard an overall idr tree we end up leaking some
memory.
Add and use idr_destroy() for this. v9fs and infiniband also need to use
idr_destroy() to avoid leaks.
Or, we make the cache global, like radix_tree_preload(). Which is probably
better. Later.
Cc: Eric Van Hensbergen <ericvh@ericvh.myip.org>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Robert Love <rml@novell.com>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
changes to swiotlb.c made in commit 281dd25cdc
since this file has been moved from arch/ia64/lib/swiotlb.c to
lib/swiotlb.c
Signed-off-by: Tony Luck <tony.luck@intel.com>
This still leaves driver and architecture-specific subdirectories alone,
but gets rid of the bulk of the "generic" generated files that we should
ignore.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- added typedef unsigned int __nocast gfp_t;
- replaced __nocast uses for gfp flags with gfp_t - it gives exactly
the same warnings as far as sparse is concerned, doesn't change
generated code (from gcc point of view we replaced unsigned int with
typedef) and documents what's going on far better.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix nocast sparse warnings:
include/linux/textsearch.h:165:57: warning: implicit cast to nocast type
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matthew Wilcox pointed out that swiotlb.c implements a generic
interface that is not tied to just PCI. Remove includes of
<linux/pci.h>, <asm/pci.h>. Fix comments and printk() messages
to no longer refer to PCI.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Change comment at top of swiotlb.c to reflect that the code is shared
with EM64T (i.e. Intel x86_64). Also add an entry for myself so that
if I "broke it", everyone knows who "bought it"... :-)
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The current implementation of sync_single in swiotlb.c chokes on
DMA_BIDIRECTIONAL mappings. This patch adds the capability to sync
those mappings, and optimizes other syncs by accounting for the
sync target (i.e. cpu or device) in addition to the DMA direction of
the mapping.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch implements swiotlb_sync_single_range_for_{cpu,device}. This
is intended to support an x86_64 implementation of
dma_sync_single_range_for_{cpu,device}.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The implementations of swiotlb_sync_single_for_{cpu,device} are
identical. Likewise for swiotlb_syng_sg_for_{cpu,device}. This patch
move the guts of those functions to two new inline functions, and
calls the appropriate one from the bodies of those functions.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The swiotlb implementation is shared by both IA-64 and EM64T. However,
the source itself lives under arch/ia64. This patch moves swiotlb.c
from arch/ia64/lib to lib/ and fixes-up the appropriate Makefile and
Kconfig files. No actual changes are made to swiotlb.c.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Several implementations were essentialy a common piece of C code using
the cmpxchg() macro. Put the implementation in one spot that everyone
can share, and convert sparc64 over to using this.
Alpha is the lone arch-specific implementation, which codes up a
special fast path for the common case in order to avoid GP reloading
which a pure C version would require.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch contains the following small cleanups:
- make two needlessly global functions static
- every file should #include the header files containing the prototypes
of it's global functions
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix the sparse warning "implicit cast to nocast type"
Signed-off-by: Victor Fusco <victor@cetuc.puc-rio.br>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch (written by me and also containing many suggestions of Arjan van
de Ven) does a major cleanup of the spinlock code. It does the following
things:
- consolidates and enhances the spinlock/rwlock debugging code
- simplifies the asm/spinlock.h files
- encapsulates the raw spinlock type and moves generic spinlock
features (such as ->break_lock) into the generic code.
- cleans up the spinlock code hierarchy to get rid of the spaghetti.
Most notably there's now only a single variant of the debugging code,
located in lib/spinlock_debug.c. (previously we had one SMP debugging
variant per architecture, plus a separate generic one for UP builds)
Also, i've enhanced the rwlock debugging facility, it will now track
write-owners. There is new spinlock-owner/CPU-tracking on SMP builds too.
All locks have lockup detection now, which will work for both soft and hard
spin/rwlock lockups.
The arch-level include files now only contain the minimally necessary
subset of the spinlock code - all the rest that can be generalized now
lives in the generic headers:
include/asm-i386/spinlock_types.h | 16
include/asm-x86_64/spinlock_types.h | 16
I have also split up the various spinlock variants into separate files,
making it easier to see which does what. The new layout is:
SMP | UP
----------------------------|-----------------------------------
asm/spinlock_types_smp.h | linux/spinlock_types_up.h
linux/spinlock_types.h | linux/spinlock_types.h
asm/spinlock_smp.h | linux/spinlock_up.h
linux/spinlock_api_smp.h | linux/spinlock_api_up.h
linux/spinlock.h | linux/spinlock.h
/*
* here's the role of the various spinlock/rwlock related include files:
*
* on SMP builds:
*
* asm/spinlock_types.h: contains the raw_spinlock_t/raw_rwlock_t and the
* initializers
*
* linux/spinlock_types.h:
* defines the generic type and initializers
*
* asm/spinlock.h: contains the __raw_spin_*()/etc. lowlevel
* implementations, mostly inline assembly code
*
* (also included on UP-debug builds:)
*
* linux/spinlock_api_smp.h:
* contains the prototypes for the _spin_*() APIs.
*
* linux/spinlock.h: builds the final spin_*() APIs.
*
* on UP builds:
*
* linux/spinlock_type_up.h:
* contains the generic, simplified UP spinlock type.
* (which is an empty structure on non-debug builds)
*
* linux/spinlock_types.h:
* defines the generic type and initializers
*
* linux/spinlock_up.h:
* contains the __raw_spin_*()/etc. version of UP
* builds. (which are NOPs on non-debug, non-preempt
* builds)
*
* (included on UP-non-debug builds:)
*
* linux/spinlock_api_up.h:
* builds the _spin_*() APIs.
*
* linux/spinlock.h: builds the final spin_*() APIs.
*/
All SMP and UP architectures are converted by this patch.
arm, i386, ia64, ppc, ppc64, s390/s390x, x64 was build-tested via
crosscompilers. m32r, mips, sh, sparc, have not been tested yet, but should
be mostly fine.
From: Grant Grundler <grundler@parisc-linux.org>
Booted and lightly tested on a500-44 (64-bit, SMP kernel, dual CPU).
Builds 32-bit SMP kernel (not booted or tested). I did not try to build
non-SMP kernels. That should be trivial to fix up later if necessary.
I converted bit ops atomic_hash lock to raw_spinlock_t. Doing so avoids
some ugly nesting of linux/*.h and asm/*.h files. Those particular locks
are well tested and contained entirely inside arch specific code. I do NOT
expect any new issues to arise with them.
If someone does ever need to use debug/metrics with them, then they will
need to unravel this hairball between spinlocks, atomic ops, and bit ops
that exist only because parisc has exactly one atomic instruction: LDCW
(load and clear word).
From: "Luck, Tony" <tony.luck@intel.com>
ia64 fix
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjanv@infradead.org>
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Cc: Matthew Wilcox <willy@debian.org>
Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Mikael Pettersson <mikpe@csd.uu.se>
Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add the crc16 routines, as used by w1 devices.
Signed-off-by: Ben Gardner <bgardner@wabtec.com>
Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The problem is that klists claim to provide semantics for safe traversal of
lists which are being modified. The failure case is when traversal of a
list causes element removal (a fairly common case). The issue is that
although the list node is refcounted, if it is embedded in an object (which
is universally the case), then the object will be freed regardless of the
klist refcount leading to slab corruption because the klist iterator refers
to the prior element to get the next.
The solution is to make the klist take and release references to the
embedding object meaning that the embedding object won't be released until
the list relinquishes the reference to it.
(akpm: fast-track this because it's needed for the 2.6.13 scsi merge)
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Simple patch to radix_tree_tag_get() to return different values for non
present node and tag unset.
The function is not used by any in-kernel callers (yet), but this
information is definitely useful.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- There is frequent use of indirections in the radix code. This patch
removes those indirections, makes the code more readable and allows
the compilers to generate better code.
- Removing indirections allows the removal of several casts.
- Removing indirections allows the reduction of the radix_tree_path
size from 3 to 2 words.
- Use pathp-> consistently.
- Remove unnecessary tmp variable in radix_tree_insert
- Separate the upper layer processing from the lowest layer in __lookup()
in order to make it easier to understand what is going on and allow
compilers to generate better code for the loop.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch adds a new kernel debug feature: CONFIG_DETECT_SOFTLOCKUP.
When enabled then per-CPU watchdog threads are started, which try to run
once per second. If they get delayed for more than 10 seconds then a
callback from the timer interrupt detects this condition and prints out a
warning message and a stack dump (once per lockup incident). The feature
is otherwise non-intrusive, it doesnt try to unlock the box in any way, it
only gets the debug info out, automatically, and on all CPUs affected by
the lockup.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-Off-By: Matthias Urlichs <smurf@smurf.noris.de>
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
at the moment, the list_head semantics are
list_add(node, head)
whereas current klist semantics are
klist_add(head, node)
This is bound to cause confusion, and since klist is the newcomer, it
should follow the list_head semantics.
I also added missing include guards to klist.h
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch moves the common code in x86 and x86-64's semaphore.c into a
single file in lib/semaphore-sleepers.c. The arch specific asm stubs are
left in the arch tree (in semaphore.c for i386 and in the asm for x86-64).
There should be no changes in code/functionality with this patch.
Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Attached the implementation of the Boyer-Moore string search
algorithm for the new textsearch infrastructure.
I've added as well a note about the limitations that this approach
presents, as Thomas has remarked.
Signed-off-by: Pablo Neira Ayuso <pablo@eurodev.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
netlink_broadcast users must initialize NETLINK_CB(skb).dst_groups to the
destination group mask for netlink_recvmsg.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Remove bogus code for compiling netlink as module
- Add module refcounting support for modules implementing a netlink
protocol
- Add support for autoloading modules that implement a netlink protocol
as soon as someone opens a socket for that protocol
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is an off by one problem with idr_get_new_above.
The comment and function name suggest that it will return an id >
starting_id, but it actually returned an id >= starting_id, and kernel
callers other than inotify treated it as such.
The patch below fixes the comment, and fixes inotifys usage. The
function name still doesn't match the behaviour, but it never did.
Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
handling of %t... (ptrdiff_t) in vsnprintf
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It turns out that empty distance code tables are not an error, and that
a compressed block with only literals can validly have an empty table
and should not be flagged as a data error.
Some old versions of gzip had problems with this case, but it does not
affect the zlib code in the kernel.
Analysis and explanations thanks to Sergey Vlasov <vsu@altlinux.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch fixes a typo in lib/crc32.c which results in incorrect debug
output.
Signed-off-by: Dominik Hackl <dominik@hackl.dhs.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
For some reason halfmd4 isn't being linked into the kernel any more and
modular ext3 wants it.
So statically link the halfmd4 code into the kernel.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add a new section called ".data.read_mostly" for data items that are read
frequently and rarely written to like cpumaps etc.
If these maps are placed in the .data section then these frequenly read
items may end up in cachelines with data is is frequently updated. In that
case all processors in an SMP system must needlessly reload the cachelines
again and again containing elements of those frequently used variables.
The ability to share these cachelines will allow each cpu in an SMP system
to keep local copies of those shared cachelines thereby optimizing
performance.
Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Shobhit Dayal <shobhit@calsoftinc.com>
Signed-off-by: Christoph Lameter <christoph@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch makes use of ALIGN() to remove duplicate round-up code.
Signed-off-by: Nick Wilson <njw@osdl.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Do not present these confusing new options to the user
unless he picked some facility that makes use of it,
such as NET_EMATCH_TEXT.
Signed-off-by: David S. Miller <davem@davemloft.net>
A finite state machine consists of n states (struct ts_fsm_token)
representing the pattern as a finite automation. The data is read
sequentially on a octet basis. Every state token specifies the number
of recurrences and the type of value accepted which can be either a
specific character or ctype based set of characters. The available
type of recurrences include 1, (0|1), [0 n], and [1 n].
The algorithm differs between strict/non-strict mode specyfing
whether the pattern has to start at the first octect. Strict mode
is enabled by default and can be disabled by inserting
TS_FSM_HEAD_IGNORE as the first token in the chain.
The runtime performance of the algorithm should be around O(n),
however while in strict mode the average runtime can be better.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implements a linear-time string-matching algorithm due to Knuth,
Morris, and Pratt [1]. Their algorithm avoids the explicit
computation of the transition function DELTA altogether. Its
matching time is O(n), for n being length(text), using just an
auxiliary function PI[1..m], for m being length(pattern),
precomputed from the pattern in time O(m). The array PI allows
the transition function DELTA to be computed efficiently
"on the fly" as needed. Roughly speaking, for any state
"q" = 0,1,...,m and any character "a" in SIGMA, the value
PI["q"] contains the information that is independent of "a" and
is needed to compute DELTA("q", "a") [2]. Since the array PI
has only m entries, whereas DELTA has O(m|SIGMA|) entries, we
save a factor of |SIGMA| in the preprocessing time by computing
PI rather than DELTA.
[1] Cormen, Leiserson, Rivest, Stein
Introdcution to Algorithms, 2nd Edition, MIT Press
[2] See finite automation theory
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The textsearch infrastructure provides text searching
facitilies for both linear and non-linear data.
Individual search algorithms are implemented in modules
and chosen by the user.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch contains the ia64 uncached page allocator and the generic
allocator (genalloc). The uncached allocator was formerly part of the SN2
mspec driver but there are several other users of it so it has been split
off from the driver.
The generic allocator can be used by device driver to manage special memory
etc. The generic allocator is based on the allocator from the sym53c8xx_2
driver.
Various users on ia64 needs uncached memory. The SGI SN architecture requires
it for inter-partition communication between partitions within a large NUMA
cluster. The specific user for this is the XPC code. Another application is
large MPI style applications which use it for synchronization, on SN this can
be done using special 'fetchop' operations but it also benefits non SN
hardware which may use regular uncached memory for this purpose. Performance
of doing this through uncached vs cached memory is pretty substantial. This
is handled by the mspec driver which I will push out in a seperate patch.
Rather than creating a specific allocator for just uncached memory I came up
with genalloc which is a generic purpose allocator that can be used by device
drivers and other subsystems as they please. For instance to handle onboard
device memory. It was derived from the sym53c7xx_2 driver's allocator which
is also an example of a potential user (I am refraining from modifying sym2
right now as it seems to have been under fairly heavy development recently).
On ia64 memory has various properties within a granule, ie. it isn't safe to
access memory as uncached within the same granule as currently has memory
accessed in cached mode. The regular system therefore doesn't utilize memory
in the lower granules which is mixed in with device PAL code etc. The
uncached driver walks the EFI memmap and pulls out the spill uncached pages
and sticks them into the uncached pool. Only after these chunks have been
utilized, will it start converting regular cached memory into uncached memory.
Hence the reason for the EFI related code additions.
Signed-off-by: Jes Sorensen <jes@wildopensource.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch implements a number of smp_processor_id() cleanup ideas that
Arjan van de Ven and I came up with.
The previous __smp_processor_id/_smp_processor_id/smp_processor_id API
spaghetti was hard to follow both on the implementational and on the
usage side.
Some of the complexity arose from picking wrong names, some of the
complexity comes from the fact that not all architectures defined
__smp_processor_id.
In the new code, there are two externally visible symbols:
- smp_processor_id(): debug variant.
- raw_smp_processor_id(): nondebug variant. Replaces all existing
uses of _smp_processor_id() and __smp_processor_id(). Defined
by every SMP architecture in include/asm-*/smp.h.
There is one new internal symbol, dependent on DEBUG_PREEMPT:
- debug_smp_processor_id(): internal debug variant, mapped to
smp_processor_id().
Also, i moved debug_smp_processor_id() from lib/kernel_lock.c into a new
lib/smp_processor_id.c file. All related comments got updated and/or
clarified.
I have build/boot tested the following 8 .config combinations on x86:
{SMP,UP} x {PREEMPT,!PREEMPT} x {DEBUG_PREEMPT,!DEBUG_PREEMPT}
I have also build/boot tested x64 on UP/PREEMPT/DEBUG_PREEMPT. (Other
architectures are untested, but should work just fine.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch fixes overrun of array pa:
92 struct idr_layer *pa[MAX_LEVEL];
in
98 l = idp->layers;
99 pa[l--] = NULL;
by passing idp->layers, set in
202 idp->layers = layers;
to function sub_alloc in
203 v = sub_alloc(idp, ptr, &id);
Signed-off-by: Zaur Kambarov <zkambarov@coverity.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This klist interface provides a couple of structures that wrap around
struct list_head to provide explicit list "head" (struct klist) and
list "node" (struct klist_node) objects. For struct klist, a spinlock
is included that protects access to the actual list itself. struct
klist_node provides a pointer to the klist that owns it and a kref
reference count that indicates the number of current users of that node
in the list.
The entire point is to provide an interface for iterating over a list
that is safe and allows for modification of the list during the
iteration (e.g. insertion and removal), including modification of the
current node on the list.
It works using a 3rd object type - struct klist_iter - that is declared
and initialized before an iteration. klist_next() is used to acquire the
next element in the list. It returns NULL if there are no more items.
This klist interface provides a couple of structures that wrap around
struct list_head to provide explicit list "head" (struct klist) and
list "node" (struct klist_node) objects. For struct klist, a spinlock
is included that protects access to the actual list itself. struct
klist_node provides a pointer to the klist that owns it and a kref
reference count that indicates the number of current users of that node
in the list.
The entire point is to provide an interface for iterating over a list
that is safe and allows for modification of the list during the
iteration (e.g. insertion and removal), including modification of the
current node on the list.
It works using a 3rd object type - struct klist_iter - that is declared
and initialized before an iteration. klist_next() is used to acquire the
next element in the list. It returns NULL if there are no more items.
Internally, that routine takes the klist's lock, decrements the reference
count of the previous klist_node and increments the count of the next
klist_node. It then drops the lock and returns.
There are primitives for adding and removing nodes to/from a klist.
When deleting, klist_del() will simply decrement the reference count.
Only when the count goes to 0 is the node removed from the list.
klist_remove() will try to delete the node from the list and block
until it is actually removed. This is useful for objects (like devices)
that have been removed from the system and must be freed (but must wait
until all accessors have finished).
Internally, that routine takes the klist's lock, decrements the reference
count of the previous klist_node and increments the count of the next
klist_node. It then drops the lock and returns.
There are primitives for adding and removing nodes to/from a klist.
When deleting, klist_del() will simply decrement the reference count.
Only when the count goes to 0 is the node removed from the list.
klist_remove() will try to delete the node from the list and block
until it is actually removed. This is useful for objects (like devices)
that have been removed from the system and must be freed (but must wait
until all accessors have finished).
Signed-off-by: Patrick Mochel <mochel@digitalimplant.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
diff -Nru a/include/linux/klist.h b/include/linux/klist.h
kobject: make kobject's name const char * since users should not
attempt to change it (except by calling kobject_rename).
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
kobject: kobject_hotplug should use kobject_name() instead of
accessing kobj->name directly since for objects with
long names it can contain garbage.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Until now, FRAME_POINTER was set = DEBUG_INFO for UML. Change it to be the
default way, so that it can be enabled alone (for instance to get better
backtraces on crashes). The call-trace dumper which uses the frame pointer is
not yet in, I'm going to introduce it in a separate patch.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>