OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Linus Torvalds	32c15bb978	Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus * 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: [MIPS] time: Move R4000 clockevent device code to separate configurable file [MIPS] time: Delete dead cycles_per_jiffy, mips_timer_ack and null_timer_ack [MIPS] IP32: Retire use of plat_timer_setup. [MIPS] Jazz: Retire use of plat_timer_setup. [MIPS] IP27: Convert to clock_event_device. [MIPS] JMR3927: Convert to clock_event_device. [MIPS] Always do the ARC64_TWIDDLE_PC thing.	2007-10-18 14:51:02 -07:00
Linus Torvalds	a57793651f	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (51 commits) [IPV6]: Fix again the fl6_sock_lookup() fixed locking [NETFILTER]: nf_conntrack_tcp: fix connection reopening fix [IPV6]: Fix race in ipv6_flowlabel_opt() when inserting two labels [IPV6]: Lost locking in fl6_sock_lookup [IPV6]: Lost locking when inserting a flowlabel in ipv6_fl_list [NETFILTER]: xt_sctp: fix mistake to pass a pointer where array is required [NET]: Fix OOPS due to missing check in dev_parse_header(). [TCP]: Remove lost_retrans zero seqno special cases [NET]: fix carrier-on bug? [NET]: Fix uninitialised variable in ip_frag_reasm() [IPSEC]: Rename mode to outer_mode and add inner_mode [IPSEC]: Disallow combinations of RO and AH/ESP/IPCOMP [IPSEC]: Use the top IPv4 route's peer instead of the bottom [IPSEC]: Store afinfo pointer in xfrm_mode [IPSEC]: Add missing BEET checks [IPSEC]: Move type and mode map into xfrm_state.c [IPSEC]: Fix length check in xfrm_parse_spi [IPSEC]: Move ip_summed zapping out of xfrm6_rcv_spi [IPSEC]: Get nexthdr from caller in xfrm6_rcv_spi [IPSEC]: Move tunnel parsing for IPv4 out of xfrm4_input ...	2007-10-18 14:40:30 -07:00
Linus Torvalds	9cf52b2921	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6 * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC/64]: Consolidate of_register_driver [SPARC] Videopix Frame Grabber: Convert device_lock_sem to mutex [SPARC]: Support for new termios. [SPARC64]: Check of_get_property() return in pci_determine_mem_io_space(). [SPARC64]: Fix boot failures due to bootmem. [SPARC64]: Implement atomic backoff.	2007-10-18 14:39:44 -07:00
Corey Minyard	d8c98618f4	IPMI: add 0.9 support Add support for IPMI 0.9 systems to the IPMI driver. Just handle a shorter get device ID command with less information. Signed-off-by: Corey Minyard <cminyard@mvista.com> Cc: Stian Jordet <liste@jordet.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:32 -07:00
Corey Minyard	fcfa472411	IPMI: add polled interface Currently the IPMI watchdog timer sets the watchdog timeout on a panic, but it doesn't actually poll the interface to make sure the message goes out. Add an interface for polling the IPMI driver, and add code to the IPMI watchdog timer to poll the interface when the timer is set from a panic. Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:32 -07:00
Ralf Baechle	e8c44319c6	Replace __attribute_pure__ with __pure To be consistent with the use of attributes in the rest of the kernel replace all use of __attribute_pure__ with __pure and delete the definition of __attribute_pure__. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Cc: Russell King <rmk@arm.linux.org.uk> Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: Bryan Wu <bryan.wu@analog.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:32 -07:00
Miklos Szeredi	0e9663ee45	fuse: add blksize field to fuse_attr There are cases when the filesystem will be passed the buffer from a single read or write call, namely: 1) in 'direct-io' mode (not O_DIRECT), read/write requests don't go through the page cache, but go directly to the userspace fs 2) currently buffered writes are done with single page requests, but if Nick's ->perform_write() patch goes in, it will be possible to do larger write requests. But only if the original write() was also bigger than a page. In these cases the filesystem might want to give a hint to the app about the optimal I/O size. Allow the userspace filesystem to supply a blksize value to be returned by stat() and friends. If the field is zero, it defaults to the old PAGE_CACHE_SIZE value. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Miklos Szeredi	f33321141b	fuse: add support for mandatory locking For mandatory locking the userspace filesystem needs to know the lock ownership for read, write and truncate operations. This patch adds the necessary fields to the protocol. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Miklos Szeredi	b25e82e567	fuse: add helper for asynchronous writes This patch adds a new helper function fuse_write_fill() which makes it possible to send WRITE requests asynchronously. A new flag for WRITE requests is also added which indicates that this is a write from the page cache, and not a "normal" file write. This patch is in preparation for writable mmap support. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Miklos Szeredi	a9ff4f8705	fuse: support BSD locking semantics It is trivial to add support for flock(2) semantics to the existing protocol, by setting the lock owner field to the file pointer, and passing a new FUSE_LK_FLOCK flag with the locking request. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Miklos Szeredi	6ff958edbf	fuse: add atomic open+truncate support This patch allows fuse filesystems to implement open(..., O_TRUNC) as a single request, instead of separate truncate and open requests. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Miklos Szeredi	17637cbaba	fuse: improve utimes support Add two new flags for setattr: FATTR_ATIME_NOW and FATTR_MTIME_NOW. These mean, that atime or mtime should be changed to the current time. Also it is now possible to update atime or mtime individually, not just together. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:30 -07:00
Miklos Szeredi	d139d7ffd0	VFS: allow filesystems to implement atomic open+truncate Add a new attribute flag ATTR_OPEN, with the meaning: "truncation was initiated by open() due to the O_TRUNC flag". This way filesystems wanting to implement truncation within their ->open() method can ignore such truncate requests. This is a quick & dirty hack, but it comes for free. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andreas Dilger <adilger@clusterfs.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:30 -07:00
Miklos Szeredi	c79e322f63	fuse: add file handle to getattr operation Add necessary protocol changes for supplying a file handle with the getattr operation. Step the API version to 7.9. This patch doesn't actually supply the file handle, because that needs some kind of VFS support, which we haven't yet been able to agree upon. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:30 -07:00
Takashi Sato	0f0a89ebe1	ext3: support large blocksize up to PAGESIZE This patch set supports large block size (>4k, <=64k) in ext3 just by enlarging the block size limit. But it is NOT possible to have 64kB blocksize on ext3 without some changes to the directory handling code. The reason is that an empty 64kB directory block would have a rec_len == (__u16)2^16 == 0, and this would cause an error to be hit in the filesystem. The proposed solution is to treat 64k rec_len with an impossible value like rec_len = 0xffff to handle this. The patch set consists of the following 2 patches. [1/2] ext3: enlarge blocksize - Allow blocksize up to pagesize [2/2] ext3: fix rec_len overflow - prevent rec_len from overflow with 64KB blocksize Now, on a 64k page ppc64 box running with this patch set, we can create a 64k block size ext3 and handle an empty directory block. Signed-off-by: Takashi Sato <sho@tnes.nec.co.jp> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Cc: <linux-ext4@vger.kernel.org> Acked-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:29 -07:00
Nick Piggin	b8dc93cbe9	bit_spin_lock: use lock bitops Convert bit_spin_lock to new locking bitops. Slub can use the non-atomic store version to clear (Christoph?) Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:29 -07:00
Nick Piggin	66ffb04ca5	powerpc: lock bitops Add non-trivial lock bitops implementation for powerpc. Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:29 -07:00
Nick Piggin	728697cd6b	mips: lock bitops mips can avoid one mb when acquiring a lock with test_and_set_bit_lock. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:29 -07:00
Nick Piggin	c8f30ae547	mips: fix bitops Documentation/atomic_ops.txt defines these primitives must contain a memory barrier both before and after their memory operation. This is consistent with the atomic ops implementation on mips. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:29 -07:00
Nick Piggin	87371e4fa4	ia64: lock bitops Convert ia64 to new bitops. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:29 -07:00
Nick Piggin	44086d5286	alpha: lock bitops Alpha can avoid one mb when acquiring a lock with test_and_set_bit_lock. [bunk@kernel.org: alpha bitops.h must #include <asm/barrier.h>] Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:29 -07:00
Nick Piggin	7c29ca5b8d	alpha: fix bitops Documentation/atomic_ops.txt defines these primitives must contain a memory barrier both before and after their memory operation. This is consistent with the atomic ops implementation on alpha. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:29 -07:00
Nick Piggin	26333576fd	bitops: introduce lock ops Introduce test_and_set_bit_lock / clear_bit_unlock bitops with lock semantics. Convert all architectures to use the generic implementation. Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-By: David Howells <dhowells@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Haavard Skinnemoen <hskinnemoen@atmel.com> Cc: Bryan Wu <bryan.wu@analog.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Greg Ungerer <gerg@uclinux.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Matthew Wilcox <willy@debian.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp> Cc: Andi Kleen <ak@muc.de> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:29 -07:00
Satyam Sharma	761bb43190	Redefine {un}register_hotcpu_notifier() !HOTPLUG_CPU stubs The return of the present "do {} while" based stub definition of register_hotcpu_notifier() cannot be checked. This makes the stub asymmetric w.r.t. the real HOTPLUG_CPU=y implementation that is int-returning. So let us redefine this to be consistent with the full version. Also do the same for unregister_hotcpu_notifier(). We cannot define these as static inline functions due to an existing GCC bug (#33172). So define as macros that return appropriately instead (int '0' for the register_hotcpu_notifier case and void for unregister_hotcpu_notifier). Signed-off-by: Satyam Sharma <satyam@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:28 -07:00
Michael Neuling	4603ac180a	powerpc: add scaled time accounting This adds POWERPC specific hooks for scaled time accounting. POWER6 includes a SPURR register. The SPURR is based off the PURR register but is scaled based on CPU frequency and issue rates. This gives a more accurate account of the instructions used per task. The PURR and timebase will be constant relative to the wall clock, irrespective of the CPU frequency. This implementation reads the SPURR register in account_system_vtime, which is only called on context switch and hard and soft irq entry and exit. The percentage of user and system time is then estimated using the ratio of these accounted by the PURR. If the SPURR is not present, the PURR is read. An earlier implementation of this patch read the SPURR whenever the PURR was read, which included the system call entry and exit path. Unfortunately this showed a performance regression on lmbench runs, so was re-implemented. I've included the lmbench results here when run bare metal on POWER6. 1st column is the unpatched results. 2nd column is the results using the below patch and the 3rd is the % diff of these results from the base. 4th and 5th columns are the results and % difference from the base using the older patch (SPURR read in syscall entry/exit path). 
Base Scaled-Acct SPURR-in-syscall Result Result % diff Result % diff Simple syscall: 0.3086 0.3086 0.0000 0.3452 11.8600 Simple read: 0.4591 0.4671 1.7425 0.5044 9.86713 Simple write: 0.4364 0.4366 0.0458 0.4731 8.40971 Simple stat: 2.0055 2.0295 1.1967 2.0669 3.06158 Simple fstat: 0.5962 0.5876 -1.442 0.6368 6.80979 Simple open/close: 3.1283 3.1009 -0.875 3.2088 2.57328 Select on 10 fd's: 0.8554 0.8457 -1.133 0.8667 1.32101 Select on 100 fd's: 3.5292 3.6329 2.9383 3.6664 3.88756 Select on 250 fd's: 7.9097 8.1881 3.5197 8.2242 3.97613 Select on 500 fd's: 15.2659 15.836 3.7357 15.873 3.97814 Select on 10 tcp fd's: 0.9576 0.9416 -1.670 0.9752 1.83792 Select on 100 tcp fd's: 7.248 7.2254 -0.311 7.2685 0.28283 Select on 250 tcp fd's: 17.7742 17.707 -0.375 17.749 -0.1406 Select on 500 tcp fd's: 35.4258 35.25 -0.496 35.286 -0.3929 Signal handler installation: 0.6131 0.6075 -0.913 0.647 5.52927 Signal handler overhead: 2.0919 2.1078 0.7600 2.1831 4.35967 Protection fault: 0.7345 0.7478 1.8107 0.8031 9.33968 Pipe latency: 33.006 16.398 -50.31 33.475 1.42368 AF_UNIX sock stream latency: 14.5093 30.910 113.03 30.715 111.692 Process fork+exit: 219.8 222.8 1.3648 229.37 4.35623 Process fork+execve: 876.14 873.28 -0.32 868.66 -0.8533 Process fork+/bin/sh -c: 2830 2876.5 1.6431 2958 4.52296 File /var/tmp/XXX write bw: 1193497 1195536 0.1708 118657 -0.5799 Pagefaults on /var/tmp/XXX: 3.1272 3.2117 2.7020 3.2521 3.99398 Also, kernel compile times show no difference with this patch applied. [pbadari@us.ibm.com: Avoid unnecessary PURR reading] Signed-off-by: Michael Neuling <mikey@neuling.org> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@engr.sgi.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:28 -07:00
Michael Neuling	f494f8fcb1	add-scaled-time-to-taskstats-based-process-accounting fix This moves the new items to the end of the taskstats struct as requested by Balbir and yourself. Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@engr.sgi.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:28 -07:00
Michael Neuling	c66f08be7e	Add scaled time to taskstats based process accounting This adds items to the taskstats struct to account for user and system time based on scaling the CPU frequency and instruction issue rates. Adds account_(user|system)_time_scaled callbacks which architectures can use to account for time using this mechanism. Signed-off-by: Michael Neuling <mikey@neuling.org> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@engr.sgi.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:28 -07:00
Jiri Slaby	65f76a82ec	Char: cyclades, fix some -W warnings Most of them are signedness, the rest unused function parameters. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:26 -07:00
Jiri Slaby	ebafeeff0f	Char: cyclades, remove bottom half processing The work done in bottom half doesn't cost much cpu time (e.g. tty_hangup itself schedules its own bottom half), it's possible to do the work in isr directly and save hence some .text. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Paul Fulghum <paulkf@microgate.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:26 -07:00
Andrew Morgan	72c2d5823f	V3 file capabilities: alter behavior of cap_setpcap The non-filesystem capability meaning of CAP_SETPCAP is that a process, p1, can change the capabilities of another process, p2. This is not the meaning that was intended for this capability at all, and this implementation came about purely because, without filesystem capabilities, there was no way to use capabilities without one process bestowing them on another. Since we now have a filesystem support for capabilities we can fix the implementation of CAP_SETPCAP. The most significant thing about this change is that, with it in effect, no process can set the capabilities of another process. The capabilities of a program are set via the capability convolution rules: pI(post-exec) = pI(pre-exec) pP(post-exec) = (X(aka cap_bset) & fP) | (pI(post-exec) & fI) pE(post-exec) = fE ? pP(post-exec) : 0 at exec() time. As such, the only influence the pre-exec() program can have on the post-exec() program's capabilities are through the pI capability set. The correct implementation for CAP_SETPCAP (and that enabled by this patch) is that it can be used to add extra pI capabilities to the current process - to be picked up by subsequent exec()s when the above convolution rules are applied. Here is how it works: Let's say we have a process, p. It has capability sets, pE, pP and pI. Generally, p, can change the value of its own pI to pI' where (pI' & ~pI) & ~pP = 0. That is, the only new things in pI' that were not present in pI need to be present in pP. The role of CAP_SETPCAP is basically to permit changes to pI beyond the above: if (pE & CAP_SETPCAP) { pI' = anything; /* ie., even (pI' & ~pI) & ~pP != 0 */ } This capability is useful for things like login, which (say, via pam_cap) might want to raise certain inheritable capabilities for use by the children of the logged-in user's shell, but those capabilities are not useful to or needed by the login program itself. 
One such use might be to limit who can run ping. You set the capabilities of the 'ping' program to be "= cap_net_raw+i", and then only shells that have (pI & CAP_NET_RAW) will be able to run it. Without CAP_SETPCAP implemented as described above, login(pam_cap) would have to also have (pP & CAP_NET_RAW) in order to raise this capability and pass it on through the inheritable set. Signed-off-by: Andrew Morgan <morgan@kernel.org> Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Cc: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:24 -07:00
Eric W. Biederman	fc6cd25b73	sysctl: Error on bad sysctl tables After going through the kernel's sysctl tables several times it has become clear that code review and testing is just not effective in preventing problematic sysctl tables from being used in the stable kernel. I certainly can't seem to fix the problems as fast as they are introduced. Therefore this patch adds sysctl_check_table which is called when a sysctl table is registered and checks to see if we have a problematic sysctl table. The biggest part of the code is the table of valid binary sysctl entries, but since we have frozen our set of binary sysctls this table should not need to change, and it makes it much easier to detect when someone unintentionally adds a new binary sysctl value. As best as I can determine all of the several hundred errors spewed on boot up now are legitimate. [bunk@kernel.org: kernel/sysctl_check.c must #include <linux/string.h>] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:23 -07:00
Eric W. Biederman	f429cd37a2	sysctl: properly register the irda binary sysctl numbers Grumble. These numbers should have been in sysctl.h from the beginning if we ever expected anyone to use them. Oh well put them there now so we can find them and make maintenance easier. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:23 -07:00
Eric W. Biederman	25398a158d	sysctl: parport remove binary paths The sysctl binary paths don't look as if they even could work: .data is not filled in, all of the proc_handlers look at extra1, and there is no strategy routine. So just kill the binary paths. In addition this patch removes the setting of extra1 on directories. It doesn't look like the parport code ever examines it, and it's bad sysctl form. [bunk@kernel.org: remove parport_device_num()] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:23 -07:00
Eric W. Biederman	49a0c45833	sysctl: Factor out sysctl_data. There has been no easy way to wrap the default sysctl strategy routine except for returning 0, which is not always what we want. The few instances I have seen that want different behaviour have written their own version of sysctl_data. While not too hard it is unnecessary code and has the potential for extra bugs. So to make these situations easier and make that part of sysctl more symmetric I have factored sysctl_data out of do_sysctl_strategy and exported it as a function everyone can use. Further, having sysctl_data be an explicit function makes checking for badly formed sysctl tables much easier. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:22 -07:00
Eric W. Biederman	d8217f076b	sysctl core: Stop using the unnecessary ctl_table typedef In sysctl.h the typedef struct ctl_table ctl_table violates coding style, isn't needed, and is a bit of a nuisance because it makes it harder to recognize ctl_table is a type name. So this patch removes it from the generic sysctl code. Hopefully I will have enough energy to send the rest of my patches to remove it from the rest of the kernel. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:22 -07:00
Andrew Morton	8f286c33f1	stop using DMA_xxBIT_MASK Now that we have DMA_BIT_MASK(), these macros are pointless. Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:21 -07:00
Borislav Petkov	34c6538413	unify DMA_..BIT_MASK definitions: v3.1 Remove redundant DMA_..BIT_MASK definitions across two drivers. The computation of the majority of the bitmasks is done by the compiler. The initial split of the patch touching each a different file got removed due to possible git bisect breakage. Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Muli Ben-Yehuda <muli@il.ibm.com> Cc: Jeff Garzik <jeff@garzik.org> Cc: James Bottomley <James.Bottomley@steeleye.com> Reviewed-by: Satyam Sharma <satyam@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:21 -07:00
Tony Breeds	2c62214831	Fix discrepancy between VDSO based gettimeofday() and sys_gettimeofday(). On platforms that copy sys_tz into the vdso (currently only x86_64, soon to include powerpc), it is possible for the vdso to get out of sync if a user calls (admittedly unusual) settimeofday(NULL, ptr). This patch adds a hook for architectures that set CONFIG_GENERIC_TIME_VSYSCALL to ensure when sys_tz is updated they can also update their copy in the vdso. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Cc: Andi Kleen <ak@suse.de> Cc: Tony Luck <tony.luck@intel.com> Acked-by: John Stultz <johnstul@us.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:20 -07:00
Alexey Dobriyan	6212e3a388	Remove struct task_struct::io_wait Hell knows what happened in commit 63b05203af57e7de4f3bb63b8b81d43bc196d32b during 2.6.9 development. The commit introduced the io_wait field, which was write-only then and still remains write-only. Also garbage collect macros which "use" io_wait. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:20 -07:00
Rafael J. Wysocki	c7e0831d38	Hibernation: Check if ACPI is enabled during restore in the right place The following scenario leads to total confusion of the platform firmware on some boxes (eg. HPC nx6325): * Hibernate with ACPI enabled * Resume passing "acpi=off" to the boot kernel To prevent this from happening it's necessary to check if ACPI is enabled (and enable it if that's not the case) _right_ _after_ control has been transferred from the boot kernel to the image kernel, before device_power_up() is called (ie. with interrupts disabled). Enabling ACPI after calling device_power_up() turns out to be insufficient. For this reason, introduce new hibernation callback ->leave() that will be executed before device_power_up() by the restored image kernel. To make it work, it also is necessary to move swsusp_suspend() from swsusp.c to disk.c (its name is changed to "create_image", which is more to the point). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:20 -07:00
Rafael J. Wysocki	d158cbdf39	Hibernation: Arbitrary boot kernel support on x86_64 Make it possible to restore a hibernation image on x86_64 with the help of a kernel different from the one in the image. The idea is to split the core restoration code into two separate parts and to place each of them in a different page. The first part belongs to the boot kernel and is executed as the last step of the image kernel's memory restoration procedure. Before being executed, it is relocated to a safe page that won't be overwritten while copying the image kernel pages. The final operation performed by it is a jump to the second part of the core restoration code that belongs to the image kernel and has just been restored. This code makes the CPU switch to the image kernel's page tables and restores the state of general purpose registers (including the stack pointer) from before the hibernation. The main issue with this idea is that in order to jump to the second part of the core restoration code the boot kernel needs to know its address. However, this address may be passed to it in the image header. Namely, the part of the image header previously used for checking if the version of the image kernel is correct can be replaced with some architecture specific data that will allow the boot kernel to jump to the right address within the image kernel. These data should also be used for checking if the image kernel is compatible with the boot kernel (as far as the memory restoration procedure is concerned). It can be done, for example, with the help of a "magic" value that has to be equal in both kernels, so that they can be regarded as compatible. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:19 -07:00
Andres Salomon	8f4ce8c32f	serial: turn serial console suspend a boot rather than compile time option Currently, there's a CONFIG_DISABLE_CONSOLE_SUSPEND that allows one to stop the serial console from being suspended when the rest of the machine goes to sleep. This is incredibly useful for debugging power management-related things; however, having it as a compile-time option has proved to be incredibly inconvenient for us (OLPC). There are plenty of times that we want serial console to not suspend, but for the most part we'd like serial console to be suspended. This drops CONFIG_DISABLE_CONSOLE_SUSPEND, and replaces it with a kernel boot parameter (no_console_suspend). By default, the serial console will be suspended along with the rest of the system; by passing 'no_console_suspend' to the kernel during boot, serial console will remain alive during suspend. For now, this is pretty serial console specific; further fixes could be applied to make this work for things like netconsole. Signed-off-by: Andres Salomon <dilinger@debian.org> Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Nigel Cunningham <nigel@suspend2.net> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:19 -07:00
Rafael J. Wysocki	e42837bcd3	freezer: introduce freezer-friendly waiting macros Introduce freezer-friendly wrappers around wait_event_interruptible() and wait_event_interruptible_timeout(), originally defined in <linux/wait.h>, to be used in freezable kernel threads. Make some of the freezable kernel threads use them. This is necessary for the freezer to stop sending signals to kernel threads, which is implemented in the next patch. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Nigel Cunningham <nigel@nigel.suspend2.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:19 -07:00
Rafael J. Wysocki	b3dac3b304	PM: Rename hibernation_ops to platform_hibernation_ops Rename 'struct hibernation_ops' to 'struct platform_hibernation_ops' in analogy with 'struct platform_suspend_ops'. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Len Brown <lenb@kernel.org> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:18 -07:00
Rafael J. Wysocki	74f270af0c	PM: Rework struct hibernation_ops During hibernation we also need to tell the ACPI core that we're going to put the system into the S4 sleep state. For this reason, an additional method in 'struct hibernation_ops' is needed, playing the role of set_target() in 'struct platform_suspend_operations'. Moreover, the role of the .prepare() method is now different, so it's better to introduce another method, that in general may be different from .prepare(), that will be used to prepare the platform for creating the hibernation image (.prepare() is used anyway to notify the platform that we're going to enter the low power state after the image has been saved). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:18 -07:00
Rafael J. Wysocki	f242d9196f	PM: Make suspend_ops static The variable suspend_ops representing the set of global platform-specific suspend-related operations, used by the PM core, need not be exported outside of kernel/power/main.c . Make it static. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:18 -07:00
Rafael J. Wysocki	e6c5eb9541	PM: Rework struct platform_suspend_ops There is no reason why the .prepare() and .finish() methods in 'struct platform_suspend_ops' should take any arguments, since architectures don't use these methods' argument in any practically meaningful way (ie. either the target system sleep state is conveyed to the platform by .set_target(), or there is only one suspend state supported and it is indicated to the PM core by .valid(), or .prepare() and .finish() aren't defined at all). There also is no reason why .finish() should return any result. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Len Brown <lenb@kernel.org> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:18 -07:00
Rafael J. Wysocki	26398a70ea	PM: Rename struct pm_ops and related things The name of 'struct pm_ops' suggests that it is related to the power management in general, but in fact it is only related to suspend. Moreover, its name should indicate what this structure is used for, so it seems reasonable to change it to 'struct platform_suspend_ops'. In that case, the name of the global variable of this type used by the PM core and the names of related functions should be changed accordingly. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Len Brown <lenb@kernel.org> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:18 -07:00
Rafael J. Wysocki	95d9ffbe01	PM: Move definition of struct pm_ops to suspend.h Move the definition of 'struct pm_ops' and related functions from <linux/pm.h> to <linux/suspend.h> . There are, at least, the following reasons to do that: * 'struct pm_ops' is specifically related to suspend and not to the power management in general. * As long as 'struct pm_ops' is defined in <linux/pm.h>, any modification of it causes the entire kernel to be recompiled, which is unnecessary and annoying. * Some suspend-related features are already defined in <linux/suspend.h>, so it is logical to move the definition of 'struct pm_ops' into there. * 'struct hibernation_ops', being the hibernation-related counterpart of 'struct pm_ops', is defined in <linux/suspend.h> . Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Len Brown <lenb@kernel.org> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:18 -07:00
Anton Blanchard	04c227140f	hrtimer: Rework hrtimer_nanosleep to make sys_compat_nanosleep easier Pull the copy_to_user out of hrtimer_nanosleep and into the callers (common_nsleep, sys_nanosleep) in preparation for converting compat_sys_nanosleep to use hrtimers. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-18 22:54:18 +02:00
Jeff Garzik	3be6cbd73f	[libata] kill ata_sg_is_last() Short term, this works around a bug introduced by early sg-chaining work. Long term, removing this function eliminates a branch from a hot path loop in each scatter/gather table build. Also, as this code demonstrates, we don't need to _track_ the end of the s/g list, as long as we mark it in some way. And doing so programmatically is nice. So it's a useful cleanup, regardless of its short term effects. Based conceptually on a quick patch by Jens Axboe. Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2007-10-18 16:21:18 -04:00
Ken Chen	480b9434c5	sched: reduce schedstat variable overhead a bit schedstat is useful in investigating CPU scheduler behavior. Ideally, I think it is beneficial to have it on all the time. However, the cost of turning it on in production system is quite high, largely due to number of events it collects and also due to its large memory footprint. Most of the fields probably don't need to be full 64-bit on 64-bit arch. Rolling over 4 billion events will most likely take a long time and user space tool can be made to accommodate that. I'm proposing kernel to cut back most of variable width on 64-bit system. (note, the following patch doesn't affect 32-bit system). Signed-off-by: Ken Chen <kenchen@google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-18 21:32:56 +02:00
Ralf Baechle	42f77542f4	[MIPS] time: Move R4000 clockevent device code to separate configurable file Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2007-10-18 18:11:47 +01:00
Ralf Baechle	2cfa7660db	[MIPS] time: Delete dead cycles_per_jiffy, mips_timer_ack and null_timer_ack cycles_per_jiffy was only ever getting assigned and the function pointer not being called anymore and mips_timer_ack had gotten similarly stale. I leave the remaining assignments unfixed as a lighthouse pointing platform maintainers to what needs a rewrite. These changes make null_timer_ack() unreferenced, so delete that too. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2007-10-18 18:11:47 +01:00
Thomas Bogendoerfer	15ad838d28	[MIPS] Always do the ARC64_TWIDDLE_PC thing. Always jump to the place where the kernel is linked to. This helps where the bootloaders/proms ignores the start address inside the ELF header. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2007-10-18 18:11:46 +01:00
Li Zefan	009e8c965f	[NETFILTER]: xt_sctp: fix mistake to pass a pointer where array is required Macros like SCTP_CHUNKMAP_XXX(chukmap) require chukmap to be an array, but match_packet() passes a pointer to these macros. Also remove the ELEMCOUNT macro and fix a bug in SCTP_CHUNKMAP_COPY. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-18 05:12:21 -07:00
Patrick McHardy	1b83336bb9	[NET]: Fix OOPS due to missing check in dev_parse_header(). [ This is kernel bugzilla 9174 "linux-2.6.23-git11 kernel panic" ] The device in question is an IPv6-over-IPv4 tunnel, which doesn't have any header_ops, so the crash happens in dev_parse_header when dereferencing them. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-18 05:09:28 -07:00
Kyle McMartin	6cc4525d29	[PARISC] Kill off broken irqstack code It's been unfinished and broken long enough, and I have some ideas on how to do it more cleanly. Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>	2007-10-18 00:59:31 -07:00
Sam Ravnborg	1c59357109	[PARISC] Kill off ASM_PAGE_SIZE use We have the macro _AC() generally available now so the calculation of PAGE_SIZE can be made assembler compatible. Introduce use of _AC() and kill all users of ASM_PAGE_SIZE. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>	2007-10-18 00:59:15 -07:00
Adrian Bunk	f13cec8447	[PARISC] parisc: "extern inline" -> "static inline" "extern inline" will have different semantics with gcc 4.3, and "static inline" is correct here. Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Matthew Wilcox <willy@debian.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>	2007-10-18 00:58:41 -07:00
Kyle McMartin	218c998caa	[PARISC] Clean up asm-parisc/pdc.h If we're going to export the header, at least let's organize it sensibly and not have a mishmash of userspace, assembly, and kernel visible defines. Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>	2007-10-18 00:58:33 -07:00
Jeff Bailey	c75ac712df	[PARISC] Export pdc.h for palo Signed-off-by: Jeff Bailey <jbailey@raspberryginger.com> Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>	2007-10-18 00:58:29 -07:00
Kyle McMartin	2cfc5be7df	[PARISC] Wire up sys_fallocate (and compat_sys_fallocate) Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>	2007-10-18 00:58:26 -07:00
Herbert Xu	13996378e6	[IPSEC]: Rename mode to outer_mode and add inner_mode This patch adds a new field to xfrm states called inner_mode. The existing mode object is renamed to outer_mode. This is the first part of an attempt to fix inter-family transforms. As it is we always use the outer family when determining which mode to use. As a result we may end up shoving IPv4 packets into netfilter6 and vice versa. What we really want is to use the inner family for the first part of outbound processing and the outer family for the second part. For inbound processing we'd use the opposite pairing. I've also added a check to prevent silly combinations such as transport mode with inter-family transforms. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:35:51 -07:00
Herbert Xu	17c2a42a24	[IPSEC]: Store afinfo pointer in xfrm_mode It is convenient to have a pointer from xfrm_state to address-specific functions such as the output function for a family. Currently the address-specific policy code calls out to the xfrm state code to get those pointers when we could get it in an easier way via the state itself. This patch adds an xfrm_state_afinfo to xfrm_mode (since they're address-specific) and changes the policy code to use it. I've also added an owner field to do reference counting on the module providing the afinfo even though it isn't strictly necessary today since IPv6 can't be unloaded yet. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:33:12 -07:00
Herbert Xu	1bfcb10f67	[IPSEC]: Add missing BEET checks Currently BEET mode does not reinject the packet back into the stack like tunnel mode does. Since BEET should behave just like tunnel mode this is incorrect. This patch fixes this by introducing a flags field to xfrm_mode that tells the IPsec code whether it should terminate and reinject the packet back into the stack. It then sets the flag for BEET and tunnel mode. I've also added a number of missing BEET checks elsewhere where we check whether a given mode is a tunnel or not. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:31:50 -07:00
Herbert Xu	aa5d62cc87	[IPSEC]: Move type and mode map into xfrm_state.c The type and mode maps are only used by SAs, not policies. So it makes sense to move them from xfrm_policy.c into xfrm_state.c. This also allows us to mark xfrm_get_type/xfrm_put_type/xfrm_get_mode/xfrm_put_mode as static. The only other change I've made in the move is to get rid of the casts on the request_module call for types. They're unnecessary because C will promote them to ints anyway. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:31:12 -07:00
Herbert Xu	33b5ecb8f6	[IPSEC]: Get nexthdr from caller in xfrm6_rcv_spi Currently xfrm6_rcv_spi gets the nexthdr value itself from the packet. This means that we need to fix up the value in case we have a 4-on-6 tunnel. Moving this logic into the caller simplifies things and allows us to merge the code with IPv4. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:29:25 -07:00
Herbert Xu	c4541b41c0	[IPSEC]: Move tunnel parsing for IPv4 out of xfrm4_input This patch moves the tunnel parsing for IPv4 out of xfrm4_input and into xfrm4_tunnel. This change is in line with what IPv6 does and will allow us to merge the two input functions. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:28:53 -07:00
Pavel Emelyanov	47e958eac2	[NET]: Fix the race between sk_filter_(de|at)tach and sk_clone() The proposed fix is to delay the reference counter decrement until the quiescent state passes. This will give sk_clone() a chance to get the reference on the cloned filter. Regular sk_filter_uncharge can happen from the sk_free() only and there's no need in delaying the put - the socket is dead anyway and is to be released itself. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:22:42 -07:00
Pavel Emelyanov	309dd5fc87	[NET]: Move the filter releasing into a separate call This is done merely as a preparation for the fix. The sk_filter_uncharge() unaccounts the filter memory and calls the sk_filter_release(), which in turn decrements the refcount and frees the filter. The latter function will be required separately. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:21:51 -07:00
Pavel Emelyanov	55b333253d	[NET]: Introduce the sk_detach_filter() call Filter is attached in a separate function, so do the same for filter detaching. This also removes one variable from sock_setsockopt(). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:21:26 -07:00
Stephen Rothwell	5c45708352	[SPARC/64]: Consolidate of_register_driver Also of_unregister_driver. These will be shortly also used by the PowerPC code. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:17:42 -07:00
Pavel Emelyanov	48d6005638	[INET]: Remove no longer needed ->equal callback Since this callback is used to check for conflicts in hashtable when inserting a newly created frag queue, we can do the same by checking for matching the queue with the argument, used to create one. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:47:56 -07:00
Pavel Emelyanov	abd6523d15	[INET]: Consolidate xxx_find() in fragment management Here we need another callback ->match to check whether the entry found in hash matches the key passed. The key used is the same as the creation argument for inet_frag_create. Yet again, this ->match is the same for netfilter and ipv6. Running a few steps forward - this callback will later replace the ->equal one. Since the inet_frag_find() uses the already consolidated inet_frag_create() remove the xxx_frag_create from protocol codes. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:47:21 -07:00
Pavel Emelyanov	c6fda28229	[INET]: Consolidate xxx_frag_create() This one uses the xxx_frag_intern() and xxx_frag_alloc() routines, which are already consolidated, so remove them from protocol code (as promised). The ->constructor callback is used to init the rest of the frag queue and it is the same for netfilter and ipv6. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:46:47 -07:00
Pavel Emelyanov	e521db9d79	[INET]: Consolidate xxx_frag_alloc() Just perform the kzalloc() allocation and setup common fields in the inet_frag_queue(). Then return the result to the caller to initialize the rest. The inet_frag_alloc() may return NULL, so check the return value before doing the container_of(). This looks ugly, but the xxx_frag_alloc() will be removed soon. The xxx_expire() timer callbacks are patched, because the argument is now the inet_frag_queue, not the protocol specific queue. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:45:23 -07:00
Pavel Emelyanov	2588fe1d78	[INET]: Consolidate xxx_frag_intern This routine checks for the existence of a given entry in the hash table and inserts the new one if needed. The ->equal callback is used to compare two frag_queue-s together, but this one is temporary and will be removed later. The netfilter code and the ipv6 one use the same routine to compare frags. The inet_frag_intern() always returns non-NULL pointer, so convert the inet_frag_queue into protocol specific one (with the container_of) without any checks. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:44:34 -07:00
David Miller	6050afbbb0	[SPARC]: Support for new termios. [akpm@linux-foundation.org: coding-style tweaks] Signed-off-by: David Miller <davem@davemloft.net> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:38:10 -07:00
Stephen Hemminger	c264c3dee9	napi_synchronize: waiting for NAPI Some drivers with shared NAPI need a synchronization barrier. Also suggested by Benjamin Herrenschmidt for EMAC. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-10-17 20:17:34 -04:00
David S. Miller	24f287e412	[SPARC64]: Implement atomic backoff. When the cpu count is high and contention hits an atomic object, the processors can synchronize such that some cpus continually get knocked out and cannot complete the atomic update. So implement an exponential backoff when SMP. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 16:24:55 -07:00
Aneesh Kumar K.V	d8dd0b4543	ext4: Convert ext4_extent_idx.ei_leaf to ext4_extent_idx.ei_leaf_lo Convert ext4_extent_idx.ei_leaf to ext4_extent_idx.ei_leaf_lo This helps in finding BUGs due to direct partial access of these split 48 bit values. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2007-10-17 18:50:03 -04:00
Aneesh Kumar K.V	b377611d11	ext4: Convert ext4_extent.ee_start to ext4_extent.ee_start_lo Convert ext4_extent.ee_start to ext4_extent.ee_start_lo This helps in finding BUGs due to direct partial access of these split 48 bit values Also fix direct partial access in ext4 code Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2007-10-17 18:50:03 -04:00
Aneesh Kumar K.V	308ba3ece7	ext4: Convert s_r_blocks_count and s_free_blocks_count Convert s_r_blocks_count and s_free_blocks_count to s_r_blocks_count_lo and s_free_blocks_count_lo This helps in finding BUGs due to direct partial access of these split 64 bit values Also fix direct partial access in ext4 code Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2007-10-17 18:50:02 -04:00
Aneesh Kumar K.V	6bc9feff14	ext4: Convert s_blocks_count to s_blocks_count_lo Convert s_blocks_count to s_blocks_count_lo This helps in finding BUGs due to direct partial access of these split 64 bit values Also fix direct partial access in ext4 code Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2007-10-17 18:50:02 -04:00
Aneesh Kumar K.V	5272f83727	ext4: Convert bg_inode_bitmap and bg_inode_table Convert bg_inode_bitmap and bg_inode_table to bg_inode_bitmap_lo and bg_inode_table_lo. This helps in finding BUGs due to direct partial access of these split 64 bit values Also fix one direct partial access Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2007-10-17 18:50:02 -04:00
Aneesh Kumar K.V	3a14589cce	ext4: Convert bg_block_bitmap to bg_block_bitmap_lo Convert bg_block_bitmap to bg_block_bitmap_lo This helps in catching some BUGS due to direct partial access of these split fields. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2007-10-17 18:50:01 -04:00
Jose R. Santos	ce42158179	ext4: FLEX_BG Kernel support v2. This feature relaxes check restrictions on where each block groups meta data is located within the storage media. This allows for the allocation of bitmaps or inode tables outside the block group boundaries in cases where bad blocks forces us to look for new blocks which the owning block group can not satisfy. This will also allow for new meta-data allocation schemes to improve performance and scalability. Signed-off-by: Jose R. Santos <jrs@us.ibm.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2007-10-17 18:50:01 -04:00
Aneesh Kumar K.V	c1bddad949	ext4: Fix sparse warnings Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2007-10-17 18:50:01 -04:00
Andreas Dilger	717d50e497	Ext4: Uninitialized Block Groups In pass1 of e2fsck, every inode table in the filesystem is scanned and checked, regardless of whether it is in use. This is the most time consuming part of the filesystem check. The uninitialized block group feature can greatly reduce e2fsck time by eliminating checking of uninitialized inodes. With this feature, there is a high water mark of used inodes for each block group. Block and inode bitmaps can be uninitialized on disk via a flag in the group descriptor to avoid reading or scanning them at e2fsck time. A checksum of each group descriptor is used to ensure that corruption in the group descriptor's bit flags does not cause incorrect operation. The feature is enabled through a mkfs option mke2fs /dev/ -O uninit_groups A patch adding support for uninitialized block groups to e2fsprogs tools has been posted to the linux-ext4 mailing list. The patches have been stress tested with fsstress and fsx. In performance tests testing e2fsck time, we have seen that e2fsck time on ext3 grows linearly with the total number of inodes in the filesystem. In ext4 with the uninitialized block groups feature, the e2fsck time is constant, based solely on the number of used inodes rather than the total inode count. Since typical ext4 filesystems only use 1-10% of their inodes, this feature can greatly reduce e2fsck time for users, with performance improvement of 2-20 times, depending on how full the filesystem is. The attached graph shows the major improvements in e2fsck times in filesystems with a large total inode count, but few inodes in use. In each group descriptor if we have EXT4_BG_INODE_UNINIT set in bg_flags: Inode table is not initialized/used in this group. So we can skip the consistency check during fsck. EXT4_BG_BLOCK_UNINIT set in bg_flags: No block in the group is used. So we can skip the block bitmap verification for this group.
We also add two new fields to group descriptor as a part of uninitialized group patch. __le16 bg_itable_unused; /* Unused inodes count */ __le16 bg_checksum; /* crc16(sb_uuid+group+desc) */ bg_itable_unused: If we have EXT4_BG_INODE_UNINIT not set in bg_flags then bg_itable_unused will give the offset within the inode table till the inodes are used. This can be used by fsck to skip list of inodes that are marked unused. bg_checksum: Now that we depend on bg_flags and bg_itable_unused to determine the block and inode usage, we need to make sure group descriptor is not corrupt. We add checksum to group descriptor to detect corruption. If the descriptor is found to be corrupt, we mark all the blocks and inodes in the group used. Signed-off-by: Avantika Mathur <mathur@us.ibm.com> Signed-off-by: Andreas Dilger <adilger@clusterfs.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2007-10-17 18:50:00 -04:00
Eric Sandeen	4074fe3736	ext4: remove #ifdef CONFIG_EXT4_INDEX CONFIG_EXT4_INDEX is not an exposed config option in the kernel, and it is unconditionally defined in ext4_fs.h. tune2fs is already able to turn off dir indexing, so at this point it's just cluttering up the code. Remove it. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2007-10-17 18:50:00 -04:00
Coly Li	f077d0d7ea	ext4: Remove (partial, never completed) fragment support Fragment support in ext2/3/4 was never implemented, and it probably will never be implemented. So remove it from ext4. Signed-off-by: Coly Li <coyli@suse.de> Acked-by: Andreas Dilger <adilger@clusterfs.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2007-10-17 18:49:59 -04:00
Mingming Cao	cd02ff0b14	jbd2: JBD_XXX to JBD2_XXX naming cleanup change JBD_XXX macros to JBD2_XXX in JBD2/Ext4 Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2007-10-17 18:49:58 -04:00
Mingming Cao	2d917969bc	JBD2: replace jbd_kmalloc with kmalloc directly. This patch cleans up jbd_kmalloc and replace it with kmalloc directly Signed-off-by: Mingming Cao <cmm@us.ibm.com>	2007-10-17 18:49:57 -04:00
Mingming Cao	a5005da204	JBD: replace jbd_kmalloc with kmalloc directly This patch cleans up jbd_kmalloc and replace it with kmalloc directly Signed-off-by: Mingming Cao <cmm@us.ibm.com>	2007-10-17 18:49:57 -04:00
Mingming Cao	af1e76d6b3	JBD2: jbd2 slab allocation cleanups JBD2: Replace slab allocations with page allocations JBD2 allocate memory for committed_data and frozen_data from slab. However JBD2 should not pass slab pages down to the block layer. Use page allocator pages instead. This will also prepare JBD for the large blocksize patchset. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com>	2007-10-17 18:49:56 -04:00
Mingming Cao	c089d490df	JBD: JBD slab allocation cleanups JBD: Replace slab allocations with page allocations JBD allocate memory for committed_data and frozen_data from slab. However JBD should not pass slab pages down to the block layer. Use page allocator pages instead. This will also prepare JBD for the large blocksize patchset. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com>	2007-10-17 18:49:56 -04:00
Linus Torvalds	9d8190f87b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: 9p: remove sysctl 9p: fix bad kconfig cross-dependency 9p: soften invalidation in loose_mode 9p: attach-per-user 9p: rename uid and gid parameters 9p: define session flags 9p: Make transports dynamic	2007-10-17 15:05:58 -07:00
Linus Torvalds	c2f73fd07d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc: net: libertas sdio driver mmc: at91_mci: cleanup: use MCI_ERRORS mmc: possible leak in mmc_read_ext_csd	2007-10-17 14:12:44 -07:00
Pierre Ossman	727c26ed78	net: libertas sdio driver Add driver for Marvell's Libertas 8385 and 8686 wifi chips. Signed-off-by: Pierre Ossman <drzeus@drzeus.cx> Acked-by: Dan Williams <dcbw@redhat.com>	2007-10-17 22:51:13 +02:00
Linus Torvalds	d20ead9e86	Merge ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86 * ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86: (114 commits) x86: delete vsyscall files during make clean kbuild: fix typo SRCARCH in find_sources x86: fix kernel rebuild due to vsyscall fallout .gitignore update for x86 arch x86: unify include/asm/debugreg_32/64.h x86: unify include/asm/unwind_32/64.h x86: unify include/asm/types_32/64.h x86: unify include/asm/tlb_32/64.h x86: unify include/asm/siginfo_32/64.h x86: unify include/asm/bug_32/64.h x86: unify include/asm/mman_32/64.h x86: unify include/asm/agp_32/64.h x86: unify include/asm/kdebug_32/64.h x86: unify include/asm/ioctls_32/64.h x86: unify include/asm/floppy_32/64.h x86: apply missing DMA/OOM prevention to floppy_32.h x86: unify include/asm/cache_32/64.h x86: unify include/asm/cache_32/64.h x86: unify include/asm/dmi_32/64.h x86: unify include/asm/delay_32/64.h ...	2007-10-17 13:13:16 -07:00
Eric Van Hensbergen	982c37cfb6	9p: remove sysctl A sysctl method was added to enable and disable debugging levels. After further review, it was decided that there are better approaches to doing this and the sysctl methodology isn't really desirable. This patch removes the sysctl code from 9p. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2007-10-17 14:35:15 -05:00
Eric Van Hensbergen	fb0466c3ae	9p: fix bad kconfig cross-dependency This patch moves transport dynamic registration and matching to the net module to prevent a bad Kconfig dependency between the net and fs 9p modules. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2007-10-17 14:31:07 -05:00
Latchesar Ionkov	ba17674fe0	9p: attach-per-user The 9P2000 protocol requires the authentication and permission checks to be done in the file server. For that reason every user that accesses the file server tree has to authenticate and attach to the server separately. Multiple users can share the same connection to the server. Currently v9fs does a single attach and executes all I/O operations as a single user. This makes using v9fs in multiuser environment unsafe as it depends on the client doing the permission checking. This patch improves the 9P2000 support by allowing every user to attach separately. The patch defines three modes of access (new mount option 'access'): - attach-per-user (access=user) (default mode for 9P2000.u) If a user tries to access a file served by v9fs for the first time, v9fs sends an attach command to the server (Tattach) specifying the user. If the attach succeeds, the user can access the v9fs tree. As there is no uname->uid (string->integer) mapping yet, this mode works only with the 9P2000.u dialect. - allow only one user to access the tree (access=<uid>) Only the user with uid can access the v9fs tree. Other users that attempt to access it will get EPERM error. - do all operations as a single user (access=any) (default for 9P2000) V9fs does a single attach and all operations are done as a single user. If this mode is selected, the v9fs behavior is identical with the current one. Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2007-10-17 14:31:07 -05:00
Eric Van Hensbergen	a80d923e13	9p: Make transports dynamic This patch abstracts out the interfaces to underlying transports so that new transports can be added as modules. This should also allow kernel configuration of transports without ifdef-hell. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2007-10-17 14:31:07 -05:00
Thomas Gleixner	21ebddd3ef	x86: unify include/asm/debugreg_32/64.h Almost identical except for the extra DR_LEN_8 and the different DR_CONTROL_RESERVED defines. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Conflicts: include/asm-x86/Kbuild	2007-10-17 20:35:37 +02:00
Thomas Gleixner	3f0bde8353	x86: unify include/asm/unwind_32/64.h 32bit has an extra UNW_FP define, which does not hurt. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:32:38 +02:00
Thomas Gleixner	9d256ff51c	x86: unify include/asm/types_32/64.h Mostly the same. Make the few exceptions conditional. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Conflicts: include/asm-x86/types_32.h	2007-10-17 20:32:07 +02:00
Thomas Gleixner	01749f6d6d	x86: unify include/asm/tlb_32/64.h Same file, except for whitespace, comment formatting. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:26:18 +02:00
Thomas Gleixner	f16ee28854	x86: unify include/asm/siginfo_32/64.h Same file, except for the 64bit PREAMBLE_SIZE define. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:26:17 +02:00
Thomas Gleixner	68fdc55c48	x86: unify include/asm/bug_32/64.h Same file, except for whitespace, comment formatting and the .long/.quad delta which can be solved by a define. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:26:16 +02:00
Thomas Gleixner	5c8eec5019	x86: unify include/asm/mman_32/64.h Same file, except for the extra 64bit MAP_32BIT define, which does not hurt for 32 bit compiles. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:26:15 +02:00
Thomas Gleixner	c4ac82a881	x86: unify include/asm/agp_32/64.h The 32bit D(n) debug addon can be made exclusive for 32 bit compiles. Otherwise all the same. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:26:13 +02:00
Thomas Gleixner	35cc46119d	x86: unify include/asm/kdebug_32/64.h The 64 bit variant has additional function prototypes which do no harm for 32 bit. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:26:12 +02:00
Thomas Gleixner	612d26a72a	x86: unify include/asm/ioctls_32/64.h Same file, except for whitespace and comment formatting. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:26:11 +02:00
Thomas Gleixner	b515e4767f	x86: unify include/asm/floppy_32/64.h Same file, except for whitespace, comment formatting and: 32-bit: if((unsigned int) addr >= (unsigned int) high_memory) 64-bit: if((unsigned long) addr >= (unsigned long) high_memory) where the latter can be used safely for both. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Conflicts: include/asm-x86/floppy_32.h include/asm-x86/floppy_64.h	2007-10-17 20:24:56 +02:00
Thomas Gleixner	2dc27f01ec	x86: apply missing DMA/OOM prevention to floppy_32.h commit `554d284ba9` added __GFP_NORETRY to floppy_64.h to prevent OOM killer on floppy DMA allocations. Apply the same to the 32 bit variant. Found during the attempt to unify the _32/_64 variants. Separate commit to document the resulting code change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:17:22 +02:00
Thomas Gleixner	106619c440	x86: unify include/asm/cache_32/64.h Same file, except for whitespace, comment formatting and the two variants of fb_is_primary_device() Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:17:21 +02:00
Thomas Gleixner	1f7afb08a5	x86: unify include/asm/cache_32/64.h Same file, except for whitespace, comment formatting and: 32-bit: unsigned long virt_addr = va; 64-bit: unsigned int virt_addr = va; Both can be safely replaced by: u32 i, *virt_addr = va; Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:17:19 +02:00
Thomas Gleixner	327c21bc3d	x86: unify include/asm/dmi_32/64.h Unification, so we have these things in one file. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:17:18 +02:00
Thomas Gleixner	f1ea05466a	x86: unify include/asm/delay_32/64.h Same file, except for whitespace, comment formatting and the extra function prototype use_tsc_delay() in _32. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:17:17 +02:00
Thomas Gleixner	9bfa23df56	x86: unify include/asm/cache_32/64.h Same file, except for whitespace, comment formatting and the extra defines in _64, which are conditional on VSMP anyway. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:17:15 +02:00
Thomas Gleixner	b2bba72c10	x86: unify include/asm/cacheflush_32/64.h Same file, except for whitespace, comment formatting and the extra DEBUG_PAGE_ALLOC function in _32. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:17:14 +02:00
Thomas Gleixner	0b4dc7c352	x86: unify include/asm/auxvec_32/64.h Same file, except for whitespace, comment formatting and the AT_SYSINFO define for 32bit Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:17:13 +02:00
Thomas Gleixner	17d36707dd	x86: unify include/asm/agp_32/64.h Same file, except for whitespace, comment formatting and the usage of wbinvd() instead of asm volatile("wbinvd":::"memory"), which is the same. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:17:12 +02:00
Thomas Gleixner	003a46cfff	x86: unify some more trivial include/asm-x86/ 32/64 variants Scripted unification. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:17:10 +02:00
Roland Dreier	217d115cd5	x86: merge some trivially mergeable headers Merge errno.h, resource.h, rtc.h, sections.h, serial.h and sockios.h, where i386 and x86_64 have no or only trivial comment/include guard differences. Build tested on both 32-bit and 64-bit, and booted on 64-bit. [tglx: fixup Kbuild as well] Signed-off-by: Roland Dreier <roland@digitalvampire.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:17:09 +02:00
Brian Gerst	020bd9f1c7	x86: trivial header merges Merge 32/64-bit headers that simply redirect to asm-generic [tglx: fixup Kbuild as well] Signed-off-by: Brian Gerst <bgerst@didntduck.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:17:08 +02:00
Luiz Fernando N. Capitulino	de8aacbe6a	x86: convert mm_context_t semaphore to a mutex convert mm_context_t semaphore to a mutex. Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:17:05 +02:00
Jan Beulich	32c464f5d9	x86: multi-byte single instruction NOPs Add support for and use the multi-byte NOPs recently documented to be available on all PentiumPro and later processors. This patch only applies cleanly on top of the "x86: misc. constifications" patch sent earlier. [ tglx: arch/x86 adaptation ] Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> arch/x86/kernel/alternative.c \| 23 ++++++++++++++++++++++- include/asm-x86/processor_32.h \| 22 ++++++++++++++++++++++ include/asm-x86/processor_64.h \| 22 ++++++++++++++++++++++ 3 files changed, 66 insertions(+), 1 deletion(-)	2007-10-17 20:17:04 +02:00
Luiz Fernando N. Capitulino	c7537ab234	x86: convert mm_context_t semaphore to a mutex [ tglx: arch/x86 adaptation ] Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:17:00 +02:00
Joe Korty	38e760a133	x86: expand /proc/interrupts to include missing vectors, v2 Add missing IRQs and IRQ descriptions to /proc/interrupts. /proc/interrupts is most useful when it displays every IRQ vector in use by the system, not just those somebody thought would be interesting. This patch inserts the following vector displays to the i386 and x86_64 platforms, as appropriate: rescheduling interrupts, TLB flush interrupts, function call interrupts, thermal event interrupts, threshold interrupts, spurious interrupts. A threshold interrupt occurs when ECC memory correction is occurring at too high a frequency. Thresholds are used by the ECC hardware as occasional ECC failures are part of normal operation, but long sequences of ECC failures usually indicate a memory chip that is about to fail. Thermal event interrupts occur when a temperature threshold has been exceeded for some CPU chip. IIRC, a thermal interrupt is also generated when the temperature drops back to a normal level. A spurious interrupt is an interrupt that was raised then lowered by the device before it could be fully processed by the APIC. Hence the APIC sees the interrupt but does not know what device it came from. For this case the APIC hardware will assume a vector of 0xff. Rescheduling, call, and TLB flush interrupts are sent from one CPU to another per the needs of the OS. Typically, their statistics would be used to discover if an interrupt flood of the given type has been occurring. AK: merged v2 and v4 which had some more tweaks AK: replace Local interrupts with Local timer interrupts AK: Fixed description of interrupt types. [ tglx: arch/x86 adaptation ] [ mingo: small cleanup ] Signed-off-by: Joe Korty <joe.korty@ccur.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Tim Hockin <thockin@hockin.org> Cc: Andi Kleen <ak@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:16:53 +02:00
Thomas Gleixner	f6a2e7f201	x86: unify include/asm/ldt_32/64.h The additional struct member of user_desc can be made conditional for 64 bit compiles. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-17 20:16:47 +02:00
Thomas Gleixner	686d8c63d5	x86: unify include/asm/ptrace-abi_32/64.h Aside from the register defines, the content can be shared. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-17 20:16:45 +02:00
Thomas Gleixner	e2f430291f	x86: unify include/asm/mce_32/64.h Merge the files together. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-17 20:16:44 +02:00
Andrew Morton	3c215b6680	x86: asm-i386/io.h fix constness - Fix this: include/asm/io.h: In function `memcpy_fromio': include/asm/io.h:208: warning: passing argument 2 of `__memcpy' discards qualifiers from pointer target type - Clean up code a bit Reported-by: Uwe Bugla <uwe.bugla@gmx.de> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:16:40 +02:00
Adrian Bunk	95c1e9aefa	x86: visws extern inline to static inline "extern inline" will have different semantics with gcc 4.3. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Andrey Panin <pazke@donpac.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:16:39 +02:00
Thomas Gleixner	6b556ffc4b	x86: cleanup 64bit unistd.h sys_iopl is long gone and there is no reason to declare sys_rt_sigaction here. Remove it all together and fix the whitespace mess as well. It's worth the trouble: 25897 -> 21337 bytes, the win is larger than the memory of my first computer :) Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-17 20:16:36 +02:00
Satyam Sharma	ffecad95ee	i386: fix argument signedness warnings These build warnings: In file included from include/asm/thread_info.h:16, from include/linux/thread_info.h:21, from include/linux/preempt.h:9, from include/linux/spinlock.h:49, from include/linux/vmalloc.h:4, from arch/i386/boot/compressed/misc.c:14: include/asm/processor.h: In function cpuid_count include/asm/processor.h:615: warning: pointer targets in passing argument 1 of native_cpuid differ in signedness include/asm/processor.h:615: warning: pointer targets in passing argument 2 of native_cpuid differ in signedness include/asm/processor.h:615: warning: pointer targets in passing argument 3 of native_cpuid differ in signedness include/asm/processor.h:615: warning: pointer targets in passing argument 4 of native_cpuid differ in signedness come because the arguments have been specified as pointers to (signed) int types, not unsigned. So let's specify those as unsigned. Do some codingstyle here and there while at it. [ tglx: arch/x86 adaptation ] Signed-off-by: Satyam Sharma <satyam@infradead.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:16:31 +02:00
Adrian Bunk	7e02cb941d	x86: rename .i assembler includes to .h .i is an ending used for preprocessed stuff. This patch therefore renames assembler include files to .h and guards the contents with an #ifdef __ASSEMBLY__. [ tglx: arch/x86 adaptation ] Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:16:29 +02:00
Steven Rostedt	3f4ed1511d	x86: Add parenthesis to IRQ vector macros It is not good taste to have macros with additions that do not have parentheses around them. This patch parenthesizes the IRQ vector macros for x86_64 arch. Note, this caused me a bit of heart-ache debugging lguest64. [ tglx: arch/x86 adaptation ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:16:28 +02:00
Chuck Lever	d2ccc3fdde	x86: Eliminate result signage problem in asm-x86_64/bitops.h The return type of __scanbit() doesn't match the return type of find_{first,next}_bit(). Thus when you construct something like this: boolean ? __scanbit() : find_first_bit() you get an unsigned long result if "boolean" is true, and a signed long result if "boolean" is false. In file included from /home/cel/src/linux/include/linux/mmzone.h:15, from /home/cel/src/linux/include/linux/gfp.h:4, from /home/cel/src/linux/include/linux/slab.h:14, from /home/cel/src/linux/include/linux/percpu.h:5, from /home/cel/src/linux/include/linux/rcupdate.h:41, from /home/cel/src/linux/include/linux/dcache.h:10, from /home/cel/src/linux/include/linux/fs.h:275, from /home/cel/src/linux/fs/nfs/sysctl.c:9: /home/cel/src/linux/include/linux/nodemask.h: In function '__first_node': /home/cel/src/linux/include/linux/nodemask.h:229: warning: signed and unsigned type in conditional expression /home/cel/src/linux/include/linux/nodemask.h: In function '__next_node': /home/cel/src/linux/include/linux/nodemask.h:235: warning: signed and unsigned type in conditional expression /home/cel/src/linux/include/linux/nodemask.h: In function '__first_unset_node': /home/cel/src/linux/include/linux/nodemask.h:253: warning: signed and unsigned type in conditional expression [ tglx: arch/x86 adaptation ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:16:27 +02:00
Glauber de Oliveira Costa	92b2dc79c3	x86: remove STR() macros This patch removes the __STR() and STR() macros from x86_64 header files. They seem to be legacy and have no more users. Even if there were users, they should use __stringify() instead. In fact, there was a third place in which this macro was defined (ia32_binfmt.c), and used just below. In this file, usage was properly converted to __stringify() [ tglx: arch/x86 adaptation ] Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:16:25 +02:00
Mike Travis	9efa98159c	x86: remove x86_cpu_to_log_apicid Remove the x86_cpu_to_log_apicid array. It is set in arch/x86_64/kernel/genapic_flat.c:flat_init_apic_ldr() and arch/x86_64/kernel/smpboot.c:do_boot_cpu() but it is never referenced. [ tglx: arch/x86 adaptation ] Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:16:24 +02:00
Andi Kleen	61d08a9ea3	i386: Remove strrchr assembler implementation The constraints in the inline assembler implementation of i386 strrchr() were incorrect and break the build with recent gcc 4.3. Since there are only very few callers of strrchr() and none of them are performance relevant just remove the assembler implementation and use the C fallback instead. [ tglx: arch/x86 adaptation ] Cc: rguenther@suse.de Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:16:23 +02:00
Chris Snook	883001f982	x86: make atomic64_t work like atomic_t The volatile keyword has already been removed from the declaration of atomic_t on x86_64. For consistency, remove it from atomic64_t as well. [ tglx: arch/x86 adaptation ] Signed-off-by: Chris Snook <csnook@redhat.com> Signed-off-by: Andi Kleen <ak@suse.de> CC: Andi Kleen <andi@firstfloor.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:16:21 +02:00
Adrian Bunk	a850cef77f	i386: no need to make enable_cpu_hotplug a variable As long as there's no write access to this variable there's no reason to let gcc check it at runtime. [ tglx: arch/x86 adaptation ] Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:16:16 +02:00
H. Peter Anvin	6619a8fb59	x86: Create clflush() inline, remove hardcoded wbinvd Create an inline function for clflush(), with the proper arguments, and use it instead of hard-coding the instruction. This also removes one instance of hard-coded wbinvd, based on a patch by Glauber de Oliveira Costa. [ tglx: arch/x86 adaptation ] Cc: Andi Kleen <andi@firstfloor.org> Cc: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:16:12 +02:00
Jan Beulich	9689ba8ad0	x86: constify stacktrace_ops .. as they're never written to. [ tglx: arch/x86 adaptation ] Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:16:11 +02:00
Thomas Gleixner	6d43be8ea8	x86: remove reminder of i386 irqstat per cpu conversion The i386 irqstat per cpu conversion left a bogus export of the old irqstat array in the header file. Remove it. [ tglx: arch/x86 adaptation ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-17 20:16:04 +02:00
Andi Kleen	e295f75410	x86_64: Remove serialize_cpu() inline - It was redundant with sync_core() - It was unused - It was broken: no input arguments to cpuid; could fault randomly depending on eax contents. Now it's gone. [ tglx: arch/x86 adaptation ] Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:16:03 +02:00
Mike Frysinger	6704ab1cd4	x86: hide cond_syscall behind __KERNEL__ This brings x86_64 into line with all other architectures by only defining cond_syscall() when __KERNEL__ is defined. [ tglx: arch/x86 adaptation ] Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:15:55 +02:00
Kirill Korotaev	c1217a75ea	x86: mark read_crX() asm code as volatile Some gcc versions (I checked at least 4.1.1 from RHEL5 & 4.1.2 from gentoo) can generate incorrect code with read_crX()/write_crX() functions mix up, due to cached results of read_crX(). The small app for x8664 below compiled with -O2 demonstrates this (i686 does the same thing):	2007-10-17 20:15:31 +02:00
Siddha, Suresh B	58d5fa7a6a	i386: fix 4 bit apicid assumption of mach-default Fix get_apic_id() in mach-default, so that it uses 8 bits in the xAPIC case and 4 bits in the legacy APIC case. This fixes the i386 kernel assumption that apic id is less than 16 for xAPIC platforms with 8 cpus or less and makes the kernel boot on such platforms. [ tglx: arch/x86 adaptation ] Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:15:24 +02:00
Laurent Vivier	6442eea937	i386: export i386 smp_call_function_mask() to modules This patch exports i386 smp_call_function_mask() with EXPORT_SYMBOL(). This function is needed by KVM to call a function on a set of CPUs. [ tglx: arch/x86 adaptation ] Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:15:21 +02:00
Andrew Morton	afc54659b1	x86: clean up apicid_to_node declaration Use the correct #define in the declaration of apicid_to_node[], to match the definition. [ tglx: arch/x86 adaptation ] Cc: Andi Kleen <ak@suse.de> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-17 20:15:16 +02:00
Linus Torvalds	fb9fc39517	Merge branch 'xen-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen * 'xen-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: xfs: eagerly remove vmap mappings to avoid upsetting Xen xen: add some debug output for failed multicalls xen: fix incorrect vcpu_register_vcpu_info hypercall argument xen: ask the hypervisor how much space it needs reserved xen: lock pte pages while pinning/unpinning xen: deal with stale cr3 values when unpinning pagetables xen: add batch completion callbacks xen: yield to IPI target if necessary Clean up duplicate includes in arch/i386/xen/ remove dead code in pgtable_cache_init paravirt: clean up lazy mode handling paravirt: refactor struct paravirt_ops into smaller pv_*_ops	2007-10-17 11:10:11 -07:00
Ralf Baechle	9d360ab4a7	[MIPS] Alchemy: Renumber interrupts so irq_cpu can work. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2007-10-17 18:28:48 +01:00
Ralf Baechle	f3e8d1da38	[MIPS] Alchemy: Fix build by conversion to irq_cpu.c. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2007-10-17 18:28:48 +01:00
Linus Torvalds	e6d5a11dad	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: sched: fix new task startup crash sched: fix !SYSFS build breakage sched: fix improper load balance across sched domain sched: more robust sd-sysctl entry freeing	2007-10-17 09:11:18 -07:00
Linus Torvalds	c548f08a4f	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (24 commits) [POWERPC] Fix vmemmap warning in init_64.c [POWERPC] Fix 64 bits vDSO DWARF info for CR register [POWERPC] Add 1TB workaround for PA6T [POWERPC] Enable NO_HZ and high res timers for pseries and ppc64 configs [POWERPC] Quieten cache information at boot [POWERPC] Quieten clockevent printk [POWERPC] Enable SLUB in *_defconfig [POWERPC] Fix 1TB segment detection [POWERPC] Fix iSeries_hpte_insert prototype [POWERPC] Fix copyright symbol [POWERPC] ibmebus: Move to of_device and of_platform_driver, match eHCA and eHEA drivers [POWERPC] ibmebus: Add device creation and bus probing based on of_device [POWERPC] ibmebus: Remove bus match/probe/remove functions [POWERPC] Move of_device allocation into of_device.[ch] [POWERPC] mpc52xx: device tree changes for FEC and MDIO [POWERPC] bestcomm: GenBD task support [POWERPC] bestcomm: FEC task support [POWERPC] bestcomm: ATA task support [POWERPC] bestcomm: core bestcomm support for Freescale MPC5200 [POWERPC] mpc52xx: Update mpc52xx_psc structure with B revision changes ...	2007-10-17 09:05:55 -07:00
Linus Torvalds	5c8e191e84	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup: Remove magic macros for screen_info structure members [x86] remove uses of magic macros for boot_params access	2007-10-17 09:00:30 -07:00
Adrian Bunk	cbfee34520	security/ cleanups This patch contains the following cleanups that are now possible: - remove the unused security_operations->inode_xattr_getsuffix - remove the no longer used security_operations->unregister_security - remove some no longer required exit code - remove a bunch of no longer used exports Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: James Morris <jmorris@namei.org> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:07 -07:00
Serge E. Hallyn	b53767719b	Implement file posix capabilities Implement file posix capabilities. This allows programs to be given a subset of root's powers regardless of who runs them, without having to use setuid and giving the binary all of root's powers. This version works with Kaigai Kohei's userspace tools, found at http://www.kaigai.gr.jp/index.php. For more information on how to use this patch, Chris Friedhoff has posted a nice page at http://www.friedhoff.org/fscaps.html. Changelog: Nov 27: Incorporate fixes from Andrew Morton (security-introduce-file-caps-tweaks and security-introduce-file-caps-warning-fix) Fix Kconfig dependency. Fix change signaling behavior when file caps are not compiled in. Nov 13: Integrate comments from Alexey: Remove CONFIG_ ifdef from capability.h, and use %zd for printing a size_t. Nov 13: Fix endianness warnings by sparse as suggested by Alexey Dobriyan. Nov 09: Address warnings of unused variables at cap_bprm_set_security when file capabilities are disabled, and simultaneously clean up the code a little, by pulling the new code into a helper function. Nov 08: For pointers to required userspace tools and how to use them, see http://www.friedhoff.org/fscaps.html. Nov 07: Fix the calculation of the highest bit checked in check_cap_sanity(). Nov 07: Allow file caps to be enabled without CONFIG_SECURITY, since capabilities are the default. Hook cap_task_setscheduler when !CONFIG_SECURITY. Move capable(TASK_KILL) to end of cap_task_kill to reduce audit messages. Nov 05: Add secondary calls in selinux/hooks.c to task_setioprio and task_setscheduler so that selinux and capabilities with file cap support can be stacked. Sep 05: As Seth Arnold points out, uid checks are out of place for capability code. Sep 01: Define task_setscheduler, task_setioprio, cap_task_kill, and task_setnice to make sure a user cannot affect a process in which they called a program with some fscaps. One remaining question is the note under task_setscheduler: are we ok with CAP_SYS_NICE being sufficient to confine a process to a cpuset? It is a semantic change, as without fscaps, attach_task doesn't allow CAP_SYS_NICE to override the uid equivalence check. But since it uses security_task_setscheduler, which elsewhere is used where CAP_SYS_NICE can be used to override the uid equivalence check, fixing it might be tough. task_setscheduler note: this also controls cpuset:attach_task. Are we ok with CAP_SYS_NICE being used to confine to a cpuset? task_setioprio task_setnice sys_setpriority uses this (through set_one_prio) for another process. Need same checks as setrlimit Aug 21: Updated secureexec implementation to reflect the fact that euid and uid might be the same and nonzero, but the process might still have elevated caps. Aug 15: Handle endianness of xattrs. Enforce capability version match between kernel and disk. Enforce that no bits beyond the known max capability are set, else return -EPERM. With this extra processing, it may be worth reconsidering doing all the work at bprm_set_security rather than d_instantiate. Aug 10: Always call getxattr at bprm_set_security, rather than caching it at d_instantiate. [morgan@kernel.org: file-caps clean up for linux/capability.h] [bunk@kernel.org: unexport cap_inode_killpriv] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Andrew Morgan <morgan@kernel.org> Signed-off-by: Andrew Morgan <morgan@kernel.org> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:07 -07:00
Alexey Dobriyan	57c521ce61	ifdef struct task_struct::security For those who don't care about CONFIG_SECURITY. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Casey Schaufler <casey@schaufler-ca.com> Cc: James Morris <jmorris@namei.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:07 -07:00
James Morris	20510f2f4e	security: Convert LSM into a static interface Convert LSM into a static interface, as the ability to unload a security module is not required by in-tree users and potentially complicates the overall security architecture. Needlessly exported LSM symbols have been unexported, to help reduce API abuse. Parameters for the capability and root_plug modules are now specified at boot. The SECURITY_FRAMEWORK_VERSION macro has also been removed. In a nutshell, there is no safe way to unload an LSM. The modular interface is thus unnecessary and broken infrastructure. It is used only by out-of-tree modules, which are often binary-only, illegal, abusive of the API and dangerous, e.g. silently re-vectoring SELinux. [akpm@linux-foundation.org: cleanups] [akpm@linux-foundation.org: USB Kconfig fix] [randy.dunlap@oracle.com: fix LSM kernel-doc] Signed-off-by: James Morris <jmorris@namei.org> Acked-by: Chris Wright <chrisw@sous-sol.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: "Serge E. Hallyn" <serue@us.ibm.com> Acked-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:07 -07:00
Dave Hansen	ce8d2cdf3d	r/o bind mounts: filesystem helpers for custom 'struct file's Why do we need r/o bind mounts? This feature allows a read-only view into a read-write filesystem. In the process of doing that, it also provides infrastructure for keeping track of the number of writers to any given mount. This has a number of uses. It allows chroots to have parts of filesystems writable. It will be useful for containers in the future because users may have root inside a container, but should not be allowed to write to some filesystems. This also replaces patches that vserver has had out of the tree for several years. It allows security enhancement by making sure that parts of your filesystem are read-only (such as when you don't trust your FTP server), when you don't want to have entire new filesystems mounted, or when you want atime selectively updated. I've been using the following script to test that the feature is working as desired. It takes a directory and makes a regular bind and a r/o bind mount of it. It then performs some normal filesystem operations on the three directories, including ones that are expected to fail, like creating a file on the r/o mount. This patch: Some filesystems forego the vfs and may_open() and create their own 'struct file's. This patch creates a couple of helper functions which can be used by these filesystems, and will provide a unified place which the r/o bind mount code may patch. Also, rename an existing, static-scope init_file() to a less generic name. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:04 -07:00
Bjorn Helgaas	402b310cb6	PNP: remove null pointer checks Remove some null pointer checks. Null pointers in these areas indicate programming errors, and I think it's better to oops immediately rather than return an error that is easily ignored. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Adam Belay <ambx1@neo.rr.com> Cc: Len Brown <lenb@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:04 -07:00
Adrian Bunk	5ebf2c1260	bitmap.h: remove dead artifacts bitmap_active() no longer exists and BITMAP_ACTIVE is no longer used. Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:03 -07:00
Martin J. Bligh	a686cd898b	ext2 reservations Val's cross-port of the ext3 reservations code into ext2. [mbligh@mbligh.org: Small type error for printk] [akpm@linux-foundation.org: fix types, sync with ext3] [mbligh@mbligh.org: Bring ext2 reservations code in line with latest ext3] [akpm@linux-foundation.org: kill noisy printk] [akpm@linux-foundation.org: remember to dirty the gdp's block] [akpm@linux-foundation.org: cross-port the missed `5dea5176e5`] [akpm@linux-foundation.org: cross-port `e6022603b9`] [akpm@linux-foundation.org: Port the omitted `08fb306fe6`] [akpm@linux-foundation.org: Backport the missed `20acaa18d0`] [akpm@linux-foundation.org: fixes] [cmm@us.ibm.com: fix reservation extension] [bunk@stusta.de: make ext2_get_blocks() static] [hugh@veritas.com: fix hang] [hugh@veritas.com: ext2_new_blocks should reset the reservation window size] [hugh@veritas.com: ext2 balloc: fix off-by-one against rsv_end] [hugh@veritas.com: grp_goal 0 is a genuine goal (unlike -1), so ext2_try_to_allocate_with_rsv should treat it as such] [hugh@veritas.com: rbtree usage cleanup] [pbadari@us.ibm.com: Fix for ext2 reservation] [bunk@kernel.org: remove fs/ext2/balloc.c:reserve_blocks()] [hugh@veritas.com: ext2 balloc: use io_error label] Cc: "Martin J. Bligh" <mbligh@mbligh.org> Cc: Valerie Henson <val_henson@linux.intel.com> Cc: Mingming Cao <cmm@us.ibm.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:02 -07:00
Joern Engel	1c0eeaf569	introduce I_SYNC I_LOCK was used for several unrelated purposes, which caused deadlock situations in certain filesystems as a side effect. One of the purposes now uses the new I_SYNC bit. Also document the various bits and change their order from historical to logical. [bunk@stusta.de: make fs/inode.c:wake_up_inode() static] Signed-off-by: Joern Engel <joern@wohnheim.fh-wedel.de> Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Cc: David Chinner <dgc@sgi.com> Cc: Anton Altaparmakov <aia21@cam.ac.uk> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:02 -07:00
Fengguang Wu	2e6883bdf4	writeback: introduce writeback_control.more_io to indicate more io After making dirty a 100M file, the normal behavior is to start the writeback for all data after 30s delays. But sometimes the following happens instead: - after 30s: ~4M - after 5s: ~4M - after 5s: all remaining 92M Some analysis shows that the internal io dispatch queues go like this: s_io s_more_io ------------------------- 1) 100M,1K 0 2) 1K 96M 3) 0 96M 1) initial state with a 100M file and a 1K file 2) 4M written, nr_to_write <= 0, so write more 3) 1K written, nr_to_write > 0, no more writes(BUG) nr_to_write > 0 in (3) fools the upper layer to think that data have all been written out. The big dirty file is actually still sitting in s_more_io. We cannot simply splice s_more_io back to s_io as soon as s_io becomes empty, and let the loop in generic_sync_sb_inodes() continue: this may starve newly expired inodes in s_dirty. It is also not an option to draw inodes from both s_more_io and s_dirty, and let the loop go on: this might lead to live locks, and might also starve other superblocks in sync time(well kupdate may still starve some superblocks, that's another bug). We have to return when a full scan of s_io completes. So nr_to_write > 0 does not necessarily mean that "all data are written". This patch introduces a flag writeback_control.more_io to indicate this situation. With it the big dirty file no longer has to wait for the next kupdate invocation 5s later. Cc: David Chinner <dgc@sgi.com> Cc: Ken Chen <kenchen@google.com> Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:02 -07:00
Fengguang Wu	08d8e9749e	writeback: fix ntfs with sb_has_dirty_inodes() NTFS's if-condition on dirty inodes is not complete. Fix it with sb_has_dirty_inodes(). Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Ken Chen <kenchen@google.com> Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:02 -07:00
Ken Chen	0e0f4fc22e	writeback: fix periodic superblock dirty inode flushing Current -mm tree has a bucketful of bug fixes in the periodic writeback path. However, we still hit a glitch where dirty pages on a given inode aren't completely flushed to the disk, and the system will accumulate a large amount of dirty pages beyond what dirty_expire_interval is designed for. The problem is __sync_single_inode() will move an inode to sb->s_dirty list even when there are more pending dirty pages on that inode. If there is another inode with a small number of dirty pages, we hit a case where the loop iteration in wb_kupdate() terminates prematurely because wbc.nr_to_write > 0. This leaves behind the inode that has a large amount of dirty pages, and it has to wait for another dirty_writeback_interval before we flush it again. We effectively only write out MAX_WRITEBACK_PAGES every dirty_writeback_interval. If the rate of dirtying is sufficiently high, the system will start to accumulate a large number of dirty pages. So fix it by having another sb->s_more_io list on which to park the inode while we iterate through sb->s_io and to allow each dirty inode which resides on that sb to have an equal chance of flushing some amount of dirty pages. Signed-off-by: Ken Chen <kenchen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:02 -07:00
Ingo Molnar	4749252776	printk: add KERN_CONT annotation printk: add the KERN_CONT annotation (which is an empty string, but via which checkpatch.pl can notice that the lacking KERN_ level is fine). This is useful for multiple calls of hand-crafted printk output done by early debug code or similar. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:01 -07:00
Ulrich Drepper	22d2b35b20	F_DUPFD_CLOEXEC implementation One more small change to extend the availability of creation of file descriptors with FD_CLOEXEC set. Adding a new command to fcntl() requires no new system call and the overall impact on code size is minimal. If this patch gets accepted we will also add this change to the next revision of the POSIX spec. To test the patch, use the following little program. Adjust the value of F_DUPFD_CLOEXEC appropriately. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <errno.h> #include <fcntl.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #ifndef F_DUPFD_CLOEXEC # define F_DUPFD_CLOEXEC 12 #endif int main (int argc, char *argv[]) { if (argc > 1) { if (fcntl (3, F_GETFD) == 0) { puts ("descriptor not closed"); exit (1); } if (errno != EBADF) { puts ("error not EBADF"); exit (1); } exit (0); } int fd = fcntl (STDOUT_FILENO, F_DUPFD_CLOEXEC, 0); if (fd == -1 && errno == EINVAL) { puts ("F_DUPFD_CLOEXEC not supported"); return 0; } if (fd != 3) { puts ("program called with descriptors other than 0,1,2"); return 1; } execl ("/proc/self/exe", "/proc/self/exe", "1", NULL); puts ("execl failed"); return 1; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: <linux-arch@vger.kernel.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:01 -07:00
Alexey Dobriyan	18796aa002	task_struct: move ->fpu_counter and ->oomkilladj There is a nice 2-byte hole after the struct task_struct::ioprio field into which we can put two 1-byte fields: ->fpu_counter and ->oomkilladj. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Acked-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:01 -07:00
Davide Libenzi	96358de6bc	rename signalfd_siginfo fields Per Michael Kerrisk's request, the following patch renames signalfd_siginfo fields in order to keep them consistent with the siginfo_t ones. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:01 -07:00
Eric Sandeen	059590f495	ext3: remove #ifdef CONFIG_EXT3_INDEX CONFIG_EXT3_INDEX is not an exposed config option in the kernel, and it is unconditionally defined in ext3_fs.h. tune2fs is already able to turn off dir indexing, so at this point it's just cluttering up the code. Remove it. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:01 -07:00
Ahmed S. Darwish	b4471cbb09	Completely remove deprecated IRQ flags (SA_) Only very few files use the deprecated SA_ IRQ flags in the latest pull. This patch series removes such macros from the tree and transforms old code to the new IRQF_* flags. I've grepped the whole tree to make sure that no more files than the patched ones use such deprecated macros. I hope this series won't introduce build errors. Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Matthew Wilcox <willy@debian.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:00 -07:00
Andrey Mirkin	fd5eea4214	change inotifyfs magic as the same magic is used for futexfs Right now futexfs and inotifyfs have one magic 0xBAD1DEA, that looks a little bit confusing. Use 0xBAD1DEA as magic for futexfs and 0x2BAD1DEA as magic for inotifyfs. Signed-off-by: Andrey Mirkin <major@openvz.org> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:00 -07:00
Olaf Hering	4f9a58d75b	increase AT_VECTOR_SIZE to terminate saved_auxv properly include/asm-powerpc/elf.h has 6 entries in ARCH_DLINFO. fs/binfmt_elf.c has 14 unconditional NEW_AUX_ENT entries and 2 conditional NEW_AUX_ENT entries. So in the worst case, saved_auxv does not get an AT_NULL entry at the end. The saved_auxv array must be terminated with an AT_NULL entry. Make the size of mm_struct->saved_auxv arch dependent, based on the number of ARCH_DLINFO entries. Signed-off-by: Olaf Hering <olh@suse.de> Cc: Roland McGrath <roland@redhat.com> Cc: Jakub Jelinek <jakub@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:00 -07:00
Pavel Emelyanov	1efd24fa05	Remove unused member from nsproxy The nslock spinlock is not used in the kernel at all. Remove it. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:59 -07:00
Alexey Dobriyan	970a8645ca	user.c: #ifdef ->mq_bytes For those who deselect POSIX message queues. Reduces SLAB size of user_struct from 64 to 32 bytes here, SLUB size -- from 40 bytes to 32 bytes. [akpm@linux-foundation.org: fix build] Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:59 -07:00
Alan Cox	5f519d7281	tty: expose new methods needed for drivers to get termios right This adds three new functions (or in one case to be more exact makes it always available) tty_termios_copy_hw Copies all the hardware settings from one termios structure to the other. This is intended for drivers that support little or no hardware setting tty_termios_encode_baud_rate Allows you to set the input and output baud rate in a termios structure. A driver is supposed to set the resulting baud rate from a request so most will want to use this function to set the resulting input and output rates to match the hardware values. Internally it knows about keeping Bxxx encoding when possible to maximise compatibility. tty_encode_baud_rate As above but for the tty's own current termios structure I suspect this will initially need some tweaking as it gets enabled by driver patches over the next few mm cycles so consider this lot -mm only for the moment so it can stabilize and end up neat before it goes to base. I've tried not to break any obscure architectures - if you get a speed you can't represent the code will print warnings on non updated termios systems but not break. Once this is merged and seems sane I've got a growing pile of driver updates to use it - notably for USB serial drivers. [akpm@linux-foundation.org: cleanups] Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:58 -07:00
Denys Vlasenko	b9ec0339d8	add consts where appropriate in fs/nls/* Add const modifiers to a few struct nls_table's member pointers in include/linux/nls.h and adds a lot of const's in fs/nls/*.c files. Resulting changes as visible by size: text data bss dec hex filename 113612 481216 2368 597196 91ccc nls.org/built-in.o 593548 3296 288 597132 91c8c nls/built-in.o Apparently compiler managed to optimize code a bit better because of const-ness. No other changes are made. Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:58 -07:00
Denis V. Lunev	37c42524d6	shrink_dcache_sb speedup This patch makes shrink_dcache_sb consistent with dentry pruning policy. On the first pass we iterate over the dentry unused list and prepare some dentries for removal. However, since the existing code moves evicted dentries to the beginning of the LRU it can happen that fresh dentries from other superblocks will be inserted before our dentries. This can result in significant slowdown of shrink_dcache_sb(). Moreover, for virtual filesystems like unionfs which can call dput() during dentry kill, the existing code results in O(n^2) complexity. We observed 2 minutes shrink_dcache_sb() with only 35000 dentries. To avoid these effects we propose to isolate sb dentries at the end of the LRU list. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Andrey Mirkin <amirkin@openvz.org> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:57 -07:00
Emil Medve	1f7c8234c7	Make the pr_*() family of macros in kernel.h complete Other/Some pr_*() macros are already defined in kernel.h, but pr_err() was defined multiple times in several other places Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com> Cc: Jean Delvare <khali@linux-fr.org> Cc: Jeff Garzik <jeff@garzik.org> Cc: "Antonino A. Daplas" <adaplas@pol.net> Cc: Tony Lindgren <tony@atomide.com> Reviewed-by: Satyam Sharma <satyam@infradead.org> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:57 -07:00
David Howells	76181c134f	KEYS: Make request_key() and co fundamentally asynchronous Make request_key() and co fundamentally asynchronous to make it easier for NFS to make use of them. There are now accessor functions that do asynchronous constructions, a wait function to wait for construction to complete, and a completion function for the key type to indicate completion of construction. Note that the construction queue is now gone. Instead, keys under construction are linked in to the appropriate keyring in advance, and anyone encountering one must wait for it to be complete before they can use it. This is done automatically for userspace. The following auxiliary changes are also made: (1) Key type implementation stuff is split from linux/key.h into linux/key-type.h. (2) AF_RXRPC provides a way to allocate null rxrpc-type keys so that AFS does not need to call key_instantiate_and_link() directly. (3) Adjust the debugging macros so that they're -Wformat checked even if they are disabled, and make it so they can be enabled simply by defining __KDEBUG to be consistent with other code of mine. (4) Documentation. [alan@lxorguk.ukuu.org.uk: keys: missing word in documentation] Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:57 -07:00
Ralf Baechle	622a9edd91	Remove dma_cache_(wback|inv|wback_inv) functions dma_cache_(wback|inv|wback_inv) were the earliest attempt at a generalized cache management API for I/O purposes. Originally it was basically the raw MIPS low level cache API exported to the entire world. The API has suffered from a lack of documentation, was not very widely used unlike its more modern brothers and can easily be replaced by dma_cache_sync. So remove it rsp. turn the surviving bits back into an arch private API, as discussed on linux-arch. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Acked-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Paul Mackerras <paulus@samba.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Kyle McMartin <kyle@parisc-linux.org> Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:57 -07:00
Matti Linnanvuori	f20fda4861	Mutex documentation is unclear about software interrupts, tasklets and timers Acked-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:57 -07:00
Bill Nottingham	2e8ecb9db0	add CONFIG_VT_UNICODE As of now, the kernel defaults to non-unicode and XLATE for the keyboard. We've been changing this in Fedora, but that requires patching the defaults in the kernel. The attached introduces CONFIG_VT_UNICODE, which sets the console in unicode mode by default on boot, including both the virtual terminal and the keyboard driver. Signed-off-by: Bill Nottingham <notting@redhat.com> Cc: Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: Dmitry Torokhov <dtor@mail.ru> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:56 -07:00
Jan Beulich	22e48eaf58	constify string/array kparam tracking structures .. in an effort to make read-only whatever can be made, so that CONFIG_DEBUG_RODATA can catch as many issues as possible. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:56 -07:00
Jan Beulich	d5aa0daf6d	store __setup_str_* in a more compact way __setup_str_* are referenced only during boot, hence there's no need to waste image space for aligning these strings (with the aim of improving performance). Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:56 -07:00
Robert P. J. Day	b311e921b3	Add a "rounddown_pow_of_two" routine to log2.h To go along with the existing "roundup_pow_of_two" routine, add one for rounding down since that operation appears to crop up on a regular basis in the source tree. [m.kozlowski@tuxland.pl: fix unbalanced parentheses] Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:56 -07:00
Jan Kara	8e8934695d	quota: send messages via netlink Implement sending of quota messages via netlink interface. The advantage is that in userspace we can better decide what to do with the message - for example display a dialogue in your X session or just write the message to the console. As a bonus, we can get rid of problems with console locking deep inside filesystem code once we remove the old printing mechanism. Signed-off-by: Jan Kara <jack@suse.cz> Cc: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:56 -07:00
Adrian Bunk	b012d346c0	make kernel/profile.c:time_hook static {,un}register_timer_hook() is the API that should be used. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:55 -07:00
Adrian Bunk	cba4fbbff2	remove include/asm-*/ipc.h All asm/ipc.h files do only #include <asm-generic/ipc.h>. This patch therefore removes all include/asm-*/ipc.h files and moves the contents of include/asm-generic/ipc.h to include/linux/ipc.h. Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:55 -07:00
Alexey Dobriyan	4af3c9cc4f	Drop some headers from mm.h mm.h doesn't use directly anything from mutex.h and backing-dev.h, so remove them and add them back to files which need them. Cross-compile tested on many configs and archs. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:55 -07:00
Paul Clements	7fdfd4065c	NBD: allow hung network I/O to be cancelled Allow NBD I/O to be cancelled when a network outage occurs. Previously, I/O would just hang, and if enough I/O was hung in nbd, the system (at least user-level) would completely hang until a TCP timeout (default, 15 minutes) occurred. The patch introduces a new ioctl NBD_SET_TIMEOUT that allows a transmit timeout value (in seconds) to be specified. Any network send that exceeds the timeout will be cancelled and the nbd connection will be shut down. I've tested with various timeout values and 6 seconds seems to be a good choice for the timeout. If the NBD_SET_TIMEOUT ioctl is not called, you get the old (I/O hang) behavior. Signed-off-by: Paul Clements <paul.clements@steeleye.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:55 -07:00
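
Commit b311e921b3 above adds a "rounddown_pow_of_two" routine to log2.h as the counterpart of "roundup_pow_of_two". As a rough userspace illustration of what rounding down to a power of two means, here is a minimal sketch. The function name rounddown_pow_of_two_sketch and the shift loop are assumptions made for this example; this is not the kernel's implementation, and the result for 0 is left undefined here (guarded by an assert).

```c
#include <assert.h>

/*
 * Hypothetical userspace sketch, not the code from log2.h:
 * return the largest power of two that is <= n.
 * n must be non-zero; there is no sensible answer for 0.
 */
static unsigned long rounddown_pow_of_two_sketch(unsigned long n)
{
	unsigned long p = 1;

	assert(n != 0);

	/* Shift n right until it is exhausted; p tracks its top set bit. */
	while (n >>= 1)
		p <<= 1;

	return p;
}
```

For instance, an input that is already a power of two comes back unchanged, and any other input drops to the next lower power of two.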


17208 Commits
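
The sizing argument in commit 4f9a58d75b (AT_VECTOR_SIZE) can be worked through as plain arithmetic. Each auxiliary vector entry is a (type, value) pair, so it takes two words, and the array needs one extra AT_NULL entry as terminator. The sketch below only illustrates that count; the helper name auxv_words_sketch and its parameters are hypothetical and are not taken from the patch.

```c
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: number of words needed
 * to hold the generic entries, the arch-specific ARCH_DLINFO entries,
 * and the terminating AT_NULL entry, at two words per entry.
 */
static size_t auxv_words_sketch(size_t generic_entries, size_t arch_entries)
{
	return 2 * (generic_entries + arch_entries + 1);
}
```

With the figures quoted in the commit message (14 unconditional plus 2 conditional generic entries, and 6 powerpc entries), this comes to 2 * (16 + 6 + 1) = 46 words.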