1382 Commits

Author SHA1 Message Date
Zach Brown d55b5fdaf4 [PATCH] aio: remove aio_max_nr accounting race
AIO was adding a new context's max requests to the global total before
testing if that resulting total was over the global limit.  This let
innocent tasks get their new limit tested along with a racing guilty task
that was crossing the limit.  This serializes the _nr accounting with a
spinlock.  It also switches to using unsigned long for the global totals.
Individual contexts are still limited to an unsigned int's worth of
requests by the syscall interface.

The problem and fix were verified with a simple program that spun creating
and destroying a context while holding on to another long lived context.
Before the patch a task creating a tiny context could get a spurious EAGAIN
if it raced with a task creating a very large context that overran the
limit.

Signed-off-by: Zach Brown <zach.brown@oracle.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:38 -08:00
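The accounting fix described in d55b5fdaf4 can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the pthread mutex stands in for the kernel spinlock, the limit value is made up, and aio_reserve()/aio_release() are hypothetical names, not functions from fs/aio.c. The point it shows is that the limit test and the addition happen inside one critical section, so an oversized request never inflates the total that a small request is tested against.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical userspace model; not the kernel's fs/aio.c code. */
static pthread_mutex_t aio_nr_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long aio_nr;                /* requests currently reserved */
static unsigned long aio_max_nr = 0x10000;  /* global limit (assumed value) */

/*
 * Reserve nr_events against the global limit.  The test and the update
 * are done under the same lock, and the total is only changed when the
 * request fits.  Returns 0 on success, -EAGAIN if the limit would be
 * exceeded or the unsigned long total would wrap.
 */
int aio_reserve(unsigned int nr_events)
{
    int ret = 0;

    pthread_mutex_lock(&aio_nr_lock);
    if (aio_nr + nr_events > aio_max_nr || aio_nr + nr_events < aio_nr)
        ret = -EAGAIN;
    else
        aio_nr += nr_events;
    pthread_mutex_unlock(&aio_nr_lock);
    return ret;
}

/* Give back a reservation made earlier with aio_reserve(). */
void aio_release(unsigned int nr_events)
{
    pthread_mutex_lock(&aio_nr_lock);
    aio_nr -= nr_events;
    pthread_mutex_unlock(&aio_nr_lock);
}
```

In the pre-patch ordering the commit message describes, the total was raised first and tested afterwards, which is what let a racing oversized request cause a spurious EAGAIN for a tiny one.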
Christoph Hellwig d8ba3b7310 [PATCH] fuse: remove dead code from fuse_permission
The -EROFS check moved up to permission() in the VFS a while ago.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:37 -08:00
Adrian Bunk ccb6e363a6 [PATCH] fs/smbfs/request.c: turn NULL dereference into BUG()
In a case documented as

  We should never be called with any of these states

BUG() in a case that would later result in a NULL pointer dereference.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:36 -08:00
Lennert Buytenhek 878129a304 [PATCH] hfs needs nls
Reported by Eddy Petrisor <eddy.petrisor@gmail.com>

fs/built-in.o(.text+0x35fdc): In function `hfs_mdb_put':
: undefined reference to `unload_nls'
fs/built-in.o(.text+0x35ff1): In function `hfs_mdb_put':
: undefined reference to `unload_nls'
fs/built-in.o(.text+0x367a5): In function `parse_options':
super.c: undefined reference to `load_nls'
fs/built-in.o(.text+0x367db):super.c: undefined reference to `load_nls'
fs/built-in.o(.text+0x36938):super.c: undefined reference to `load_nls_default'

Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Acked-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:36 -08:00
Paul E. McKenney 665a7583f3 [PATCH] Remove hlist_for_each_rcu() API, convert existing use to hlist_for_each_entry_rcu
Remove the hlist_for_each_rcu() API, which is used only in one place, and
is trivially converted to hlist_for_each_entry_rcu(), making the code
shorter and more readable.  Any out-of-tree uses may be similarly
converted.

Signed-off-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:35 -08:00
Matt Helsley 9f46080c41 [PATCH] Process Events Connector
This patch adds a connector that reports fork, exec, id change, and exit
events for all processes to userspace.  It replaces the fork_advisor patch
that ELSA is currently using.  Applications that may find these events
useful include accounting/auditing (e.g.  ELSA), system activity monitoring
(e.g.  top), security, and resource management (e.g.  CKRM).

Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:35 -08:00
Andrew Morton 49364ce253 [PATCH] write_inode_now(): write inode if not BDI_CAP_NO_WRITEBACK
If the backing_dev_info has BDI_CAP_NO_WRITEBACK set, we're not supposed
to write back an inode's pages.  But in this situation write_inode_now()
refuses to write the inode itself as well.  Fix.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:35 -08:00
Peter Oberparleiter 4cd5b9f6df [PATCH] s390: cleanup of include/asm-s390/vtoc.h
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:34 -08:00
Pekka J Enberg 2109a2d1b1 [PATCH] mm: rename kmem_cache_s to kmem_cache
This patch renames struct kmem_cache_s to kmem_cache so we can start using
it instead of kmem_cache_t typedef.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:24 -08:00
Thomas Gleixner 182ec4eee3 [JFFS2] Clean up trailing white spaces
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-07 14:18:56 +01:00
Artem B. Bityutskiy 008531f4c3 [JFFS2] Fix broken compile when debug level = 2
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 23:17:39 +01:00
Todd Poynor 4fc67fbe52 [JFFS2] Return 0, not number of bytes written, for success at commit_write
Some callers to block-layer commit_write function treat non-zero return as
error, notably the loopback mount driver sometimes used in conjunction with
JFFS2 on NAND flash for bad block avoidance, etc.  Return zero for success
as do various other commit_write functions.

Signed-off-by: Todd Poynor <tpoynor@mvista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 23:14:16 +01:00
Artem B. Bityutskiy daba5cc4bc [JFFS2] Fix dataflash support
- assume wbuf may be of size which is not power of 2
- don't make strange assumption about not padding wbuf for DataFlash
- use wbuf = DataFlash page and eraseblock >= 8 Dataflash pages

From: Peter Menzebach <pm-mtd@mw-itcon.de>
Acked-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 23:01:48 +01:00
Artem B. Bityutskiy d55849aa4d [JFFS2] Use memset(struct) instead of nulling struct members one by one
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 22:41:34 +01:00
Artem B. Bityutskiy 2f0077e018 [JFFS2] Remove stale comment
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 22:40:33 +01:00
Ferenc Havasi 2bc9764c48 [JFFS2] Rename jffs2_summary_node to jffs2_raw_summary
Signed-off-by: Ferenc Havasi <havasi@inf.u-szeged.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 22:32:45 +01:00
Artem B. Bityutskiy 733802d974 [JFFS2] Debug code simplification, update TODO
Simplify the debugging code further.
Update the TODO list

Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 22:20:33 +01:00
Ferenc Havasi 34c0e90671 [JFFS2] Account summary space in reserved_size.
Always keep valid data in reserved_size.

It did not cause problems, but the reservation code was suboptimal
when centralized summary was active or the size of the erase block
was very small.

Signed-off-by: Ferenc Havasi <havasi@inf.u-szeged.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 21:47:18 +01:00
Artem B. Bityutskiy 81e39cf029 [JFFS2] Debug message format clean up
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 21:38:34 +01:00
Ferenc Havasi 8acff5e934 [JFFS2] Call summary collector for all mtd devices with writev support
Do the summary collection in the right place. If the device
was not writebuffered but had a c->mtd->writev function
(e.g. blkmtd) the summary collector function was not called.

Signed-off-by: Ferenc Havasi <havasi@inf.u-szeged.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 21:37:07 +01:00
Ferenc Havasi c617e84248 [JFFS2] Return real jffs2_sum_init() error code
Signed-off-by: Ferenc Havasi <havasi@inf.u-szeged.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 21:31:05 +01:00
Ferenc Havasi e631ddba58 [JFFS2] Add erase block summary support (mount time improvement)
The goal of the summary is to speed up mounting. Erase block summary (EBS)
stores summary information at the end of every (closed) erase block. It is
no longer necessary to scan all nodes separately (and read all of their
pages); reading this "small" summary is enough, because it stores all the
information needed at mount time.

This summary information is stored in a JFFS2_FEATURE_RWCOMPAT_DELETE node.
If no summary information is found during the mount process, the original
scan process is executed. EBS works with both NAND and NOR flash.

There is a user space tool called sumtool to generate this summary
information for a JFFS2 image.

Signed-off-by: Ferenc Havasi <havasi@inf.u-szeged.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 21:29:48 +01:00
Ferenc Havasi 4ce1f56218 [JFFS2] Remove support for virtual blocks
Remove support for virtual blocks, which are built by
concatenation of multiple physical erase blocks.

For more information please read the MTD mailing list thread
"[PATCH] remove support for virtual blocks"

Signed-off-by: Ferenc Havasi <havasi@inf.u-szeged.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 21:08:27 +01:00
Artem B. Bityutskiy f0507530cb [JFFS2] Solve BUG caused by frag->node representing a hole in fragtree
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 20:32:36 +01:00
Artem B. Bityutskiy 280562b210 [JFFS2] Calculate CRC check starting point correctly
When data starts from the beginning of a NAND page, 'len' must be zero, not
c->wbuf_page.

Thanks to Zoltan Sogor for reporting this problem.

Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 20:29:56 +01:00
Artem B. Bityutskiy 8d5df40954 [JFFS2] More message formatting cleanups
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 20:27:14 +01:00
Artem B. Bityutskiy 3a69e0cd22 [JFFS2] Fix JFFS2 [mc]time handling
From: David Woodhouse <dwmw2@infradead.org>

Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 20:25:59 +01:00
Artem B. Bityutskiy 01d445f89d [JFFS2] Make the JFFS2 messages a bit nicer
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 20:24:15 +01:00
Nicolas Pitre 59da721a22 [JFFS2] Teach JFFS2 about Sibley flash
Intel's Sibley flash needs JFFS2 write buffer functionality.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 20:13:52 +01:00
Artem B. Bityutskiy 45ca1b509e [JFFS2] Debug code clean up - step 7
Remove more noisy debugs. Add current->pid to debug messages.
Remove bogus includes.

Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 19:14:35 +01:00
Artem B. Bityutskiy 3c09133739 [JFFS2] Correct buggy length checks
The previous changes introduced wrong length calculations.

Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 18:35:36 +01:00
Artem B. Bityutskiy 392435081e [JFFS2] Debug code clean up - step 6
Remove extra noisy debugs

Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 18:33:09 +01:00
Artem B. Bityutskiy 1e0da3cb6c [JFFS2] Build fragtree in reverse order
Instead of building fragtree starting from node with the smallest version
number, start from the highest. This helps to avoid reading and checking
obsolete nodes.

Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 18:22:17 +01:00
Artem B. Bityutskiy e0e3006f79 [JFFS2] Refine fragtree debug macros
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 18:19:41 +01:00
Artem B. Bityutskiy 1e900979a7 [JFFS2] Move another fragtree-related function to nodelist.c
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 18:11:59 +01:00
Andrew Lunn 737b7661e0 [JFFS2] Fix up new debug code for eCos build
The debug code cleanup broke the eCos build.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 18:06:10 +01:00
Artem B. Bityutskiy e0d601373b [JFFS2] Debug code clean up - step 5
Replace the D1(printk()) style debugging with the new debug macros

Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 18:01:24 +01:00
Artem B. Bityutskiy f97117d153 [JFFS2] Move scattered function into related files
Move functions to read inodes into readinode.c
Move functions to handle fragtree and dentry lists into nodelist.[ch]

Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 17:50:45 +01:00
Artem B. Bityutskiy f538c96ba2 [JFFS2] Debug code clean up - step 4
Small comment cleanups. Remove an unused macro.

Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 17:34:21 +01:00
Ferenc Havasi 2227c0ba4b [jffs2] Remove compressor lzo and lzari
Remove unused compressor code

Signed-off-by: Ferenc Havasi <havasi@inf.u-szeged.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 17:31:24 +01:00
Artem B. Bityutskiy f302cd028c [JFFS2] Namespace clean up
Rename functions to a name matching the functionality.
Remove stale debug code.

Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 17:17:32 +01:00
Artem B. Bityutskiy e0c8e42f8f [JFFS2] Debug code clean up - step 3
Various simplifications. printk format corrections.
Convert more code to use the new debug functions.

Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 17:06:49 +01:00
Artem B. Bityutskiy 6dac02a5e1 [JFFS2] Fix slab panic
When JFFS2 is unable to read the root inode, the bad root inode object is not
freed and remains stuck in the jffs2_i slab cache. When we later try to
free the slab cache (e.g., on rmmod jffs2), the slab allocator subsystem panics.
Fix this bug.

Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 16:31:04 +01:00
Artem B. Bityutskiy 61a39b6941 [JFFS2] Debug code clean up - step 2
If debugging is disabled, define debugging functions as empty macros, instead
of using Dx() explicitly.

Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 16:29:43 +01:00
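The approach named in 61a39b6941, defining the debug helpers as empty macros when debugging is off, can be sketched as follows. DEMO_DEBUG, demo_dbg() and demo_sum() are hypothetical names chosen for illustration; they are not the identifiers used in the JFFS2 sources. Call sites stay free of explicit wrappers such as Dx(), and the macro body disappears when the switch is 0.

```c
#include <stdio.h>

/* Hypothetical compile-time switch; build with -DDEMO_DEBUG=1 to enable. */
#ifndef DEMO_DEBUG
#define DEMO_DEBUG 0
#endif

#if DEMO_DEBUG
/* Debugging on: forward the arguments to fprintf(). */
#define demo_dbg(...) fprintf(stderr, __VA_ARGS__)
#else
/* Debugging off: expand to an empty statement; arguments are not evaluated. */
#define demo_dbg(...) do { } while (0)
#endif

/* Example caller: no conditional wrapper is needed around demo_dbg(). */
int demo_sum(const int *values, int count)
{
    int total = 0;
    int i;

    for (i = 0; i < count; i++) {
        total += values[i];
        demo_dbg("demo_sum: i=%d total=%d\n", i, total);
    }
    return total;
}
```

Because the disabled form does not evaluate its arguments, expressions with side effects should not be passed to such a macro.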
Artem B. Bityutskiy 2b79adcca1 [JFFS2] Use f->target instead of f->dents for symlink target
JFFS2 uses f->dents to store the pointer to the symlink target string (in case
the inode is a symlink). It is somewhat ugly to use the same field for
different purposes. Introduce a distinct field f->target for this purpose.
Note, f->fragtree, f->dents, f->target may probably be put in a union.

Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 16:25:55 +01:00
Artem B. Bityutskiy 730554d946 [JFFS2] Debug code clean up - step 1
Move debug functions into a separate source file

Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 16:21:25 +01:00
Artem B. Bityutskiy dae6227f71 [JFFS2] Split a large routine on several smaller.
Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 15:40:55 +01:00
Chuck Lever 0bbacc402e NFS,SUNRPC,NLM: fix unused variable warnings when CONFIG_SYSCTL is disabled
Fix some dprintk's so that NLM, NFS client, and RPC client compile
 cleanly if CONFIG_SYSCTL is disabled.

 Test plan:
 Compile kernel with CONFIG_NFS enabled and CONFIG_SYSCTL disabled.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-04 15:39:48 -05:00
Trond Myklebust 6bfc93ef98 NFSv4: Teach NFSv4 to cache locks when we hold a delegation
Now that we have a method of dealing with delegation recalls, actually
 enable the caching of posix and BSD locks.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-04 15:39:36 -05:00
Trond Myklebust 888e694c16 NFSv4: Recover locks too when returning a delegation
Delegations allow us to cache posix and BSD locks, however when the
 delegation is recalled, we need to "flush the cache" and send
 the cached LOCK requests to the server.

 This patch sets up the mechanism for doing so.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-04 15:38:11 -05:00
Trond Myklebust 43b2a33aa8 NFSv4: Fix recovery of flock() locks.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-04 15:35:30 -05:00
Trond Myklebust 34ea818846 NFSv4: Return any delegations before sillyrenaming the file
I missed this one... Any form of rename will result in a delegation
 recall, so it is more efficient to return the one we hold before
 trying the rename.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-04 15:35:02 -05:00
Trond Myklebust 2c56617d76 NFSv4: Fix the handling of the error NFS4ERR_OLD_STATEID
Ensure that we retry the failed operation...

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-04 15:33:50 -05:00
Trond Myklebust d530838bfa NFSv4: Fix problem with OPEN_DOWNGRADE
RFC 3530 states that for OPEN_DOWNGRADE "The share_access and share_deny
 bits specified must be exactly equal to the union of the share_access and
 share_deny bits specified for some subset of the OPENs in effect for
 current openowner on the current file."

 Setattr is currently violating the NFSv4 rules for OPEN_DOWNGRADE in that
 it may cause a downgrade from OPEN4_SHARE_ACCESS_BOTH to
 OPEN4_SHARE_ACCESS_WRITE despite the fact that there exists no open file
 with O_WRONLY access mode.

 Fix the problem by replacing nfs4_find_state() with a modified version of
 nfs_find_open_context().

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-04 15:33:38 -05:00
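The RFC 3530 rule quoted in d530838bfa can be made concrete with a small check. This is a simplified sketch, not the kernel's code: downgrade_access_allowed() is a hypothetical helper, and it looks only at share_access, whereas the RFC requires the share_deny bits to match for the same subset of OPENs as well. The bit values are the ones RFC 3530 defines.

```c
#include <stddef.h>

/* Share-access bit values as defined for NFSv4 in RFC 3530. */
#define OPEN4_SHARE_ACCESS_READ  0x1u
#define OPEN4_SHARE_ACCESS_WRITE 0x2u
#define OPEN4_SHARE_ACCESS_BOTH  0x3u

/*
 * Hypothetical helper: returns 1 if `target` equals the union of the
 * share_access bits of some subset of `opens`, else 0.  It is enough to
 * OR together every open whose bits all lie inside `target`, because
 * that is the largest union that never exceeds the target.  A target
 * of 0 is rejected since it is not a valid share_access value.
 */
int downgrade_access_allowed(unsigned int target,
                             const unsigned int *opens, size_t n)
{
    unsigned int acc = 0;
    size_t i;

    for (i = 0; i < n; i++)
        if ((opens[i] & ~target) == 0)
            acc |= opens[i];
    return target != 0 && acc == target;
}
```

With a single OPEN4_SHARE_ACCESS_BOTH open, a downgrade to OPEN4_SHARE_ACCESS_WRITE is rejected by this check, which is the situation the commit message describes for setattr.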
Trond Myklebust 4cecb76ff8 NFSv4: Fix a race between open() and close()
We must not remove the nfs4_state structure from the inode open lists
 before we are in sequence lock.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-04 15:32:58 -05:00
Steve French ec58ef0328 [CIFS] Update kconfig for cifs
Add cifs extended stats configure option and reduce experimental code.

Signed-off-by: Steve French <sfrench@us.ibm.com>
2005-11-04 09:44:33 -08:00
Linus Torvalds 0f3278d14f Merge git://oss.sgi.com:8090/oss/git/xfs-2.6 2005-11-03 16:25:58 -08:00
Nathan Scott 15c84a4701 [XFS] Remove no-longer-used qsort source.
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-04 10:51:01 +11:00
Nathan Scott 05db218a27 [XFS] Fix an inode32 regression - if no options are presented, must still
set default flags.

SGI-PV: 945242
SGI-Modid: xfs-linux-melb:xfs-kern:24292a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-04 09:49:07 +11:00
Nathan Scott 992c83a129 [XFS] Remove several no-longer-used files.
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-03 16:50:07 +11:00
Nathan Scott 7f248a81c5 [XFS] Cleanup cosmetic differences between source trees.
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-03 16:14:31 +11:00
Nathan Scott 538524aed0 [XFS] fix XFS quota for modular XFS builds
XFS filesystem support cannot be built as a module with quota support.  It
works only when the XFS filesystem support is compiled into the kernel.
Menuconfig prevents setting CONFIG_XFS_FS=m together with CONFIG_XFS_QUOTA=y.

How to reproduce: configure the XFS filesystem with quota support as
module.  The resulting kernel won't have quota support compiled into
xfs.ko.

Fix: Changing the fs/xfs/Kconfig file from tristate to bool lets you
configure the quota support to be compiled into the XFS module.  The
Makefile-linux-2.6 checks only for CONFIG_XFS_QUOTA=y.

Signed-off-by: Dimitri Puzin <tristan-777@ddkom-online.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-03 13:55:06 +11:00
Nathan Scott de69e5f44e [XFS] Add a mechanism for XFS to use the generic quota sync method.
This is now used to issue a delayed allocation flush before reporting
quota, which allows the used space quota report to match reality.

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-03 13:53:34 +11:00
Yingping Lu bf6f05aa0b [XFS] Fixed the inconsistency between attribute b-tree intermidiate node
and leaf blocks. The problem came from xfsqa test 117.

SGI-PV: 940655
SGI-Modid: xfs-linux:xfs-kern:201527a

Signed-off-by: Yingping Lu <yingping@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-03 10:31:47 +11:00
Steve French cb9dbff92e [CIFS] Make CONFIG_CIFS_EXPERIMENTAL depend on CONFIG_EXPERIMENTAL
It seems logical.

Note that CONFIG_EXPERIMENTAL itself doesn't enable any code.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2005-11-02 11:37:15 -08:00
Linus Torvalds ec1890c5df Merge git://brick.kernel.dk/data/git/linux-2.6-block 2005-11-02 08:06:02 -08:00
Nathan Scott 19d5bcf370 [XFS] Ensure fsync does not incorrectly return EIO for pages beyond EOF.
SGI-PV: 944819
SGI-Modid: xfs-linux:xfs-kern:24236a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:14:09 +11:00
Eric Sandeen a749ee8615 [XFS] Fix calculation of reserved AGs for inodes in 32-bit inode mode
Spotted by Roger Willcocks <willcor @at@ gmail.com>

SGI-PV: 944858
SGI-Modid: xfs-linux:xfs-kern:201213a

Signed-off-by: Eric Sandeen <sandeen@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:13:42 +11:00
Nathan Scott fdc7ed75c0 [XFS] Fix boundary conditions when issuing direct IOs from large userspace
buffers.

SGI-PV: 944820
SGI-Modid: xfs-linux:xfs-kern:24223a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:13:13 +11:00
Nathan Scott 2b3b6d07f7 [XFS] Remove an unhelpful ifdef, the comment above the routine explains
the purpose well enough here.

SGI-PV: 944821
SGI-Modid: xfs-linux:xfs-kern:24214a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:12:28 +11:00
Nathan Scott cfcbbbd089 [XFS] Remove old, broken nolog-mode code - noone plans to ever fix it.
SGI-PV: 944821
SGI-Modid: xfs-linux:xfs-kern:24213a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:12:04 +11:00
Nathan Scott c11e2c369d [XFS] Rework fid encode/decode wrt 64 bit inums interacting with NFS.
SGI-PV: 937127
SGI-Modid: xfs-linux:xfs-kern:24201a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:11:45 +11:00
Christoph Hellwig 16259e7d95 [XFS] Endianess annotations for various allocator data structures
SGI-PV: 943272
SGI-Modid: xfs-linux:xfs-kern:201006a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:11:25 +11:00
Eric Sandeen e2ed81fbbb [XFS] remove unused code from xfs_iomap_write_direct
SGI-PV: 943266
SGI-Modid: xfs-linux:xfs-kern:200996a

Signed-off-by: Eric Sandeen <sandeen@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:10:55 +11:00
Eric Sandeen e94af02a9c [XFS] fix old xfs_setattr mis-merge from irix; mostly harmless esp if not
using xfs rt

SGI-PV: 944632
SGI-Modid: xfs-linux:xfs-kern:200983a

Signed-off-by: Eric Sandeen <sandeen@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:10:41 +11:00
Yingping Lu 91e11088f8 [XFS] Fixing size report discrepancy between ls and du caused by xfs_fsr
SGI-PV: 943908
SGI-Modid: xfs-linux:xfs-kern:200874a

Signed-off-by: Yingping Lu <yingping@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:10:24 +11:00
Yingping Lu 9af0a70c07 [XFS] Fixed a bug in reporting extent list for attribute fork running
xfs_bmap -a.

SGI-PV: 944075
SGI-Modid: xfs-linux:xfs-kern:200860a

Signed-off-by: Yingping Lu <yingping@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:09:54 +11:00
Christoph Hellwig 7f14d0a013 [XFS] Simplify pagebuf_rele Remove a conditional that can not be true
anymore and simplify the final put path a little

SGI-PV: 908809
SGI-Modid: xfs-linux:xfs-kern:200790a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:09:35 +11:00
Nathan Scott e718eeb4fe [XFS] Rework the final mount options flag bit to make room for more.
SGI-PV: 943866
SGI-Modid: xfs-linux:xfs-kern:24030a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:09:22 +11:00
Nathan Scott 6b3f6b5b87 [XFS] Rework the dquot hash sizing heuristics.
SGI-PV: 943123
SGI-Modid: xfs-linux:xfs-kern:24012a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:08:25 +11:00
Eric Sandeen 1f730e3b53 [XFS] Add ATTR_NOSIZETOK definition for xfs_vnodeops.c change
SGI-PV: 942439
SGI-Modid: xfs-linux:xfs-kern:200185a

Signed-off-by: Eric Sandeen <sandeen@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:08:10 +11:00
Nathan Scott 8a319ae494 [XFS] Disable attr2 by default, until a more appropriate time to enable
it.

SGI-PV: 941645
SGI-Modid: xfs-linux:xfs-kern:24002a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:07:51 +11:00
Eric Sandeen 374e2ac337 [XFS] Prevent data corruption on extending truncate case from cxfs client
SGI-PV: 942439
SGI-Modid: xfs-linux:xfs-kern:200152a

Signed-off-by: Eric Sandeen <sandeen@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:07:34 +11:00
Christoph Hellwig 4750ddb0ba [XFS] Fix sparse warnings in ktrace.[ch]
SGI-PV: 943556
SGI-Modid: xfs-linux:xfs-kern:200113a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:07:23 +11:00
Christoph Hellwig 5bde1ba99c [XFS] silence gcc4 warnings. the directory ones are wrong because of
information gcc could not find out (that a directory always has a ..
entry), the others are outright gcc bugs.

SGI-PV: 943511
SGI-Modid: xfs-linux:xfs-kern:200055a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:06:18 +11:00
Nathan Scott 9dac13e7ff [XFS] Remove unused type, xfs_gap_t.
SGI-PV: 907752
SGI-Modid: xfs-linux:xfs-kern:23932a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:05:34 +11:00
Christoph Hellwig 1149d96ae8 [XFS] endianess annotations and cleanup for the quota code
SGI-PV: 943272
SGI-Modid: xfs-linux:xfs-kern:199767a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:01:12 +11:00
Nathan Scott fa7e7d71e0 [XFS] Show additional mount options in /proc/mounts, fix up some debug
code.

SGI-PV: 941645
SGI-Modid: xfs-linux:xfs-kern:23926a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:00:48 +11:00
Nathan Scott da087bad81 [XFS] Fix up a 32/64 local flags variable issue when enabling attr2 mode.
SGI-PV: 941645
SGI-Modid: xfs-linux:xfs-kern:23925a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:00:20 +11:00
Eric Sandeen 0116d9356b [XFS] Remove dead code in xfs_iomap_write_direct; save some stack
SGI-PV: 943266
SGI-Modid: xfs-linux:xfs-kern:199750a

Signed-off-by: Eric Sandeen <sandeen@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:00:01 +11:00
Nathan Scott 4ce3121f67 [XFS] Update license/copyright notices to match the prefered SGI
boilerplate.

SGI-PV: 913862
SGI-Modid: xfs-linux:xfs-kern:23917a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 14:59:41 +11:00
Nathan Scott 7b71876980 [XFS] Update license/copyright notices to match the prefered SGI
boilerplate.

SGI-PV: 913862
SGI-Modid: xfs-linux:xfs-kern:23903a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 14:58:39 +11:00
Nathan Scott a844f4510d [XFS] Remove xfs_macros.c, xfs_macros.h, rework headers a whole lot.
SGI-PV: 943122
SGI-Modid: xfs-linux:xfs-kern:23901a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 14:38:42 +11:00
Christoph Hellwig 61c1e689fb [XFS] remove unused struct xfs_ail_ticket
SGI-PV: 919278
SGI-Modid: xfs-linux:xfs-kern:199498a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 11:44:57 +11:00
Nathan Scott fc1f8c1ca3 [XFS] Track external log/realtime device names for correct reporting in
/proc/mounts.

SGI-PV: 942984
SGI-Modid: xfs-linux:xfs-kern:23862a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 11:44:33 +11:00
Nathan Scott 4aeb664c25 [XFS] Improve buffered read throughput by removing unnecessary timer calls
that showed in kernel profiles.

SGI-PV: 925163
SGI-Modid: xfs-linux:xfs-kern:23861a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 11:43:58 +11:00
Nathan Scott 0fdfb3757f [XFS] Remove a null CELL macro and its one caller, not useful to anyone.
SGI-PV: 942986
SGI-Modid: xfs-linux:xfs-kern:23860a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 11:43:42 +11:00
Nathan Scott 380b5dc0e5 [XFS] Fix up an internal sort function name collision issue.
SGI-PV: 942986
SGI-Modid: xfs-linux:xfs-kern:23859a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 11:43:18 +11:00
Nathan Scott 80cce77980 [XFS] Make some extended attributes routines take const parameters, for
the FreeBSD porters.

SGI-PV: 942906
SGI-Modid: xfs-linux:xfs-kern:23845a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 11:43:04 +11:00
Nathan Scott f74dee4276 [XFS] Ondisk format extension for extended attributes (attr2). Basically,
the data/attr forks now grow up/down from either end of the literal area,
rather than dividing the literal area into two chunks and growing both
upward.  Means we can now make much more efficient use of the attribute
space, incl. fitting DMF attributes inline in 256 byte inodes, and large
jumps in dbench3 performance numbers.  It is self enabling, but can be
forced on/off via the attr2/noattr2 mount options.

SGI-PV: 941645
SGI-Modid: xfs-linux:xfs-kern:23837a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 10:35:56 +11:00
Nathan Scott af4e34a527 [XFS] Ondisk format extension for extended attributes (attr2). Basically,
the data/attr forks now grow up/down from either end of the literal area,
rather than dividing the literal area into two chunks and growing both
upward.  Means we can now make much more efficient use of the attribute
space, incl. fitting DMF attributes inline in 256 byte inodes, and large
jumps in dbench3 performance numbers.  It is self enabling, but can be
forced on/off via the attr2/noattr2 mount options.

SGI-PV: 941645
SGI-Modid: xfs-linux:xfs-kern:23836a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 10:35:46 +11:00
Nathan Scott d8cc890d40 [XFS] Ondisk format extension for extended attributes (attr2). Basically,
the data/attr forks now grow up/down from either end of the literal area,
rather than dividing the literal area into two chunks and growing both
upward.  Means we can now make much more efficient use of the attribute
space, incl. fitting DMF attributes inline in 256 byte inodes, and large
jumps in dbench3 performance numbers.  It is self enabling, but can be
forced on/off via the attr2/noattr2 mount options.

SGI-PV: 941645
SGI-Modid: xfs-linux:xfs-kern:23835a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 10:34:53 +11:00
Nathan Scott aa82daa061 [XFS] Move some code around to prepare for the upcoming extended
attributes format change (attr2).

SGI-PV: 941645
SGI-Modid: xfs-linux:xfs-kern:23833a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 10:33:33 +11:00
David Chinner e8c8b3a79d [XFS] Introduce two new mount options (nolargeio/largeio) to allow
filesystems to expose the filesystem stripe width in stat(2) rather than
the page cache size. This allows applications requiring high bandwidth to
easily determine the optimum I/O size for the underlying filesystem. The
default is to report the page cache size (i.e. "nolargeio").

SGI-PV: 942818
SGI-Modid: xfs-linux:xfs-kern:23830a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 10:33:05 +11:00
Nathan Scott ee34807a65 [XFS] Provide a mechanism for flushing delalloc before quota reporting.
SGI-PV: 942815
SGI-Modid: xfs-linux:xfs-kern:23829a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 10:32:38 +11:00
Nathan Scott c310ab6c07 [XFS] Fix signedness issues in dquot ID handling, allowing uids/gids above
MAXINT

SGI-PV: 942528
SGI-Modid: xfs-linux:xfs-kern:23828a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 10:31:41 +11:00
Nathan Scott 30dab21abb [XFS] Add a comment about the use of XFS_SIZE_TOKEN_WANT.
SGI-PV: 936331
SGI-Modid: xfs-linux:xfs-kern:23827a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 10:31:13 +11:00
Christoph Hellwig c86e711ceb [XFS] only mark buffers done when all pages are uptodate; in addition,
replace PBF_NONE with an inverted PBF_DONE, so it's like all the other
flags.

SGI-PV: 942609
SGI-Modid: xfs-linux:xfs-kern:199136a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 10:29:39 +11:00
Eric Sandeen d0cfb37305 [XFS] Stack footprint reduction for xfs_swapext (used from xfs_fsr)
SGI-PV: 913332
SGI-Modid: xfs-linux:xfs-kern:198926a

Signed-off-by: Eric Sandeen <sandeen@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 10:29:04 +11:00
Christoph Hellwig f538d4da8d [XFS] write barrier support. Issue all log sync operations as ordered
writes.  In addition flush the disk cache on fsync if the sync cached
operation didn't sync the log to disk (this requires some additional
bookkeeping in the transaction and log code). If the device doesn't claim to
support barriers, the filesystem has an external log volume or the trial
superblock write with barriers enabled failed, we disable barriers and
print a warning.  We should probably fail the mount completely, but that
could lead to nasty boot failures for the root filesystem.  Not enabled by
default yet, needs more destructive testing first.

SGI-PV: 912426
SGI-Modid: xfs-linux:xfs-kern:198723a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 10:26:59 +11:00
Christoph Hellwig 739cafd316 [XFS] fix PBF_NONE handling
SGI-PV: 908809
SGI-Modid: xfs-linux:xfs-kern:198669a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 10:25:51 +11:00
Christoph Hellwig da1650a5d6 [XFS] Add format checking to cmn_err and icmn_err
SGI-PV: 942243
SGI-Modid: xfs-linux:xfs-kern:198658a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 10:21:35 +11:00
Christoph Hellwig 88741a95af [XFS] remove unused pagebuf flags
SGI-PV: 908809
SGI-Modid: xfs-linux:xfs-kern:198656a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 10:21:14 +11:00
Christoph Hellwig 04d8b28416 [XFS] Make sure the threads and shaker in xfs_buf are de-initialized in
reverse startup order

SGI-PV: 942063
SGI-Modid: xfs-linux:xfs-kern:198651a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 10:15:05 +11:00
Steve French ef0eaa1336 Merge with /pub/scm/linux/kernel/git/torvalds/linux-2.6.git 2005-11-01 09:02:10 -08:00
Anton Altaparmakov 94b166a7cb Merge branch 'master' of /home/src/linux-2.6/ 2005-11-01 15:51:32 +00:00
Anton Altaparmakov 3aebf25bdc NTFS: Fix a stupid bug causing writes to non-initialized pages to segfault.
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-11-01 15:49:31 +00:00
Jens Axboe a362357b6c [BLOCK] Unify the separate read/write io stat fields into arrays
Instead of having ->read_sectors and ->write_sectors, combine the two
into ->sectors[2] and similar for the other fields. This saves a branch
in several places in the io path, since we don't have to care what the
actual io direction is. On my x86-64 box, that's 200 bytes less text in
just the core (not counting the various drivers).
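
As a rough illustration of the idea (not the actual patch), per-direction
counters can be indexed by the io direction instead of being selected with
a branch.  Only ->sectors[2] is named in the message above; the struct name,
the ios field, the enum and account_io() are hypothetical.

```c
#include <assert.h>

/* Hypothetical names: only ->sectors[2] comes from the commit message. */
enum { STAT_READ = 0, STAT_WRITE = 1 };

struct io_stats {
	unsigned long ios[2];     /* completed requests, indexed by direction */
	unsigned long sectors[2]; /* sectors transferred, indexed by direction */
};

/* Account one completed request; rw is 0 for a read and 1 for a write.
 * The direction is used as an array index, so no if/else is needed. */
static void account_io(struct io_stats *st, int rw, unsigned long nr_sectors)
{
	assert(rw == STAT_READ || rw == STAT_WRITE);
	st->ios[rw]++;
	st->sectors[rw] += nr_sectors;
}
```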

Signed-off-by: Jens Axboe <axboe@suse.de>
2005-11-01 09:26:16 +01:00
Dave Kleikamp 988a6490a7 JFS: set i_ctime & i_mtime on target directory when creating links
jfs has never been setting i_ctime or i_mtime when creating either hard
or symbolic links.  I'm surprised nobody had noticed until now.

Thanks to Chris Spiegel for reporting the problem.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-10-31 16:53:04 -06:00
Andrea Arcangeli 659603ef69 [PATCH] fix __writeback_single_inode WARN_ON
When the inode count is zero in inode writeback, the

	WARN_ON(!(inode->i_state & I_WILL_FREE));

is broken, and needs to test for either I_WILL_FREE|I_FREEING.
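
A minimal sketch of the corrected condition, assuming illustrative flag
values (the kernel's real bit assignments may differ) and a hypothetical
helper name.  It only shows the logic: with i_count == 0, the warning should
fire only if neither I_WILL_FREE nor I_FREEING is set.

```c
#include <assert.h>

/* Illustrative flag values; not taken from the kernel headers. */
#define I_FREEING   0x10u
#define I_WILL_FREE 0x40u

/* Hypothetical helper: returns nonzero when writeback of an inode with
 * i_count == 0 should trigger the warning, i.e. when neither I_WILL_FREE
 * nor I_FREEING is set in i_state. */
static int writeback_should_warn(unsigned int i_count, unsigned int i_state)
{
	if (i_count != 0)
		return 0;
	return !(i_state & (I_WILL_FREE | I_FREEING));
}
```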

When the inode is in I_FREEING state, it's already out of the visibility
of the vm so it can't be freed so it doesn't require the __iget and the
generic_delete_inode path can call the sync internally to the lowlevel
fs callback during the last iput. So the inode being in I_FREEING is
also a valid condition for calling the sync with i_count == 0.

The specific stack trace is this:

  0xc00000007b8fb6e0  0xc00000000010118c  .__writeback_single_inode +0x5c
  0xc00000007b8fb6e0  0xc0000000001014dc (lr) .sync_inode +0x3c
  0xc00000007b8fb790  0xc0000000001014dc  .sync_inode +0x3c
  0xc00000007b8fb820  0xc0000000001a5020  .ext2_sync_inode +0x64
  0xc00000007b8fb8f0  0xc0000000001a65b4  .ext2_truncate +0x3f8
  0xc00000007b8fba40  0xc0000000001a6940  .ext2_delete_inode +0xdc
  0xc00000007b8fbac0  0xc0000000000f7a5c  .generic_delete_inode +0x124
  0xc00000007b8fbb50  0xc0000000000f5fe0  .iput +0xb8
  0xc00000007b8fbbe0  0xc0000000000e9fd4  .sys_unlink +0x2a8
  0xc00000007b8fbd10  0xc00000000001048c  .ret_from_syscall_1 +0x0

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-31 14:22:04 -08:00
Steve French 53b2ec5518 Merge with /pub/scm/linux/kernel/git/torvalds/linux-2.6.git 2005-10-31 08:36:11 -08:00
Anton Altaparmakov 1f04c0a24b Merge branch 'master' of /usr/src/ntfs-2.6/ 2005-10-31 10:06:46 +00:00
Paul Mackerras 23fd07750a Merge ../linux-2.6 by hand 2005-10-31 13:37:12 +11:00
Pekka Enberg ad2c1604da [PATCH] fat: Remove duplicate directory scanning code
This patch removes duplicate directory scanning code from fs/fat/dir.c.  The
two functions that share identical code are fat_readdirx() and
fat_search_long().  This patch also renames fat_readdirx to __fat_readdir().

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:32 -08:00
OGAWA Hirofumi 9131dd4256 [PATCH] fat: remove the unneeded vfat_find() in vfat_rename()
Currently, vfat_rename() uses vfat_find() as a sanity check.  This removes
that sanity check, because its cost is too high.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:32 -08:00
OGAWA Hirofumi 451cbaa1c3 [PATCH] fat: cleanup and optimization of checksum
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:32 -08:00
Tim Schmielau 4e57b68178 [PATCH] fix missing includes
I recently picked up my older work to remove unnecessary #includes of
sched.h, starting from a patch by Dave Jones to not include sched.h
from module.h. This reduces the number of indirect includes of sched.h
by ~300. Another ~400 pointless direct includes can be removed after
this disentangling (patch to follow later).
However, quite a few indirect includes need to be fixed up for this.

In order to feed the patches through -mm with as little disturbance as
possible, I've split out the fixes I accumulated up to now (complete for
i386 and x86_64, more archs to follow later) and post them before the real
patch.  This way this large part of the patch is kept simple with only
adding #includes, and all hunks are independent of each other.  So if any
hunk rejects or gets in the way of other patches, just drop it.  My scripts
will pick it up again in the next round.

Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:32 -08:00
Andrew Morton a3e713b5fd [PATCH] __bread oops fix
If a filesystem passes an idiotic blocksize into bread(), __getblk_slow() will
warn and will return NULL.  We have a report (from Hubert Tonneau
<hubert.tonneau@fullpliant.org>) of isofs_fill_super() doing this (passing in
a silly block size) against an unplugged CDROM drive.

But a couple of __getblk_slow() callers forgot to check for the NULL bh, hence
oops.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:27 -08:00
Alexey Dobriyan b3099b48da [PATCH] fs/attr.c: remove BUG()
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:27 -08:00
Glauber de Oliveira Costa 2973dfdb87 [PATCH] Test for sb_getblk return value
This patch adds tests for the return value of sb_getblk() in the ext2/3
filesystems.  In fs/buffer.c it is stated that the getblk() function never
fails.  However, it can return NULL in some situations due to I/O
errors, which may lead us to NULL pointer dereferences.
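
The kind of check being added can be sketched as follows.  This is a
userspace illustration only: fake_getblk() is a hypothetical stand-in for
sb_getblk(), the struct is not the kernel's buffer_head, and returning -EIO
is an assumed error convention.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct buffer_head { char data[16]; }; /* stand-in, not the kernel struct */

/* Hypothetical stand-in for sb_getblk(): returns NULL to simulate an
 * I/O error, otherwise a pointer to a static buffer. */
static struct buffer_head *fake_getblk(int simulate_failure)
{
	static struct buffer_head bh;
	return simulate_failure ? NULL : &bh;
}

/* Caller that tests the return value instead of assuming success. */
static int init_block(int simulate_failure)
{
	struct buffer_head *bh = fake_getblk(simulate_failure);

	if (!bh)
		return -EIO; /* bail out rather than dereference NULL */
	bh->data[0] = 0;
	return 0;
}
```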

Signed-off-by: Glauber de Oliveira Costa <glommer@br.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:26 -08:00
Andrea Arcangeli 7f04c26d71 [PATCH] fix nr_unused accounting, and avoid recursing in iput with I_WILL_FREE set
list_move(&inode->i_list, &inode_in_use);
 		} else {
 			list_move(&inode->i_list, &inode_unused);
+			inodes_stat.nr_unused++;
 		}
 	}
 	wake_up_inode(inode);

Are you sure the above diff is correct? It was added somewhere between
2.6.5 and 2.6.8. I think it's wrong.

The only way I can imagine the i_count to be zero in the above path, is
that I_WILL_FREE is set.  And if I_WILL_FREE is set, then we must not
increase nr_unused.  So I believe the above change is buggy and it will
definitely overstate the number of unused inodes and it should be backed
out.

Note that __writeback_single_inode before calling __sync_single_inode, can
drop the spinlock and we can have both the dirty and locked bitflags clear
here:

		spin_unlock(&inode_lock);
		__wait_on_inode(inode);
		iput(inode);
XXXXXXX
		spin_lock(&inode_lock);
	}
	use inode again here

a construct like the above makes zero sense from a reference counting
standpoint.

Either we don't ever use the inode again after the iput, or the
inode_lock should be taken _before_ executing the iput (i.e. a __iput
would be required). Taking the inode_lock after iput means the iget was
useless if we keep using the inode after the iput.

So the only chance the 2.6 was safe to call __writeback_single_inode
with the i_count == 0, is that I_WILL_FREE is set (I_WILL_FREE will
prevent the VM to free the inode in XXXXX).

Potentially calling the above iput with I_WILL_FREE was also wrong
because it would recurse in iput_final (the second mainline bug).

The below (untested) patch fixes the nr_unused accounting, avoids recursing
in iput when I_WILL_FREE is set and makes sure (with the BUG_ON) that we
don't corrupt memory and that all holders that don't set I_WILL_FREE, keeps
a reference on the inode!

Signed-off-by: Andrea Arcangeli <andrea@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:26 -08:00
Ben Dooks 381be25458 [PATCH] ext3: sparse fixes
Fix warnings from sparse due to un-declared functions that should either
have a header file or have been declared static

 fs/ext2/bitmap.c:14:15: warning: symbol 'ext2_count_free' was not declared. Should it be static?
 fs/ext2/namei.c:92:15: warning: symbol 'ext2_get_parent' was not declared. Should it be static?
 fs/ext3/bitmap.c:15:15: warning: symbol 'ext3_count_free' was not declared. Should it be static?
 fs/ext3/namei.c:1013:15: warning: symbol 'ext3_get_parent' was not declared. Should it be static?
 fs/ext3/xattr.c:214:1: warning: symbol 'ext3_xattr_block_get' was not declared. Should it be static?
 fs/ext3/xattr.c:358:1: warning: symbol 'ext3_xattr_block_list' was not declared. Should it be static?
 fs/ext3/xattr.c:630:1: warning: symbol 'ext3_xattr_block_find' was not declared. Should it be static?
 fs/ext3/xattr.c:863:1: warning: symbol 'ext3_xattr_ibody_find' was not declared. Should it be static?

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:25 -08:00
Oleg Nesterov 1291cf4163 [PATCH] fix de_thread() vs do_coredump() deadlock
de_thread() sends SIGKILL to all sub-threads and waits for them to die in 'D'
state.  It is possible that one of the threads has already dequeued a coredump
signal.  When de_thread() unlocks ->sighand->lock that thread can enter
do_coredump()->coredump_wait() and cause a deadlock.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:25 -08:00
Miklos Szeredi 1779381dea [PATCH] fuse: spelling fixes
Correct some typos and inconsistent use of "initialise" vs "initialize" in
comments.  Reported by Ioannis Barkas.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:24 -08:00
Glauber de Oliveira Costa 5b11687924 [PATCH] Locking problems while EXT3FS_DEBUG on
I noticed some problems while running ext3 with the debug flag set on.
More precisely, I was unable to umount the filesystem.  Some investigation
took me to the patch that follows.

At first glance, the lock/unlock I've taken out seems unnecessary, as the
main code (outside debug) does not lock the super.  The only additional
dangerous operations that the debug code introduces seem to be related to
the bitmap, but bitmap operations tend to be atomic anyway.

I also took the opportunity to fix 2 spelling errors.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:23 -08:00
Oleg Nesterov 2384f55f8a [PATCH] coredump_wait() cleanup
This patch deletes pointless code from coredump_wait().

1. It does useless mm->core_waiters inc/dec under mm->mmap_sem,
   but any changes to ->core_waiters have no effect until we drop
   ->mmap_sem.

2. It calls yield() for absolutely unknown reason.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:23 -08:00
Kirill Korotaev e954365971 [PATCH] proc: fix of error path in proc_get_inode()
This patch fixes an incorrect error path in proc_get_inode(), taken when the
module reference can't be obtained because the module is being unloaded.  When
try_module_get() fails, this function puts de(!) and still returns an inode
with a de it holds no reference on.

There are still unresolved known bugs in proc yet to be fixed:
- proc_dir_entry tree is managed without any serialization
- create_proc_entry() doesn't set up de->owner at all,
   so setting it manually later is not atomic.
- it looks like almost all modules do not care whether
   their de->owner is set...

Signed-Off-By: Denis Lunev <den@sw.ru>
Signed-Off-By: Kirill Korotaev <dev@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:21 -08:00
Miklos Szeredi f12ec44070 [PATCH] fuse: clean up dead code related to nfs exporting
Remove last remains of NFS exportability support.

The code is actually buggy (as reported by Akshat Aranya), since 'alias'
will be leaked if it's non-null and alias->d_flags has DCACHE_DISCONNECTED.

This is not an active bug, since there will never be any disconnected
dentries.  But it's better to get rid of the unnecessary complexity anyway.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:21 -08:00
Oleg Nesterov 932aeafbe8 [PATCH] fix de_thread vs it_real_fn() deadlock
de_thread() calls del_timer_sync(->real_timer) under ->sighand->siglock.
This can deadlock: it_real_fn sends a signal and needs this lock too.

Also, delete unneeded ->real_timer.data assignment.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:19 -08:00
Eric Dumazet 2f51201662 [PATCH] reduce sizeof(struct file)
Now that RCU applied on 'struct file' seems stable, we can place f_rcuhead
in a memory location that is no longer used at call_rcu(&f->f_rcuhead,
file_free_rcu) time, to reduce the size of this critical kernel object.

The trick I used is to move f_rcuhead and f_list into a union called f_u.

The callers are changed so that f_rcuhead becomes f_u.fu_rcuhead and f_list
becomes f_u.f_list
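
A simplified sketch of the layout change, using stand-in types and a single
extra field rather than the real struct file.  The member names inside the
union use an fu_ prefix by analogy with fu_rcuhead; the struct names
file_before and file_after are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types with the same shape as the kernel's list and RCU heads. */
struct list_head { struct list_head *next, *prev; };
struct rcu_head  { struct rcu_head *next; void (*func)(struct rcu_head *); };

/* Before: both members occupy their own storage. */
struct file_before {
	struct list_head f_list;
	struct rcu_head  f_rcuhead;
	int              f_flags;
};

/* After: the list linkage is dead by the time the RCU head is used,
 * so the two can share storage in a union. */
struct file_after {
	union {
		struct list_head fu_list;
		struct rcu_head  fu_rcuhead;
	} f_u;
	int f_flags;
};
```

The saving is the size of the smaller of the two overlapped members, which
is only safe because the two are never live at the same time.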

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:19 -08:00
Miklos Szeredi 42e50a5a69 [PATCH] open: cleanup in lookup_flags()
lookup_flags() is only called from the non-create case, so it needn't check
for O_CREAT|O_EXCL.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:18 -08:00
Eric W. Biederman a928972864 [PATCH] Don't uselessly export task_struct to userspace in core dumps
task_struct is an internal structure to the kernel with a lot of good
information that is probably interesting in core dumps.  However, there is
no way for user space to know what format that information is in, making it
useless.

I grepped the GDB 6.3 source code and NT_TASKSTRUCT while defined is not
used anywhere else.  So I would be surprised if anyone notices it is
missing.

In addition exporting kernel pointers to all the interesting kernel data
structures sounds like the very definition of an information leak.  I
haven't a clue what someone with evil intentions could do with that
information, but in any attack against the kernel it looks like this is the
perfect tool for aiming that attack.

So since NT_TASKSTRUCT is useless as currently defined and is potentially
dangerous, let's just not export it.

(akpm: Daniel Jacobowitz <dan@debian.org> "would be amazed" if anything was
using NT_TASKSTRUCT).

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:18 -08:00
Christoph Hellwig 9c0cbd54ce [PATCH] TIOC* compat ioctl handling
TIOCSTART and TIOCSTOP are defined in asm/ioctls.h and asm/termios.h by
various architectures but not actually implemented anywhere but in the IRIX
compatibility layer, so remove their COMPATIBLE_IOCTL from parisc, ppc64
and sparc64.

Move the TIOCSLTC COMPATIBLE_IOCTL to common code, guarded by an ifdef to
only show up on architectures that support it (same as the code handling it
in tty_ioctl.c), as well as its brother TIOCGLTC that wasn't handled so
far.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:17 -08:00
Oleg Nesterov 9e4e23bccb [PATCH] little de_thread() cleanup
Trivial, saves one 'if' branch in de_thread().

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:17 -08:00
James Lamanna 833d304b22 [PATCH] reiserfs: [kv]free() checking cleanup
Signed-off-by: James Lamanna <jlamanna@gmail.com>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:17 -08:00
Jan Kara aaa4059bc2 [PATCH] ext3: Fix unmapped buffers in transaction's lists
Fix the problem (BUG 4964) with unmapped buffers in transaction's
t_sync_data list.  The problem is we need to call filesystem's own
invalidatepage() from block_write_full_page().

block_write_full_page() must call filesystem's invalidatepage().  Otherwise
the following nasty race can happen:

   proc 1                                        proc 2
   ------                                        ------
- write some new data to 'offset'
  => bh gets to the transactions data list
                                              - starts truncate
                                                => i_size set to new size
- mpage_writepages()
  - ext3_ordered_writepage() to 'offset'
    - block_write_full_page()
      - page->index > end_index+1
        - block_invalidatepage()
          - discard_buffer()
            - clear_buffer_mapped()

- commit triggers and finds unmapped buffer - BOOM!

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:17 -08:00
James Morris d381d8a9a0 [PATCH] SELinux: canonicalize getxattr()
This patch allows SELinux to canonicalize the value returned from
getxattr() via the security_inode_getsecurity() hook, which is called after
the fs level getxattr() function.

The purpose of this is to allow the in-core security context for an inode
to override the on-disk value.  This could happen in cases such as
upgrading a system to a different labeling form (e.g.  standard SELinux to
MLS) without needing to do a full relabel of the filesystem.

In such cases, we want getxattr() to return the canonical security context
that the kernel is using rather than what is stored on disk.

The implementation hooks into the inode_getsecurity(), adding another
parameter to indicate the result of the preceding fs-level getxattr() call,
so that SELinux knows whether to compare a value obtained from disk with
the kernel value.

We also now allow getxattr() to work for mountpoint labeled filesystems
(i.e.  mount with option context=foo_t), as we are able to return the
kernel value to the user.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:11 -08:00
Brian Gerst 0d078f6f96 [PATCH] CONFIG_IA32
Add CONFIG_X86_32 for i386.  This allows selecting options that only apply
to 32-bit systems.

(X86 && !X86_64) becomes X86_32
(X86 ||  X86_64) becomes X86

Signed-off-by: Brian Gerst <bgerst@didntduck.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:10 -08:00
Trond Myklebust d3f8cf4899 [PATCH] NFS: Remove unbalanced spin_unlock() calls from nfs_refresh_inode()
Doh!

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 14:46:47 -08:00
Anton Altaparmakov 07b188ab77 Merge branch 'master' of /usr/src/ntfs-2.6/ 2005-10-30 21:00:04 +00:00
Adam Litke 2e9b367c22 [PATCH] hugetlb: overcommit accounting check
Basic overcommit checking for hugetlb_file_map() based on an implementation
used with demand faulting in SLES9.

Since demand faulting can't guarantee the availability of pages at mmap
time, this patch implements a basic sanity check to ensure that the number
of huge pages required to satisfy the mmap are currently available.
Despite the obvious race, I think it is a good start on doing proper
accounting.  I'd like to work towards an accounting system that mimics the
semantics of normal pages (especially for the MAP_PRIVATE/COW case).  That
work is underway and builds on what this patch starts.
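
The sanity check described above might look roughly like this.  It is a
sketch under assumptions: a 2 MiB huge page size, a hypothetical function
name, and -ENOMEM as the failure code.  Like the patch, it is only a
point-in-time check and does not reserve anything.

```c
#include <assert.h>
#include <errno.h>

#define HPAGE_SHIFT 21                    /* assumed: 2 MiB huge pages */
#define HPAGE_SIZE  (1UL << HPAGE_SHIFT)

/* Hypothetical check: does the pool currently hold enough free huge
 * pages to back a mapping of len bytes?  Racy by design, since nothing
 * is reserved between this check and the later faults. */
static int huge_pages_available(unsigned long len, unsigned long free_huge_pages)
{
	/* Round up to whole huge pages without overflowing len. */
	unsigned long needed = (len >> HPAGE_SHIFT) +
			       ((len & (HPAGE_SIZE - 1)) != 0);

	return needed <= free_huge_pages ? 0 : -ENOMEM;
}
```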

Huge page shared memory segments are simpler and still maintain their
commit on shmget semantics.

Signed-off-by: Adam Litke <agl@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:43 -07:00
Adam Litke 4c88726597 [PATCH] hugetlb: demand fault handler
Below is a patch to implement demand faulting for huge pages.  The main
motivation for changing from prefaulting to demand faulting is so that huge
page memory areas can be allocated according to NUMA policy.

Thanks to consolidated hugetlb code, switching the behavior requires changing
only one fault handler.  The bulk of the patch just moves the logic from
hugetlb_prefault() to hugetlb_pte_fault() and find_get_huge_page().

Signed-off-by: Adam Litke <agl@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:43 -07:00
Christoph Hellwig 0b1533f67c [PATCH] cleanup hugetlbfs_forget_inode
Reformat hugetlbfs_forget_inode and add the missing but harmless
write_inode_now call.  It looks the same as generic_forget_inode now except
for the call to truncate_hugepages instead of truncate_inode_pages.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:43 -07:00
Christoph Hellwig 6b09b9df05 [PATCH] kill hugetlbfs_do_delete_inode
hugetlbfs_do_delete_inode is the same as generic_delete_inode now, so remove
it in favour of the latter.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:43 -07:00
Christoph Hellwig 149f4211af [PATCH] hugetlbfs: clean up hugetlbfs_delete_inode
Make hugetlbfs look the same as generic_delete_inode, fixing a bunch of
missing updates to it at the same time.  Rename it to
hugetlbfs_do_delete_inode and add a real hugetlbfs_delete_inode that
implements ->delete_inode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:43 -07:00
Christoph Hellwig 96527980d4 [PATCH] hugetlbfs: move free_inodes accounting
Move hugetlbfs accounting into ->alloc_inode / ->destroy_inode.  This keeps
the code simpler, fixes a leak where a failing inode allocation wouldn't
decrement the counter and moves hugetlbfs_delete_inode and
hugetlbfs_forget_inode closer to their generic counterparts.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:43 -07:00
Hugh Dickins 4c21e2f244 [PATCH] mm: split page table lock
Christoph Lameter demonstrated very poor scalability on the SGI 512-way, with
a many-threaded application which concurrently initializes different parts of
a large anonymous area.

This patch corrects that, by using a separate spinlock per page table page, to
guard the page table entries in that page, instead of using the mm's single
page_table_lock.  (But even then, page_table_lock is still used to guard page
table allocation, and anon_vma allocation.)

In this implementation, the spinlock is tucked inside the struct page of the
page table page: with a BUILD_BUG_ON in case it overflows - which it would in
the case of 32-bit PA-RISC with spinlock debugging enabled.

Splitting the lock is not quite for free: another cacheline access.  Ideally,
I suppose we would use split ptlock only for multi-threaded processes on
multi-cpu machines; but deciding that dynamically would have its own costs.
So for now enable it by config, at some number of cpus - since the Kconfig
language doesn't support inequalities, let preprocessor compare that with
NR_CPUS.  But I don't think it's worth being user-configurable: for good
testing of both split and unsplit configs, split now at 4 cpus, and perhaps
change that to 8 later.
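
The compile-time selection described above can be sketched like this.  It is
not the kernel code: the lock type is a plain stand-in, the NR_CPUS value is
an assumed build setting, and the names SPLIT_PTLOCK_CPUS and pte_lockptr()
are used here purely for illustration.

```c
#include <assert.h>

#define NR_CPUS           8 /* assumed build-time CPU count */
#define SPLIT_PTLOCK_CPUS 4 /* threshold from the text: "split now at 4 cpus" */

struct spinlock  { int locked; };               /* stand-in, not a real lock */
struct page      { struct spinlock ptl; };      /* lock tucked into the page */
struct mm_struct { struct spinlock page_table_lock; };

/* Pick the lock guarding the ptes in one page table page.  The
 * preprocessor does the NR_CPUS comparison, so each build contains
 * only one of the two variants. */
static struct spinlock *pte_lockptr(struct mm_struct *mm, struct page *pt_page)
{
#if NR_CPUS >= SPLIT_PTLOCK_CPUS
	(void)mm;
	return &pt_page->ptl;
#else
	(void)pt_page;
	return &mm->page_table_lock;
#endif
}
```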

There is a benefit even for singly threaded processes: kswapd can be attacking
one part of the mm while another part is busy faulting.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:42 -07:00
Hugh Dickins deceb6cd17 [PATCH] mm: follow_page with inner ptlock
Final step in pushing down common core's page_table_lock.  follow_page no
longer wants caller to hold page_table_lock, uses pte_offset_map_lock itself;
and so no page_table_lock is taken in get_user_pages itself.

But get_user_pages (and get_futex_key) do then need follow_page to pin the
page for them: take Daniel's suggestion of bitflags to follow_page.

Need one for WRITE, another for TOUCH (it was the accessed flag before:
vanished along with check_user_page_readable, but surely get_numa_maps is
wrong to mark every page it finds as accessed), another for GET.

And another, ANON to dispose of untouched_anonymous_page: it seems silly for
that to descend a second time, let follow_page observe if there was no page
table and return ZERO_PAGE if so.  Fix minor bug in that: check VM_LOCKED -
make_pages_present ought to make readonly anonymous present.
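
To illustrate the bitflag idea only: the flag names and values below are
assumptions chosen to match the WRITE/TOUCH/GET/ANON roles listed above, and
fake_page and apply_follow_flags() are hypothetical stand-ins rather than
the real follow_page() interface.

```c
#include <assert.h>

/* Assumed names and values for the four behaviours described above. */
#define FOLL_WRITE 0x01u /* caller needs a writable pte */
#define FOLL_TOUCH 0x02u /* mark the page accessed */
#define FOLL_GET   0x04u /* take a reference on the page */
#define FOLL_ANON  0x08u /* give back the zero page if no page table exists */

struct fake_page { int refcount; int accessed; };

/* Apply the side effects requested by flags to an already-located page.
 * Callers combine behaviours with bitwise OR instead of passing several
 * boolean arguments. */
static struct fake_page *apply_follow_flags(struct fake_page *page,
					    unsigned int flags)
{
	if (flags & FOLL_GET)
		page->refcount++;
	if (flags & FOLL_TOUCH)
		page->accessed = 1;
	return page;
}
```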

Give get_numa_maps a cond_resched while we're there.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:41 -07:00
Hugh Dickins 508034a32b [PATCH] mm: unmap_vmas with inner ptlock
Remove the page_table_lock from around the calls to unmap_vmas, and replace
the pte_offset_map in zap_pte_range by pte_offset_map_lock: all callers are
now safe to descend without page_table_lock.

Don't attempt fancy locking for hugepages, just take page_table_lock in
unmap_hugepage_range.  Which makes zap_hugepage_range, and the hugetlb test in
zap_page_range, redundant: unmap_vmas calls unmap_hugepage_range anyway.  Nor
does unmap_vmas have much use for its mm arg now.

The tlb_start_vma and tlb_end_vma in unmap_page_range are now called without
page_table_lock: if they're implemented at all, they typically come down to
flush_cache_range (usually done outside page_table_lock) and flush_tlb_range
(which we already audited for the mprotect case).

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:41 -07:00
Hugh Dickins 705e87c0c3 [PATCH] mm: pte_offset_map_lock loops
Convert those common loops using page_table_lock on the outside and
pte_offset_map within to use just pte_offset_map_lock within instead.

These all hold mmap_sem (some exclusively, some not), so at no level can a
page table be whipped away from beneath them.  But whereas pte_alloc loops
tested with the "atomic" pmd_present, these loops are testing with pmd_none,
which on i386 PAE tests both lower and upper halves.

That's now unsafe, so add a cast into pmd_none to test only the vital lower
half: we lose a little sensitivity to a corrupt middle directory, but not
enough to worry about.  It appears that i386 and UML were the only
architectures vulnerable in this way, and pgd and pud no problem.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:40 -07:00
Hugh Dickins c74df32c72 [PATCH] mm: ptd_alloc take ptlock
Second step in pushing down the page_table_lock.  Remove the temporary
bridging hack from __pud_alloc, __pmd_alloc, __pte_alloc: expect callers not
to hold page_table_lock, whether it's on init_mm or a user mm; take
page_table_lock internally to check if a racing task already allocated.
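
The "take the lock internally and check for a racing allocation" step can be sketched in userspace C as follows. This is only an illustration of the pattern, not the kernel code: struct dir_entry and table_alloc are hypothetical names, and a pthread mutex stands in for page_table_lock.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for one directory slot that points at a table. */
struct dir_entry {
    pthread_mutex_t lock;   /* plays the role of page_table_lock */
    void *table;            /* NULL until some task populates it */
};

/*
 * The caller does not hold the lock.  Allocate outside the lock, then
 * take it only to check whether a racing task already populated the slot.
 */
static int table_alloc(struct dir_entry *e)
{
    void *fresh = calloc(1, 4096);

    if (!fresh)
        return -1;

    pthread_mutex_lock(&e->lock);
    if (e->table)
        free(fresh);        /* lost the race: keep the existing table */
    else
        e->table = fresh;   /* won the race: install ours */
    pthread_mutex_unlock(&e->lock);
    return 0;
}
```

Calling table_alloc twice on the same slot leaves the first table in place, which is the property the callers rely on.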

Convert their callers from common code.  But avoid coming back to change them
again later: instead of moving the spin_lock(&mm->page_table_lock) down,
switch over to new macros pte_alloc_map_lock and pte_unmap_unlock, which
encapsulate the mapping+locking and unlocking+unmapping together, and in the
end may use alternatives to the mm page_table_lock itself.

These callers all hold mmap_sem (some exclusively, some not), so at no level
can a page table be whipped away from beneath them; and pte_alloc uses the
"atomic" pmd_present to test whether it needs to allocate.  It appears that on
all arches we can safely descend without page_table_lock.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:40 -07:00
Hugh Dickins 365e9c87a9 [PATCH] mm: update_hiwaters just in time
update_mem_hiwater has attracted various criticisms, in particular from those
concerned with mm scalability.  Originally it was called whenever rss or
total_vm got raised.  Then many of those callsites were replaced by a timer
tick call from account_system_time.  Now Frank van Maarseveen reports that to
be found inadequate.  How about this?  Works for Frank.

Replace update_mem_hiwater, a poor combination of two unrelated ops, by macros
update_hiwater_rss and update_hiwater_vm.  Don't attempt to keep
mm->hiwater_rss up to date at timer tick, nor every time we raise rss (usually
by 1): those are hot paths.  Do the opposite, update only when about to lower
rss (usually by many), or just before final accounting in do_exit.  Handle
mm->hiwater_vm in the same way, though it's much less of an issue.  Demand
that whoever collects these hiwater statistics do the work of taking the
maximum with rss or total_vm.
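
The convention above can be sketched in plain C roughly like this. It is a userspace illustration under stated assumptions, not the kernel macros: struct mm_stats, raise_rss, lower_rss and peak_rss are hypothetical names, and only update_hiwater_rss is named in this message.

```c
/* Hypothetical stand-in for the counters kept in the mm. */
struct mm_stats {
    unsigned long rss;          /* current resident pages */
    unsigned long hiwater_rss;  /* recorded peak; may lag behind rss */
};

/* Raising rss is the hot path, so it never touches the high-water mark. */
static void raise_rss(struct mm_stats *mm, unsigned long pages)
{
    mm->rss += pages;
}

/* Record the peak just in time, before rss is about to drop. */
static void update_hiwater_rss(struct mm_stats *mm)
{
    if (mm->hiwater_rss < mm->rss)
        mm->hiwater_rss = mm->rss;
}

static void lower_rss(struct mm_stats *mm, unsigned long pages)
{
    update_hiwater_rss(mm);
    mm->rss -= pages;
}

/* The collector takes the maximum, because hiwater_rss may be stale. */
static unsigned long peak_rss(const struct mm_stats *mm)
{
    return mm->hiwater_rss > mm->rss ? mm->hiwater_rss : mm->rss;
}
```

The cost moves from every increment to the comparatively rare decrement, and readers pay one extra comparison.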

And there has been no collector of these hiwater statistics in the tree.  The
new convention needs an example, so match Frank's usage by adding a VmPeak
line above VmSize to /proc/<pid>/status, and also a VmHWM line above VmRSS
(High-Water-Mark or High-Water-Memory).
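
A collector in userspace could read the new lines along these lines. This is a minimal sketch: status_kb is a hypothetical helper, and it assumes the usual "Name:  value kB" layout of /proc/self/status.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return the numeric value of one "Key:  <number> kB" line from
 * /proc/self/status, or -1 if the file or the key is not available.
 */
static long status_kb(const char *key)
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    long value = -1;
    size_t keylen = strlen(key);

    if (!f)
        return -1;

    while (fgets(line, sizeof(line), f)) {
        /* Match "Key:" exactly so that "VmPeak" does not match a longer name. */
        if (strncmp(line, key, keylen) == 0 && line[keylen] == ':') {
            if (sscanf(line + keylen + 1, "%ld", &value) != 1)
                value = -1;
            break;
        }
    }
    fclose(f);
    return value;
}
```

For example, status_kb("VmPeak") and status_kb("VmHWM") give the two high-water values, to be compared with status_kb("VmSize") and status_kb("VmRSS").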

There was a particular anomaly during mremap move, that hiwater_vm might be
captured too high.  A fleeting such anomaly remains, but it's quickly
corrected now, whereas before it would stick.

What locking?  None: if the app is racy then these statistics will be racy,
it's not worth any overhead to make them exact.  But whenever it suits,
hiwater_vm is updated under exclusive mmap_sem, and hiwater_rss under
page_table_lock (for now) or with preemption disabled (later on): without
going to any trouble, minimize the time between reading current values and
updating, to minimize those occasions when a racing thread bumps a count up
and back down in between.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:39 -07:00
Nick Piggin b5810039a5 [PATCH] core remove PageReserved
Remove PageReserved() calls from core code by tightening VM_RESERVED
handling in mm/ to cover PageReserved functionality.

PageReserved special casing is removed from get_page and put_page.

All setting and clearing of PageReserved is retained, and it is now flagged
in the page_alloc checks to help ensure we don't introduce any refcount
based freeing of Reserved pages.

MAP_PRIVATE, PROT_WRITE of VM_RESERVED regions is tentatively being
deprecated.  We never handled it completely correctly anyway, and it can be
reintroduced in future if required (Hugh has a proof of concept).

Once PageReserved() calls are removed from kernel/power/swsusp.c, and all
arch/ and driver code, the Set and Clear calls, and the PG_reserved bit can
be trivially removed.

Last real user of PageReserved is swsusp, which uses PageReserved to
determine whether a struct page points to valid memory or not.  This still
needs to be addressed (a generic page_is_ram() should work).

A last caveat: the ZERO_PAGE is now refcounted and managed with rmap (and
thus mapcounted and count towards shared rss).  These writes to the struct
page could cause excessive cacheline bouncing on big systems.  There are a
number of ways this could be addressed if it is an issue.

Signed-off-by: Nick Piggin <npiggin@suse.de>

Refcount bug fix for filemap_xip.c

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:39 -07:00
Hugh Dickins 4294621f41 [PATCH] mm: rss = file_rss + anon_rss
I was lazy when we added anon_rss, and chose to change as few places as
possible.  So currently each anonymous page has to be counted twice, in rss
and in anon_rss.  Which won't be so good if those are atomic counts in some
configurations.

Change that around: keep file_rss and anon_rss separately, and add them
together (with get_mm_rss macro) when the total is needed - reading two
atomics is much cheaper than updating two atomics.  And update anon_rss
upfront, typically in memory.c, not tucked away in page_add_anon_rmap.
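
The trade-off can be sketched with C11 atomics. This is an illustration only: struct mm_counters and the helper names are hypothetical, and get_mm_rss is written as a function here although the message describes a macro.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the two per-mm counters. */
struct mm_counters {
    atomic_long file_rss;
    atomic_long anon_rss;
};

static void mm_counters_init(struct mm_counters *mm)
{
    atomic_init(&mm->file_rss, 0);
    atomic_init(&mm->anon_rss, 0);
}

/* Each page is counted exactly once: one atomic update per fault. */
static void add_anon_page(struct mm_counters *mm)
{
    atomic_fetch_add(&mm->anon_rss, 1);
}

static void add_file_page(struct mm_counters *mm)
{
    atomic_fetch_add(&mm->file_rss, 1);
}

/* The total is derived on demand: two atomic reads, no extra update. */
static long get_mm_rss(struct mm_counters *mm)
{
    return atomic_load(&mm->file_rss) + atomic_load(&mm->anon_rss);
}
```

The sum is not a single atomic snapshot of both counters, which is acceptable for a statistic like rss.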

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:38 -07:00
Hugh Dickins 404351e67a [PATCH] mm: mm_init set_mm_counters
How is anon_rss initialized?  In dup_mmap, and by mm_alloc's memset; but
that's not so good if an mm_counter_t is a special type.  And how is rss
initialized?  By set_mm_counter, all over the place.  Come on, we just need to
initialize them both at once by set_mm_counter in mm_init (which follows the
memcpy when forking).

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:38 -07:00
Andi Kleen dfcd3c0dc4 [PATCH] Convert mempolicies to nodemask_t
The NUMA policy code predated nodemask_t so it used open coded bitmaps.
Convert everything to nodemask_t.  Big patch, but shouldn't have any actual
behaviour changes (except I removed one unnecessary check against
node_online_map and one unnecessary BUG_ON)

Signed-off-by: "Andi Kleen" <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:35 -07:00
Pete Zaitcev c36fc889b5 [PATCH] usb: Patch for USBDEVFS_IOCTL from 32-bit programs
Dell supplied me with the following test:

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/usbdevice_fs.h>

int main(int argc, char *argv[])
{
   struct usbdevfs_hub_portinfo hubPortInfo = {0};
   struct usbdevfs_ioctl command = {0};
   int fd, ret;

   if (argc != 2) {
     fprintf(stderr, "Usage: %s /proc/bus/usb/<BusNo>/<HubID>\n", argv[0]);
     fprintf(stderr, "Example: %s /proc/bus/usb/001/001\n", argv[0]);
     exit(1);
   }

   /* Wrap the hub port query in a USBDEVFS_IOCTL request for interface 0 */
   command.ifno = 0;
   command.ioctl_code = USBDEVFS_HUB_PORTINFO;
   command.data = (void *)&hubPortInfo;

   fd = open(argv[1], O_RDWR);
   if (fd < 0) {
     perror("open failed");
     exit(2);
   }

   /* The nested data pointer is what a 32-bit caller on x86_64 exercises */
   ret = ioctl(fd, USBDEVFS_IOCTL, &command);
   printf("IOCTL return status:%d\n", ret);
   if (ret < 0) {
     perror("IOCTL failed");
     close(fd);
     exit(3);
   }
   printf("IOCTL passed:Num of ports %d\n", hubPortInfo.nports);
   close(fd);
   return 0;
}

I have verified that it breaks if built in 32 bit mode on x86_64 and that
the patch below fixes it.

Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-10-28 16:47:46 -07:00
Peter Osterlund 6cd37cda7e [PATCH] Fix ext3 warning for unused var
Fix compile warning in ext3 quota code.

Signed-off-by: Peter Osterlund <petero2@telia.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-28 13:57:57 -07:00
Linus Torvalds 84860bf064 Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6 2005-10-28 13:09:47 -07:00
Linus Torvalds 8caf89157d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6 2005-10-28 12:44:24 -07:00
Dave Kleikamp 7038f1cbac JFS: make sure right-most xtree pages have header.next set to zero
The xtTruncate code was only doing this for leaf pages.  When a file is
horribly fragmented, we may truncate a file leaving an internal page with
an invalid head.next field, which may cause a stale page to be referenced.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-10-28 13:27:40 -05:00
Greg KH 6fbfddcb52 Merge ../bleed-2.6 2005-10-28 10:13:16 -07:00
Greg Kroah-Hartman 53f4654272 [PATCH] Driver Core: fix up all callers of class_device_create()
The previous patch, which added the ability to nest struct class_device,
changed the parameters of class_device_create().  This patch fixes up all
in-kernel callers of the function.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-10-28 09:52:52 -07:00
Kay Sievers a7fd67062e [PATCH] add sysfs attr to re-emit device hotplug event
A "coldplug + udevstart" can be simple like this:
  for i in /sys/block/*/*/uevent; do echo 1 > $i; done
  for i in /sys/class/*/*/uevent; do echo 1 > $i; done
  for i in /sys/bus/*/devices/*/uevent; do echo 1 > $i; done

Signed-off-by: Kay Sievers <kay.sievers@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-10-28 09:52:51 -07:00
Linus Torvalds 0ee40c6628 Merge branch 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block 2005-10-28 08:53:00 -07:00
Al Viro c4cdd03831 [PATCH] gfp_t: reiserfs mapping_set_gfp_mask() use
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-28 08:16:51 -07:00
Al Viro 27496a8c67 [PATCH] gfp_t: fs/*
 - ->releasepage() annotated (s/int/gfp_t), instances updated
 - missing gfp_t in fs/* added
 - fixed misannotation from the original sweep caught by bitwise checks:
   XFS used __nocast both for gfp_t and for flags used by XFS allocator.
   The latter left with unsigned int __nocast; we might want to add a
   different type for those but for now let's leave them alone.  That,
   BTW, is a case when __nocast use had been actively confusing - it had
   been used in the same code for two different and similar types, with
   no way to catch misuses.  Switch of gfp_t to bitwise had caught that
   immediately...

One tricky bit is left alone to be dealt with later - mapping->flags is
a mix of gfp_t and error indications.  Left alone for now.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-28 08:16:47 -07:00
Al Viro af4ca457ea [PATCH] gfp_t: infrastructure
Beginning of gfp_t annotations:

 - -Wbitwise added to CHECKFLAGS
 - old __bitwise renamed to __bitwise__
 - __bitwise defined to either __bitwise__ or nothing, depending on
   __CHECK_ENDIAN__ being defined
 - gfp_t switched from __nocast to __bitwise__
 - force cast to gfp_t added to __GFP_... constants
 - new helper - gfp_zone(); extracts zone bits out of gfp_t value and casts
   the result to int
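
A stripped-down sketch of how such annotations fit together is shown below. It is not the header from this patch: the flag values are illustrative, and the macros are reduced to the sparse-or-nothing choice (sparse defines __CHECKER__; an ordinary compiler sees plain integers).

```c
/* Under sparse the attributes are live; otherwise they expand to nothing. */
#ifdef __CHECKER__
#define __bitwise__ __attribute__((bitwise))
#define __force     __attribute__((force))
#else
#define __bitwise__
#define __force
#endif

/* A bitwise type: sparse rejects mixing it with plain integers. */
typedef unsigned int __bitwise__ gfp_t;

/* Illustrative values; here the low two bits select the zone. */
#define __GFP_DMA     ((__force gfp_t)0x01u)
#define __GFP_HIGHMEM ((__force gfp_t)0x02u)
#define __GFP_WAIT    ((__force gfp_t)0x10u)
#define GFP_ZONEMASK  0x03

/* Extract the zone bits; the forced cast is the one sanctioned escape. */
static inline int gfp_zone(gfp_t gfp)
{
    return GFP_ZONEMASK & (__force int)gfp;
}
```

With this in place, passing a bare int where a gfp_t is expected is flagged by sparse, while gfp_zone(__GFP_HIGHMEM | __GFP_WAIT) still yields a plain zone index.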

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-28 08:16:46 -07:00
Chen, Kenneth W 20e5c81fcf [patch] remove gendisk->stamp_idle field
struct gendisk has two fields, stamp and stamp_idle.  Every update to
stamp_idle is made in sync with stamp, so the two always hold the same value.
Tracking the same timestamp in two fields adds nothing, so remove stamp_idle.

Also, only update gendisk stats with a non-zero value.  That way we don't
needlessly calculate a memory address and then add zero to its content.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
2005-10-28 08:15:30 +02:00
Paul Mackerras 4542437679 Merge in v2.6.14 by hand 2005-10-28 13:38:53 +10:00
Trond Myklebust bec273b491 NFS: Allow files that are open for write to invalidate caches
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:45 -04:00
Trond Myklebust 16c32b71bc NFSv4: Convert unnecessary XDR warning messages into dprintk()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:45 -04:00
Trond Myklebust 4f9838c7ec NFSv4: Add post-op attributes to NFSv4 write and commit callbacks.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:44 -04:00
Trond Myklebust 16e429596d NFSv4: Add post-op attributes to nfs4_proc_remove()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:44 -04:00
Trond Myklebust 6caf2c8276 NFSv4: Add post-op attributes to nfs4_proc_rename()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:43 -04:00
Trond Myklebust 91ba2eeec5 NFSv4: Add post-op attributes to nfs4_proc_link()
Optimise attribute revalidation when hardlinking. Add post-op attributes
 for the directory and the original inode.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:42 -04:00
Trond Myklebust cf80955614 NFS: Ensure that nfs_link() instantiates the dentry correctly
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:42 -04:00
Trond Myklebust 516a6af641 NFS: Add optional post-op getattr instruction to the NFSv4 file close.
"Optional" means that the close call will not fail if the getattr
 at the end of the compound fails.
 If it does succeed, try to refresh inode attributes.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:41 -04:00
Trond Myklebust 3338c143b4 NFS: Optimise attribute revalidation on close().
Only force a getattr in nfs_file_flush() if the attribute
 cache is stale.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:41 -04:00
Trond Myklebust 56ae19f38f NFSv4: Add directory post-op attributes to the CREATE operations.
Since the directory attributes change every time we CREATE a file,
 we might as well pick up the new directory attributes in the same
 compound.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:40 -04:00
Chuck Lever 0c70b50150 NFS: nfs_lookup doesn't need to revalidate the parent directory's inode
nfs_lookup() used to consult a lookup cache before trying an actual wire
 lookup operation.  The lookup cache would be invalid, of course, if the
 parent directory's mtime had changed, so nfs_lookup performed an inode
 revalidation on the parent.

 Since nfs_lookup() doesn't use a cache anymore, the revalidation is no
 longer necessary.  There are cases where it will generate a lot of
 unnecessary GETATTR traffic.

 See http://bugzilla.linux-nfs.org/show_bug.cgi?id=9

 Test-plan:
 Use lndir and "rm -rf" and watch for excess GETATTR traffic or application
 level errors.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:40 -04:00
Trond Myklebust decf491f30 NFS: Don't let nfs_end_data_update() clobber attribute update information
Since we almost always call nfs_end_data_update() after we called
 nfs_refresh_inode(), we now end up marking the inode metadata
 as needing revalidation immediately after having updated it.

 This patch rearranges things so that we mark the inode as needing
 revalidation _before_ we call nfs_refresh_inode() on those operations
 that need it.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:39 -04:00
Trond Myklebust 33801147a8 NFS: Optimise inode attribute cache updates
Allow nfs_refresh_inode() also to update attributes on the inode if the
 RPC call was sent after the last call to nfs_update_inode().

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:39 -04:00
Trond Myklebust 913a70fc17 NFS: Convert cache_change_attribute into a jiffy-based value
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:38 -04:00
Trond Myklebust 0e574af1be NFS: Cleanup initialisation of struct nfs_fattr
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:38 -04:00
Trond Myklebust 4c2cb58c55 Merge /home/trondmy/scm/kernel/git/torvalds/linux-2.6 2005-10-27 19:12:49 -04:00
Trond Myklebust 34123da66e NFS: Fix a bad cast in nfs3_read_done
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 19:10:09 -04:00
Steve French 0753ca7bc2 [CIFS] Change pragma pack(1) to attribute(packed) to allow cifs on arm to access
unaligned structures coming in off the wire

gcc on arm processors generates very odd code with pragma pack specified -
although it does pack the structures in some sense - it does not allow you
to access unaligned elements in nested structures at the right offset as other
architectures do.  Oddly enough though, specifying the structures as packed
the long way - one by one with the packed attribute does work.  Rather than
fighting over whether this is a gcc bug or some obscure side effect
of pragma pack, it is easier to do what most (all but 96 other places in
the kernel) do - and replace pragma pack with dozens of attribute(packed)
structure qualifiers.  Much more verbose ... but at least it works.
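
The per-structure form looks roughly like this. The structures are hypothetical examples, not the CIFS wire definitions, and the sketch assumes a compiler that supports the GCC packed attribute.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inner header: no padding between members. */
struct wire_hdr {
    uint8_t  command;
    uint32_t status;    /* sits at offset 1, unaligned */
    uint16_t flags;
} __attribute__((packed));

/* Hypothetical outer message: the nested struct is qualified separately. */
struct wire_msg {
    uint8_t         word_count;
    struct wire_hdr hdr;        /* starts at offset 1 */
    uint16_t        byte_count; /* starts at offset 8 */
} __attribute__((packed));
```

Each structure carries its own qualifier, so the layout does not depend on a pragma that is in effect somewhere earlier in the header.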

Signed-off-by: David Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2005-10-27 13:55:12 -07:00
Steve French 04290949b3 Merge with /pub/scm/linux/kernel/git/torvalds/linux-2.6.git 2005-10-27 12:53:03 -07:00
Peter Wainwright 94c1d31845 [PATCH] Fix HFS+ to free up the space when a file is deleted.
fsck_hfs reveals lots of temporary files accumulating in the hidden
directory "\000\000\000HFS+ Private Data".  According to the HFS+
documentation these are files which are unlinked while in use.  However,
there may be a bug in the Linux hfsplus implementation which causes this to
happen even when the files are not in use.  It looks like the "opencnt"
field is never initialized as (I think) it should be in hfsplus_read_inode.
 This means that a file can appear to be still in use when in fact it has
been closed.  This patch seems to fix it for me.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-26 10:39:43 -07:00