Commit Graph

11118 Commits

Author SHA1 Message Date
David Howells b6dff3ec5e CRED: Separate task security context from task_struct
Separate the task security context from task_struct.  At this point, the
security data is temporarily embedded in the task_struct with two pointers
pointing to it.

Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in
entry.S via asm-offsets.

With comment fixes Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com>

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:16 +11:00
David Howells 1cdcbec1a3 CRED: Neuter sys_capset()
Take away the ability for sys_capset() to affect processes other than current.

This means that current will not need to lock its own credentials when reading
them against interference by other processes.

This has effectively been the case for a while anyway, since:

 (1) Without LSM enabled, sys_capset() is disallowed.

 (2) With file-based capabilities, sys_capset() is neutered.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:14 +11:00
David Howells da9592edeb CRED: Wrap task credential accesses in the filesystem subsystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:05 +11:00
David Howells 82ab8deda7 CRED: Wrap task credential accesses in the XFS filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: xfs@oss.sgi.com
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:04 +11:00
David Howells a5f773a659 CRED: Wrap task credential accesses in the UFS filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:04 +11:00
David Howells 7706bb39ad CRED: Wrap task credential accesses in the UDF filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:03 +11:00
David Howells 26bf1946e6 CRED: Wrap task credential accesses in the UBIFS filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Artem Bityutskiy <dedekind@infradead.org>
Cc: Adrian Hunter <ext-adrian.hunter@nokia.com>
Cc: linux-mtd@lists.infradead.org
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:03 +11:00
David Howells fc7333deb7 CRED: Wrap task credential accesses in the SYSV filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:02 +11:00
David Howells e2950b178e CRED: Wrap task credential accesses in the SMBFS filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:01 +11:00
David Howells 414cb209ea CRED: Wrap task credential accesses in the ReiserFS filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: reiserfs-devel@vger.kernel.org
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:01 +11:00
David Howells 0785f4dad0 CRED: Wrap task credential accesses in the RAMFS filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:39:00 +11:00
David Howells c222d53eb3 CRED: Wrap task credential accesses in the OMFS filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Bob Copeland <me@bobcopeland.com>
Cc: linux-karma-devel@lists.sourceforge.net
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:59 +11:00
David Howells b19c2a3b83 CRED: Wrap task credential accesses in the OCFS2 filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <joel.becker@oracle.com>
Cc: ocfs2-devel@oss.oracle.com
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:59 +11:00
David Howells 5cc0a84076 CRED: Wrap task credential accesses in the NFS daemon
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: linux-nfs@vger.kernel.org
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:58 +11:00
David Howells 48937024c6 CRED: Wrap task credential accesses in the NCPFS filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Petr Vandrovec <vandrove@vc.cvut.cz>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:58 +11:00
David Howells 922c030f26 CRED: Wrap task credential accesses in the Minix filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:57 +11:00
David Howells 8f659adfa2 CRED: Wrap task credential accesses in the JFS filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: jfs-discussion@lists.sourceforge.net
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:56 +11:00
David Howells 77c70de15a CRED: Wrap task credential accesses in the hugetlbfs filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: William Irwin <wli@holomorphy.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:56 +11:00
David Howells de395b8ac2 CRED: Wrap task credential accesses in the HPFS filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:55 +11:00
David Howells 4ac8489a72 CRED: Wrap task credential accesses in the HFSplus filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:54 +11:00
David Howells 94c9a5ee4c CRED: Wrap task credential accesses in the HFS filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:54 +11:00
David Howells 3de7be3355 CRED: Wrap task credential accesses in the GFS2 filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: cluster-devel@redhat.com
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:53 +11:00
David Howells 2186a71cbc CRED: Wrap task credential accesses in the FUSE filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: fuse-devel@lists.sourceforge.net
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:53 +11:00
David Howells f0ce7ee3a8 CRED: Wrap task credential accesses in the FAT filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:52 +11:00
David Howells 4c9c544e49 CRED: Wrap task credential accesses in the Ext4 filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: adilger@sun.com
Cc: linux-ext4@vger.kernel.org
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:51 +11:00
David Howells 6a2f90e9fa CRED: Wrap task credential accesses in the Ext3 filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: adilger@sun.com
Cc: linux-ext4@vger.kernel.org
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:51 +11:00
David Howells a8dd4d67bd CRED: Wrap task credential accesses in the Ext2 filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: linux-ext4@vger.kernel.org
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:50 +11:00
David Howells 4eea03539d CRED: Wrap task credential accesses in the eCryptFS filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Mike Halcrow <mhalcrow@us.ibm.com>
Cc: Phillip Hellewell <phillip@hellewell.homeip.net>
Cc: ecryptfs-devel@lists.sourceforge.net
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:49 +11:00
David Howells ec4c2aacd1 CRED: Wrap task credential accesses in the devpts filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:49 +11:00
David Howells 97b7702cd1 CRED: Wrap task credential accesses in the Coda filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: codalist@coda.cs.cmu.edu
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:48 +11:00
David Howells a001e5b558 CRED: Wrap task credential accesses in the CIFS filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Steve French <sfrench@samba.org>
Cc: linux-cifs-client@lists.samba.org
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:47 +11:00
David Howells 1109b07b7d CRED: Wrap task credential accesses in the BFS filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Tigran A. Aivazian <tigran@aivazian.fsnet.co.uk>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:47 +11:00
David Howells 0eb790e3a2 CRED: Wrap task credential accesses in the autofs4 filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Ian Kent <raven@themaw.net>
Cc: autofs@linux.kernel.org
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:46 +11:00
David Howells 73c646e4af CRED: Wrap task credential accesses in the autofs filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: autofs@linux.kernel.org
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:45 +11:00
David Howells 215599815d CRED: Wrap task credential accesses in the AFFS filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:45 +11:00
David Howells f8b9d53a31 CRED: Wrap task credential accesses in 9P2000 filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Reviewed-by: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Cc: v9fs-developer@lists.sourceforge.net
Signed-off-by: James Morris <jmorris@namei.org>
2008-11-14 10:38:44 +11:00
Steve French fb39601664 [CIFS] remove unused list, add new cifs sock list to prepare for mount/umount fix
Also adds two lines missing from the previous patch (for the need reconnect flag in the
/proc/fs/cifs/DebugData handling)

The new global_cifs_sock_list is added, and initialized in init_cifs but not used yet.
Jeff Layton will be adding code in to use that and to remove the GlobalTcon and GlobalSMBSession
lists.

CC: Jeff Layton <jlayton@redhat.com>
CC: Shirish Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-11-13 20:04:07 +00:00
Linus Torvalds 7b42365396 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm:
  dlm: fix shutdown cleanup
2008-11-13 11:56:05 -08:00
Steve French 3b79521093 [CIFS] Fix cifs reconnection flags
In preparation for Jeff's big umount/mount fixes to remove the possibility of
various races in cifs mount and linked list handling of sessions, sockets and
tree connections, this patch cleans up some repetitive code in cifs_mount,
and addresses a problem with ses->status and tcon->tidStatus in which we
were overloading the "need_reconnect" state with other status in that
field.  So the "need_reconnect" flag has been broken out from those
two state fields (need reconnect was not mutually exclusive from some of the
other possible tid and ses states).  In addition, a few exit cases in
cifs_mount were cleaned up, and a problem with a tcon flag (for lease support)
was not being set consistently for the 2nd mount of the same share

CC: Jeff Layton <jlayton@redhat.com>
CC: Shirish Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-11-13 19:45:32 +00:00
David Teigland 278afcbf4f dlm: fix shutdown cleanup
Fixes a regression from commit 0f8e0d9a31,
"dlm: allow multiple lockspace creates".

An extraneous 'else' slipped into a code fragment being moved from
release_lockspace() to dlm_release_lockspace().  The result of the
unwanted 'else' is that dlm threads and structures are not stopped
and cleaned up when the final dlm lockspace is removed.  Trying to
create a new lockspace again afterward will fail with
"kmem_cache_create: duplicate cache dlm_conn" because the cache
was not previously destroyed.

Signed-off-by: David Teigland <teigland@redhat.com>
2008-11-13 13:22:34 -06:00
Theodore Tso 6cdfcc275e ext3: Clean up outdated and incorrect comment for ext3_write_super()
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-12 17:17:17 -08:00
Eric W. Biederman afef80b3d8 vfs: fix shrink_submounts
In the last refactoring of shrink_submounts a variable was not completely
renamed.  So finish the renaming of mnt to m now.

Without this if you attempt to mount an nfs mount that has both automatic
nfs sub mounts on it, and has normal mounts on it.  The unmount will
succeed when it should not.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-12 17:17:17 -08:00
David S. Miller 7e452baf6b Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/message/fusion/mptlan.c
	drivers/net/sfc/ethtool.c
	net/mac80211/debugfs_sta.c
2008-11-11 15:43:02 -08:00
Linus Torvalds 04ca2c17e3 Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
  [XFS] XFS: Check for valid transaction headers in recovery
  [XFS] handle memory allocation failures during log initialisation
  [XFS] Account for allocated blocks when expanding directories
  [XFS] Wait for all I/O on truncate to zero file size
  [XFS] Fix use-after-free with log and quotas
2008-11-11 09:32:58 -08:00
Tiger Yang 6c1e183e12 ocfs2: Check search result in ocfs2_xattr_block_get()
ocfs2_xattr_block_get() calls ocfs2_xattr_search() to find an external
xattr, but doesn't check the search result that is passed back via struct
ocfs2_xattr_search. Add a check for search result, and pass back -ENODATA if
the xattr search failed. This avoids a later NULL pointer error.

Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-11-10 09:51:47 -08:00
Mark Fasheh de29c08528 ocfs2: fix printk related build warnings in xattr.c
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-11-10 09:51:47 -08:00
Dmitri Monakhov c435400140 ocfs2: truncate outstanding block after direct io failure
Signed-off-by: Dmitri Monakhov <dmonakhov@openvz.org>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: Joel Becker <Joel.Becker@oracle.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-11-10 09:51:47 -08:00
Tao Ma 80bcaf3469 ocfs2/xattr: Proper hash collision handle in bucket division
In ocfs2/xattr, we must make sure the xattrs which have the same hash value
exist in the same bucket so that the search schema can work. But in the old
implementation, when we want to extend a bucket, we just move half number of
xattrs to the new bucket. This works in most cases, but if we are lucky
enough we will move 2 xattrs into 2 different buckets. This means that an
xattr from the previous bucket cannot be found anymore. This patch fix this
problem by finding the right position during extending the bucket and extend
an empty bucket if needed.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Cc: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-11-10 09:51:47 -08:00
Tao Ma 4c1bbf1ba6 ocfs2: return 0 in page_mkwrite to let VFS retry.
In ocfs2_page_mkwrite, we return -EINVAL when we found the page mapping
isn't updated, and it will cause the user space program get SIGBUS and
exit. The reason is that during race writeable mmap, we will do
unmap_mapping_range in ocfs2_data_downconvert_worker. The good thing is
that if we reuturn 0 in page_mkwrite, VFS will retry fault and then
call page_mkwrite again, so it is safe to return 0 here.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-11-10 09:51:47 -08:00
Sunil Mushran ae0dff6830 ocfs2: Set journal descriptor to NULL after journal shutdown
Patch sets journal descriptor to NULL after the journal is shutdown.
This ensures that jbd2_journal_release_jbd_inode(), which removes the
jbd2 inode from txn lists, can be called safely from ocfs2_clear_inode()
even after the journal has been shutdown.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-11-10 09:51:47 -08:00
Tao Ma d32647993c ocfs2: Fix check of return value of ocfs2_start_trans() in xattr.c.
On failure, ocfs2_start_trans() returns values like ERR_PTR(-ENOMEM),
so we should check whether handle is NULL. Fix them to use IS_ERR().
Jan has made the patch for other part in ocfs2(thank Jan for it), so
this is just the fix for fs/ocfs2/xattr.c.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-11-10 09:51:47 -08:00
Jan Kara b99835c168 ocfs2: Let inode be really deleted when ocfs2_mknod_locked() fails
We forgot to set i_nlink to 0 when returning due to error from ocfs2_mknod_locked()
and thus inode was not properly released via ocfs2_delete_inode() (e.g. claimed
space was not released). Fix it.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-11-10 09:51:46 -08:00
Jan Kara 87cfa00432 ocfs2: Fix checking of return value of new_inode()
new_inode() does not return ERR_PTR() but NULL in case of failure. Correct
checking of the return value.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-11-10 09:51:46 -08:00
Jan Kara fa38e92cb3 ocfs2: Fix check of return value of ocfs2_start_trans()
On failure, ocfs2_start_trans() returns values like ERR_PTR(-ENOMEM).
Thus checks for !handle are wrong. Fix them to use IS_ERR().

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-11-10 09:51:46 -08:00
Tao Ma 8573f79d30 ocfs2: Fix some typos in xattr annotations.
Fix some typos in the xattr annotations.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Reported-by: Coly Li <coyli@suse.de>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-11-10 09:51:46 -08:00
Tao Ma 63fd775737 ocfs2: Remove unused ocfs2_restore_xattr_block().
Since now ocfs2 supports empty xattr buckets, we will never remove
the xattr index tree even if all the xattrs are removed, so this
function will never be called. So remove it.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-11-10 09:51:46 -08:00
Joel Becker 54f443f4e7 ocfs2: Don't repeat ocfs2_xattr_block_find()
ocfs2_xattr_block_get() looks up the xattr in a startlingly familiar
way; it's identical to the function ocfs2_xattr_block_find().  Let's just
use the later in the former.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-11-10 09:51:46 -08:00
Joel Becker eb6ff2397d ocfs2: Specify appropriate journal access for new xattr buckets.
There are a couple places that get an xattr bucket that may be reading
an existing one or may be allocating a new one.  They should specify the
correct journal access mode depending.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-11-10 09:51:46 -08:00
Joel Becker bd60bd37ad ocfs2: Check errors from ocfs2_xattr_update_xattr_search()
The ocfs2_xattr_update_xattr_search() function can return an error when
trying to read blocks off of disk.  The caller needs to check this error
before using those (possibly invalid) blocks.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-11-10 09:51:45 -08:00
Joel Becker b37c4d84e9 ocfs2: Don't return -EFAULT from a corrupt xattr entry.
If the xattr disk structures are corrupt, return -EIO, not -EFAULT.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-11-10 09:51:45 -08:00
Joel Becker f6087fb799 ocfs2: Check xattr block signatures properly.
The xattr.c code is currently memcmp()ing naking buffer pointers.
Create the OCFS2_IS_VALID_XATTR_BLOCK() macro to match its peers and use
that.

In addition, failed signature checks were returning -EFAULT, which is
completely wrong.  Return -EIO.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-11-10 09:51:44 -08:00
Tiger Yang c988fd045f ocfs2: add handler_map array bounds checking
Make the handler_map array as large as the possible value range to avoid
a fencepost error.

[ Utilize alternate method -- Joel ]

Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-11-10 09:51:44 -08:00
Tiger Yang ceb1eba3dc ocfs2: remove duplicate definition in xattr
Include/linux/xattr.h already has the definition about xattr prefix,
so remove the duplicate definitions in xattr.c.

Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-11-10 09:51:44 -08:00
Tiger Yang 0030e00150 ocfs2: fix function declaration and definition in xattr
Because we merged the xattr sources into one file, some functions
no longer belong in the header file.

Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-11-10 09:51:44 -08:00
Tiger Yang c3cb682735 ocfs2: fix license in xattr
This patch fixes the license in xattr.c and xattr.h.

Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-11-10 09:51:43 -08:00
David Chinner 220ca310a5 [XFS] XFS: Check for valid transaction headers in recovery
When we are about to add a new item to a transaction in recovery, we need
to check that it is valid first. Currently we just assert that header
magic number matches, but in production systems that is not present and we
add a corrupted transaction to the list to be processed. This results in a
kernel oops later when processing the corrupted transaction.

Instead, if we detect a corrupted transaction, abort recovery and leave
the user to clean up the mess that has occurred.

SGI-PV: 988145

SGI-Modid: xfs-linux-melb:xfs-kern:32356a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-11-10 18:01:50 +11:00
Dave Chinner 8f330f5149 [XFS] handle memory allocation failures during log initialisation
When there is no memory left in the system, xfs_buf_get_noaddr()
can fail. If this happens at mount time during xlog_alloc_log()
we fail to catch the error and oops.

Catch the error from xfs_buf_get_noaddr(), and allow other memory
allocations to fail and catch those errors too. Report the error
to the console and fail the mount with ENOMEM.

Tested by manually injecting errors into xfs_buf_get_noaddr() and
xlog_alloc_log().

Version 2:
o remove unnecessary casts of the returned pointer from kmem_zalloc()

SGI-PV: 987246

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-11-10 17:57:06 +11:00
David Chinner 6f9f51adb6 [XFS] Account for allocated blocks when expanding directories
When we create a directory, we reserve a number of blocks for the maximum
possible expansion of of the directory due to various btree splits,
freespace allocation, etc. Unfortunately, each allocation is not reflected
in the total number of blocks still available to the transaction, so the
maximal reservation is used over and over again.

This leads to problems where an allocation group has only enough blocks
for *some* of the allocations required for the directory modification.
After the first N allocations, the remaining blocks in the allocation
group drops below the total reservation, and subsequent allocations fail
because the allocator will not allow the allocation to proceed if the AG
does not have the enough blocks available for the entire allocation total.

This results in an ENOSPC occurring after an allocation has already
occurred. This results in aborting the directory operation (leaving the
directory in an inconsistent state) and cancelling a dirty transaction,
which results in a filesystem shutdown.

Avoid the problem by reflecting the number of blocks allocated in any
directory expansion in the total number of blocks available to the
modification in progress. This prevents a directory modification from
being aborted part way through with an ENOSPC.

SGI-PV: 988144

SGI-Modid: xfs-linux-melb:xfs-kern:32340a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-11-10 17:51:14 +11:00
Lachlan McIlroy 2cf7f0da3a [XFS] Wait for all I/O on truncate to zero file size
It's possible to have outstanding xfs_ioend_t's queued when the file size
is zero. This can happen in the direct I/O path when a direct I/O write
fails due to ENOSPC. In this case the xfs_ioend_t will still be queued (ie
xfs_end_io_direct() does not know that the I/O failed so can't force the
xfs_ioend_t to be flushed synchronously).

When we truncate a file on unlink we don't know to wait for these
xfs_ioend_ts and we can have a use-after-free situation if the inode is
reclaimed before the xfs_ioend_t is finally processed.

As was suggested by Dave Chinner lets wait for all I/Os to complete when
truncating the file size to zero.

SGI-PV: 981668

SGI-Modid: xfs-linux-melb:xfs-kern:32216a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-11-10 17:51:00 +11:00
Lachlan McIlroy 9ccbece546 [XFS] Fix use-after-free with log and quotas
Destroying the quota stuff on unmount can access the log - ie
XFS_QM_DONE() ends up in xfs_dqunlock() which calls
xfs_trans_unlocked_item() and then xfs_log_move_tail(). By this time the
log has already been destroyed. Just move the cleanup of the quota code
earlier in xfs_unmountfs() before the call to xfs_log_unmount(). Moving
XFS_QM_DONE() up near XFS_QM_DQPURGEALL() seems like a good spot.

SGI-PV: 987086

SGI-Modid: xfs-linux-melb:xfs-kern:32148a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Peter Leckie <pleckie@sgi.com>
2008-11-10 17:43:23 +11:00
Dave Chinner 6307091fe6 [XFS] Avoid using inodes that haven't been completely initialised
The radix tree walks in xfs_sync_inodes_ag and xfs_qm_dqrele_all_inodes()
can find inodes that are still undergoing initialisation. Avoid
them by checking for the the XFS_INEW() flag once we have a reference
on the inode. This flag is cleared once the inode is properly initialised.

SGI-PV: 987246

Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-11-10 17:13:23 +11:00
Dave Chinner cb4f0d1d42 [XFS] fix uninitialised variable bug in dquot release
gcc on ARM warns about an using an uninitialised variable
in xfs_qm_dqrele_all_inodes(). This is a real bug, but gcc
on x86_64 is not reporting this warning so it went unnoticed.

Fix the bug by bring the inode radix tree walk code up to
date with xfs_sync_inodes_ag().

SGI-PV: 987246

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-11-10 17:11:18 +11:00
Stephen Rothwell d44dab8d1c fs: xfs needs inode_wait to be exported
Since wait_on_inode() references it.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-11-10 17:06:05 +11:00
Dave Chinner 644c3567d1 [XFS] handle memory allocation failures during log initialisation
When there is no memory left in the system, xfs_buf_get_noaddr()
can fail. If this happens at mount time during xlog_alloc_log()
we fail to catch the error and oops.

Catch the error from xfs_buf_get_noaddr(), and allow other memory
allocations to fail and catch those errors too. Report the error
to the console and fail the mount with ENOMEM.

Tested by manually injecting errors into xfs_buf_get_noaddr() and
xlog_alloc_log().

Version 2:
o remove unnecessary casts of the returned pointer from kmem_zalloc()

SGI-PV: 987246

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-11-10 16:50:24 +11:00
Linus Torvalds 8b805ef617 Merge branch 'for-2.6.28' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.28' of git://linux-nfs.org/~bfields/linux:
  Fix nfsd truncation of readdir results
2008-11-09 12:25:44 -08:00
Doug Nazar b726e923ea Fix nfsd truncation of readdir results
Commit 8d7c4203 "nfsd: fix failure to set eof in readdir in some
situations" introduced a bug: on a directory in an exported ext3
filesystem with dir_index unset, a READDIR will only return about 250
entries, even if the directory was larger.

Bisected it back to this commit; reverting it fixes the problem.

It turns out that in this case ext3 reads a block at a time, then
returns from readdir, which means we can end up with buf.full==0 but
with more entries in the directory still to be read.  Before 8d7c4203
(but after c002a6c797 "Optimise NFS readdir hack slightly"), this would
cause us to return the READDIR result immediately, but with the eof bit
unset.  That could cause a performance regression (because the client
would need more roundtrips to the server to read the whole directory),
but no loss in correctness, since the cleared eof bit caused the client
to send another readdir.  After 8d7c4203, the setting of the eof bit
made this a correctness problem.

So, move nfserr_eof into the loop and remove the buf.full check so that
we loop until buf.used==0.  The following seems to do the right thing
and reduces the network traffic since we don't return a READDIR result
until the buffer is full.

Tested on an empty directory & large directory; eof is properly sent and
there are no more short buffers.

Signed-off-by: Doug Nazar <nazard@dragoninc.ca>
Cc: David Woodhouse <David.Woodhouse@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-11-09 15:15:50 -05:00
Linus Torvalds 1538a093f7 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: add checksum calculation when clearing UNINIT flag in ext4_new_inode
  ext4: Mark the buffer_heads as dirty and uptodate after prepare_write
  ext4: calculate journal credits correctly
  ext4: wait on all pending commits in ext4_sync_fs()
  ext4: Convert to host order before using the values.
  ext4: fix missing ext4_unlock_group in error path
  jbd2: deregister proc on failure in jbd2_journal_init_inode
  jbd2: don't give up looking for space so easily in __jbd2_log_wait_for_space
  jbd: don't give up looking for space so easily in __log_wait_for_space
2008-11-07 08:15:18 -08:00
Frederic Bohe 23712a9c28 ext4: add checksum calculation when clearing UNINIT flag in ext4_new_inode
When initializing an uninitialized block group in ext4_new_inode(),
its block group checksum must be re-calculated.  This fixes a race
when several threads try to allocate a new inode in an UNINIT'd group.

There is some question whether we need to be initializing the block
bitmap in ext4_new_inode() at all, but for now, if we are going to
init the block group, let's eliminate the race.

Signed-off-by: Frederic Bohe <frederic.bohe@bull.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-11-07 09:21:01 -05:00
Aneesh Kumar K.V ed9b3e3379 ext4: Mark the buffer_heads as dirty and uptodate after prepare_write
We need to make sure we mark the buffer_heads as dirty and uptodate
so that block_write_full_page write them correctly.

This fixes mmap corruptions that can occur in low memory situations.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-11-07 09:06:45 -05:00
Adrian Hunter 7e2d9bfa4e UBIFS: allow for gaps when dirtying the LPT
The LPT may have gaps in it because initially empty LEBs
are not added by mkfs.ubifs - because it does not know how
many there are.  Then UBIFS allocates empty LEBs in the
reverse order that they are discovered i.e. they are
added to, and removed from, the front of a list.  That
creates a gap in the middle of the LPT.

The function dirtying the LPT tree (for the purpose of
small model garbage collection) assumed that a gap could
only occur at the very end of the LPT and stopped dirtying
prematurely, which in turn resulted in the LPT running
out of space - something that is designed to be impossible.

Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
2008-11-07 12:11:52 +02:00
David S. Miller 9eeda9abd1 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/wireless/ath5k/base.c
	net/8021q/vlan_core.c
2008-11-06 22:43:03 -08:00
Niv Sardi dcd7b4e5c0 Merge branch 'master' of git://oss.sgi.com:8090/xfs/linux-2.6 2008-11-07 15:07:12 +11:00
Linus Torvalds e252f4db18 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  Block: use round_jiffies_up()
  Add round_jiffies_up and related routines
  block: fix __blkdev_get() for removable devices
  generic-ipi: fix the smp_mb() placement
  blk: move blk_delete_timer call in end_that_request_last
  block: add timer on blkdev_dequeue_request() not elv_next_request()
  bio: define __BIOVEC_PHYS_MERGEABLE
  block: remove unused ll_new_mergeable()
2008-11-06 15:53:47 -08:00
Linus Torvalds c361948712 Merge git://git.infradead.org/mtd-2.6
* git://git.infradead.org/mtd-2.6:
  [JFFS2] fix race condition in jffs2_lzo_compress()
  [MTD] [NOR] Fix cfi_send_gen_cmd handling of x16 devices in x8 mode (v4)
  [JFFS2] Fix lack of locking in thread_should_wake()
  [JFFS2] Fix build failure with !CONFIG_JFFS2_FS_WRITEBUFFER
  [MTD] [NAND] OMAP2: remove duplicated #include
2008-11-06 15:43:13 -08:00
OGAWA Hirofumi c3302931db fat: i_blocks warning fix
blkcnt_t type depends on CONFIG_LSF. Use unsigned long long always for
printk().  But lazy to type it, so add "llu" and use it.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:22 -08:00
OGAWA Hirofumi 9ca59f4c3d fat: ->i_pos race fix
i_pos is 64bits value, hence it's not atomic to update.

Important place is fat_write_inode() only, other places without lock
are just for printk().

This adds lock for "BITS_PER_LONG == 32" kernel.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:21 -08:00
OGAWA Hirofumi 2bdf67eb16 fat: mmu_private race fix
mmu_private is 64bits value, hence it's not atomic to update.

So, the access rule for mmu_private is we must hold ->i_mutex.  But,
fat_get_block() path doesn't follow the rule on non-allocation path.

This fixes by using i_size instead if non-allocation path.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:21 -08:00
OGAWA Hirofumi 0e75f5da06 fat: Add printf attribute to fat_fs_panic()
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:21 -08:00
OGAWA Hirofumi fa93ca18a8 fat: Fix _fat_bmap() race
fat_get_cluster() assumes the requested blocknr isn't truncated during
read. _fat_bmap() doesn't follow this rule.

This protects it by ->i_mutex.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:21 -08:00
OGAWA Hirofumi dfc209c006 fat: Fix ATTR_RO for directory
FAT has the ATTR_RO (read-only) attribute. But on Windows, the ATTR_RO
of the directory will be just ignored actually, and is used by only
applications as flag. E.g. it's setted for the customized folder by
Explorer.

http://msdn2.microsoft.com/en-us/library/aa969337.aspx

This adds "rodir" option. If user specified it, ATTR_RO is used as
read-only flag even if it's the directory. Otherwise, inode->i_mode
is not used to hold ATTR_RO (i.e. fat_mode_can_save_ro() returns 0).

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:21 -08:00
OGAWA Hirofumi 9183482f5d fat: Fix ATTR_RO in the case of (~umask & S_WUGO) == 0
If inode->i_mode doesn't have S_WUGO, current code assumes it means
ATTR_RO.  However, if (~[ufd]mask & S_WUGO) == 0, inode->i_mode can't
hold S_WUGO. Therefore the updated directory entry will always have
ATTR_RO.

This adds fat_mode_can_hold_ro() to check it. And if inode->i_mode
can't hold, uses -i_attrs to hold ATTR_RO instead.

With this, we don't set ATTR_RO unless users change it via ioctl() if
(~[ufd]mask & S_WUGO) == 0.

And on FAT_IOCTL_GET_ATTRIBUTES path, this adds ->i_mutex to it for
not returning the partially updated attributes by FAT_IOCTL_SET_ATTRIBUTES
to userland.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:21 -08:00
OGAWA Hirofumi 9c0aa1b87b fat: Cleanup FAT attribute stuff
This adds three helpers:

fat_make_attrs() - makes FAT attributes from inode.
fat_make_mode()  - makes mode_t from FAT attributes.
fat_save_attrs() - saves FAT attributes to inode.

Then this replaces: MSDOS_MKMODE() by fat_make_mode(), fat_attr() by
fat_make_attrs(), ->i_attrs = attr & ATTR_UNUSED by fat_save_attrs().
And for root inode, those is used with ATTR_DIR instead of bogus
ATTR_NONE.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:21 -08:00
OGAWA Hirofumi 45cfbe3547 fat: Cleanup msdos_lookup()
Use same style with vfat_lookup().

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:21 -08:00
OGAWA Hirofumi 1c13a243a4 fat: Kill d_invalidate() in vfat_lookup()
d_invalidate() for positive dentry doesn't work in some cases
(vfsmount, nfsd, and maybe others). shrink_dcache_parent() by
d_invalidate() is pointless for vfat usage at all.

So, this kills it, and intead of it uses d_move().

To save old behavior, this returns alias simply for directory (don't
change pwd, etc..). the directory lookup shouldn't be important for
performance.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:21 -08:00
OGAWA Hirofumi 1b52467243 fat: Fix/Cleanup dcache handling for vfat
- Add comments for handling dcache of vfat.

- Separate case-sensitive case and case-insensitive to
  vfat_revalidate() and vfat_ci_revalidate().

  vfat_revalidate() doesn't need to drop case-insensitive negative
  dentry on creation path.

- Current code is missing to set ->d_revalidate to the negative dentry
  created by unlink/etc..

  This sets ->d_revalidate always, and returns 1 for positive
  dentry. Now, we don't need to change ->d_op dynamically anymore,
  so this just uses sb->s_root->d_op to set ->d_op.

- d_find_alias() may return DCACHE_DISCONNECTED dentry. It's not
  the interesting dentry there. This checks it.

- Add missing LOOKUP_PARENT check. We don't need to drop the valid
  negative dentry for (LOOKUP_CREATE | LOOKUP_PARENT) lookup.

- For consistent filename on creation path, this drops negative dentry
  if we can't see intent.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:21 -08:00
OGAWA Hirofumi 068f5ae05c vfat: Fix vfat_find() error path in vfat_lookup()
Current vfat_lookup() creates negetive dentry blindly if vfat_find()
returned a error. It's wrong. If the error isn't -ENOENT, just return
error.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:21 -08:00
OGAWA Hirofumi a993b542bb fat: use fat_detach() in fat_clear_inode()
Use fat_detach() instead of opencoding it.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:20 -08:00
OGAWA Hirofumi 5e35dd4651 fat: Fix fat_ent_update_ptr() for FAT12
This fixes the missing update for bhs/nr_bhs in case the caller
accessed from block boundary to first block of boundary.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:20 -08:00
OGAWA Hirofumi d3dfa8228f fat: improve fat_hash()
fat_hash() is using the algorithm known as bad. Instead of it, this
uses hash_32(). The following is the summary of test.

old hash:
	hash func (1000 times): 33489 cycles
	total inodes in hash table: 70926
	largest bucket contains: 696
	smallest bucket contains: 54

new hash:
	hash func (1000 times): 33129 cycles
	total inodes in hash table: 70926
	largest bucket contains: 315
	smallest bucket contains: 236

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:20 -08:00
Darren Jenkins 52e9d9f4b3 fat: cleanup fat_parse_long() error handling
Coverity CID 2332 & 2333 RESOURCE_LEAK

In fat_search_long() if fat_parse_long() returns a -ve value we return
without first freeing unicode.  This patch free's them on this error path.

The above was false positive on current tree, but this change is more
clean, so apply as cleanup.

[hirofumi@mail.parknet.co.jp: fix coding style]
Signed-off-by: Darren Jenkins <darrenrjenkins@gmail.com>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:20 -08:00
OGAWA Hirofumi 53472bc8f8 fat: use generic_file_llseek() for directory
Since fat_dir_ioctl() was already fixed (i.e. called under ->i_mutex),
and __fat_readdir() doesn't take BKL anymore. So, BKL for ->llseek()
is pointless, and we have to use generic_file_llseek().

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:20 -08:00
OGAWA Hirofumi 7decd1cb03 fat: Fix and cleanup timestamp conversion
This cleans date_dos2unix()/fat_date_unix2dos() up. New code should be
much more readable.

And this fixes those old functions. Those doesn't handle 2100
correctly. 2100 isn't leap year, but old one handles it as leap year.
Also, with this, centi sec is handled and is fixed.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:20 -08:00
OGAWA Hirofumi 9e975dae29 fat: split include/msdos_fs.h
This splits __KERNEL__ stuff in include/msdos_fs.h into fs/fat/fat.h.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:20 -08:00
OGAWA Hirofumi 990e194e69 fat: move fs/vfat/* and fs/msdos/* to fs/fat
This just moves those files, but change link order from MSDOS, VFAT to
VFAT, MSDOS.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:20 -08:00
Arthur Jones c87591b719 ext3: wait on all pending commits in ext3_sync_fs
In ext3_sync_fs, we only wait for a commit to finish if we started it, but
there may be one already in progress which will not be synced.

In the case of a data=ordered umount with pending long symlinks which are
delayed due to a long list of other I/O on the backing block device, this
causes the buffer associated with the long symlinks to not be moved to the
inode dirty list in the second phase of fsync_super.  Then, before they
can be dirtied again, kjournald exits, seeing the UMOUNT flag and the
dirty pages are never written to the backing block device, causing long
symlink corruption and exposing new or previously freed block data to
userspace.

This can be reproduced with a script created
by Eric Sandeen <sandeen@redhat.com>:

	#!/bin/bash

	umount /mnt/test2
	mount /dev/sdb4 /mnt/test2
	rm -f /mnt/test2/*
	dd if=/dev/zero of=/mnt/test2/bigfile bs=1M count=512
	touch
	/mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename
	ln -s
	/mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename
	/mnt/test2/link
	umount /mnt/test2
	mount /dev/sdb4 /mnt/test2
	ls /mnt/test2/
	umount /mnt/test2

To ensure all commits are synced, we flush all journal commits now when
sync_fs'ing ext3.

Signed-off-by: Arthur Jones <ajones@riverbed.com>
Cc: Eric Sandeen <sandeen@redhat.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: <linux-ext4@vger.kernel.org>
Cc: <stable@kernel.org>		[2.6.everything]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:19 -08:00
Ian Kent 96b0317906 autofs4: collect version check return
The function check_dev_ioctl_version() returns an error code upon fail but
it isn't captured and returned in validate_dev_ioctl() as it should be.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:17 -08:00
Ian Kent bc9c406838 autofs4: correct offset mount expire check
When checking a directory tree in autofs_tree_busy() we can incorrectly
decide that the tree isn't busy.  This happens for the case of an active
offset mount as autofs4_follow_mount() follows past the active offset
mount, which has an open file handle used for expires, causing the file
handle not to count toward the busyness check.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-06 15:41:17 -08:00
Theodore Ts'o ac51d83705 ext4: calculate journal credits correctly
This fixes a 2.6.27 regression which was introduced in commit a02908f1.

We weren't passing the chunk parameter down to the two subections,
ext4_indirect_trans_blocks() and ext4_ext_index_trans_blocks(), with
the result that massively overestimate the amount of credits needed by
ext4_da_writepages, especially in the non-extents case.  This causes
failures especially on /boot partitions, which tend to be small and
non-extent using since GRUB doesn't handle extents.

This patch fixes the bug reported by Joseph Fannin at:
http://bugzilla.kernel.org/show_bug.cgi?id=11964

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-11-06 16:49:36 -05:00
Artem Bityutskiy e84461ad9c UBIFS: fix compilation warnings
We print 'ino_t' type using '%lu' printk() placeholder, but this
results in many warnings when compiling for Alpha platform. Fix
this by adding (unsingned long) casts.

Fixes these warnings:

fs/ubifs/journal.c:693: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/journal.c:1131: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/dir.c:163: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/tnc.c:2680: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/tnc.c:2700: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'ino_t'
fs/ubifs/replay.c:1066: warning: format '%lu' expects type 'long unsigned int', but argument 7 has type 'ino_t'
fs/ubifs/orphan.c:108: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:135: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:142: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:154: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:159: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:451: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:539: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:612: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:843: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:856: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/recovery.c:1438: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/recovery.c:1443: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/recovery.c:1475: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/recovery.c:1495: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:105: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t'
fs/ubifs/debug.c:105: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t'
fs/ubifs/debug.c:110: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t'
fs/ubifs/debug.c:110: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t'
fs/ubifs/debug.c:114: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t'
fs/ubifs/debug.c:114: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t'
fs/ubifs/debug.c:118: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t'
fs/ubifs/debug.c:118: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t'
fs/ubifs/debug.c:1591: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1671: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1674: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'ino_t'
fs/ubifs/debug.c:1680: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1699: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'ino_t'
fs/ubifs/debug.c:1788: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'ino_t'
fs/ubifs/debug.c:1821: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'ino_t'
fs/ubifs/debug.c:1833: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'ino_t'
fs/ubifs/debug.c:1924: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1932: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1938: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1945: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1953: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1960: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1967: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1973: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1988: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/debug.c:1991: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'ino_t'
fs/ubifs/debug.c:2009: warning: format '%lu' expects type 'long unsigned int', but argument 2 has type 'ino_t'

Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-11-06 11:06:31 +02:00
Harvey Harrison 0ecb9529a4 UBIFS: endian handling fixes and annotations
Noticed by sparse:
fs/ubifs/file.c:75:2: warning: restricted __le64 degrades to integer
fs/ubifs/file.c:629:4: warning: restricted __le64 degrades to integer
fs/ubifs/dir.c:431:3: warning: restricted __le64 degrades to integer

This should be checked to ensure the ubifs_assert is working as
intended, I've done the suggested annotation in this patch.

fs/ubifs/sb.c:298:6: warning: incorrect type in assignment (different base types)
fs/ubifs/sb.c:298:6:    expected int [signed] [assigned] tmp
fs/ubifs/sb.c:298:6:    got restricted __le64 [usertype] <noident>
fs/ubifs/sb.c:299:19: warning: incorrect type in assignment (different base types)
fs/ubifs/sb.c:299:19:    expected restricted __le64 [usertype] atime_sec
fs/ubifs/sb.c:299:19:    got int [signed] [assigned] tmp
fs/ubifs/sb.c:300:19: warning: incorrect type in assignment (different base types)
fs/ubifs/sb.c:300:19:    expected restricted __le64 [usertype] ctime_sec
fs/ubifs/sb.c:300:19:    got int [signed] [assigned] tmp
fs/ubifs/sb.c:301:19: warning: incorrect type in assignment (different base types)
fs/ubifs/sb.c:301:19:    expected restricted __le64 [usertype] mtime_sec
fs/ubifs/sb.c:301:19:    got int [signed] [assigned] tmp

This looks like a bugfix as your tmp was a u32 so there was truncation in
the atime, mtime, ctime value, probably not intentional, add a tmp_le64
and use it here.

fs/ubifs/key.h:348:9: warning: cast to restricted __le32
fs/ubifs/key.h:348:9: warning: cast to restricted __le32
fs/ubifs/key.h:419:9: warning: cast to restricted __le32

Read from the annotated union member instead.

fs/ubifs/recovery.c:175:13: warning: incorrect type in assignment (different base types)
fs/ubifs/recovery.c:175:13:    expected unsigned int [unsigned] [usertype] save_flags
fs/ubifs/recovery.c:175:13:    got restricted __le32 [usertype] flags
fs/ubifs/recovery.c:186:13: warning: incorrect type in assignment (different base types)
fs/ubifs/recovery.c:186:13:    expected restricted __le32 [usertype] flags
fs/ubifs/recovery.c:186:13:    got unsigned int [unsigned] [usertype] save_flags

Do byteshifting at compile time of the flag value.  Annotate the saved_flags
as le32.

fs/ubifs/debug.c:368:10: warning: cast to restricted __le32
fs/ubifs/debug.c:368:10: warning: cast from restricted __le64

Should be checked if the truncation was intentional, I've changed the
printk to print the full width.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-11-06 11:06:19 +02:00
Artem Bityutskiy 069782a1ee UBIFS: remove printk
Remove the "UBIFS background thread ubifs_bgd0_0 started" message.
We kill the background thread when we switch to R/O mode, and
start it again whan we switch to R/W mode. OLPC is doing this
many times during boot, and we see this message many times as
well, which is irritating. So just kill the message.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-11-06 11:06:14 +02:00
Tejun Heo 89f97496e8 block: fix __blkdev_get() for removable devices
Commit 0762b8bde9 moved disk_get_part()
in front of recursive get on the whole disk, which caused removable
devices to try disk_get_part() before rescanning after a new media is
inserted, which might fail legit open attempts or give the old
partition.

This patch fixes the problem by moving disk_get_part() after
__blkdev_get() on the whole disk.

This problem was spotted by Borislav Petkov.

Signed-off-by: Tejun Heo <tj@kernel.org>
Tested-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-11-06 08:41:56 +01:00
Geert Uytterhoeven dc8a0843a4 [JFFS2] fix race condition in jffs2_lzo_compress()
deflate_mutex protects the globals lzo_mem and lzo_compress_buf.  However,
jffs2_lzo_compress() unlocks deflate_mutex _before_ it has copied out the
compressed data from lzo_compress_buf.  Correct this by moving the mutex
unlock after the copy.

In addition, document what deflate_mutex actually protects.

Cc: stable@kernel.org
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Acked-by: Richard Purdie <rpurdie@openedhand.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2008-11-05 23:22:02 +01:00
Theodore Ts'o 14ce0cb411 ext4: wait on all pending commits in ext4_sync_fs()
In ext4_sync_fs, we only wait for a commit to finish if we started it,
but there may be one already in progress which will not be synced.

In the case of a data=ordered umount with pending long symlinks which
are delayed due to a long list of other I/O on the backing block
device, this causes the buffer associated with the long symlinks to
not be moved to the inode dirty list in the second phase of
fsync_super.  Then, before they can be dirtied again, kjournald exits,
seeing the UMOUNT flag and the dirty pages are never written to the
backing block device, causing long symlink corruption and exposing new
or previously freed block data to userspace.

To ensure all commits are synced, we flush all journal commits now
when sync_fs'ing ext4.

Signed-off-by: Arthur Jones <ajones@riverbed.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Eric Sandeen <sandeen@redhat.com>
Cc: <linux-ext4@vger.kernel.org>
2008-11-03 18:10:55 -05:00
Aneesh Kumar K.V d94e99a64c ext4: Convert to host order before using the values.
Use le16_to_cpu to read the s_reserved_gdt_blocks values
from super block.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-11-04 09:11:26 -05:00
Aneesh Kumar K.V ae2d9fb18e ext4: fix missing ext4_unlock_group in error path
If we try to free a block which is already freed, the code was
returning without first unlocking the group.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-11-04 09:10:50 -05:00
Steve French c527c8a7ff [CIFS] Can't rely on iov length and base when kernel_recvmsg returns error
When retrying kernel_recvmsg, reset iov_base and iov_len.

Note comment from Sridhar: "In the normal path, iov.iov_len is clearly set to 4. But i think you are
running into a case where kernel_recvmsg() is called via 'goto incomplete_rcv'
It happens if the previous call fails with EAGAIN.
If you want to call recvmsg() after EAGAIN failure, you need to reset iov."

Signed-off-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-11-03 20:46:21 +00:00
Linus Torvalds a75952b72a Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  cifs: fix renaming one hardlink on top of another
  [CIFS] fix error in smb_send2
  [CIFS] Reduce number of socket retries in large write path
2008-11-03 11:43:59 -08:00
Jeff Layton ae6884a9da cifs: fix renaming one hardlink on top of another
cifs: fix renaming one hardlink on top of another

POSIX says that renaming one hardlink on top of another to the same
inode is a no-op. We had the logic mostly right, but forgot to clear
the return code.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-11-03 18:31:05 +00:00
Linus Torvalds c8126cc602 Merge branch 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc
* 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc:
  proc: revert /proc/uptime to ->read_proc hook
2008-11-03 09:59:01 -08:00
Sami Liedes 2423840ded jbd2: deregister proc on failure in jbd2_journal_init_inode
jbd2_journal_init_inode() does not call jbd2_stats_proc_exit() on all
failure paths after calling jbd2_stats_proc_init(). This leaves
dangling references to the fs in proc.

This patch fixes a bug reported by Sami Leides at:
http://bugzilla.kernel.org/show_bug.cgi?id=11493

Signed-off-by: Sami Liedes <sliedes@cc.hut.fi>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-11-02 19:23:30 -05:00
Theodore Ts'o 8c3f25d895 jbd2: don't give up looking for space so easily in __jbd2_log_wait_for_space
Commit 23f8b79e introducd a regression because it assumed that if
there were no transactions ready to be checkpointed, that no progress
could be made on making space available in the journal, and so the
journal should be aborted.  This assumption is false; it could be the
case that simply calling jbd2_cleanup_journal_tail() will recover the
necessary space, or, for small journals, the currently committing
transaction could be responsible for chewing up the required space in
the log, so we need to wait for the currently committing transaction
to finish before trying to force a checkpoint operation.

This patch fixes a bug reported by Mihai Harpau at:
https://bugzilla.redhat.com/show_bug.cgi?id=469582

This patch fixes a bug reported by François Valenduc at:
http://bugzilla.kernel.org/show_bug.cgi?id=11840

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Duane Griffin <duaneg@dghda.com>
Cc: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
2008-11-06 22:38:07 -05:00
Theodore Ts'o e219cca082 jbd: don't give up looking for space so easily in __log_wait_for_space
Commit be07c4ed introducd a regression because it assumed that if
there were no transactions ready to be checkpointed, that no progress
could be made on making space available in the journal, and so the
journal should be aborted.  This assumption is false; it could be the
case that simply calling cleanup_journal_tail() will recover the
necessary space, or, for small journals, the currently committing
transaction could be responsible for chewing up the required space in
the log, so we need to wait for the currently committing transaction
to finish before trying to force a checkpoint operation.

This patch fixes the bug reported by Meelis Roos at:
http://bugzilla.kernel.org/show_bug.cgi?id=11937

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Duane Griffin <duaneg@dghda.com>
Cc: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
2008-11-06 22:37:59 -05:00
Al Viro 233e70f422 saner FASYNC handling on file close
As it is, all instances of ->release() for files that have ->fasync()
need to remember to evict file from fasync lists; forgetting that
creates a hole and we actually have a bunch that *does* forget.

So let's keep our lives simple - let __fput() check FASYNC in
file->f_flags and call ->fasync() there if it's been set.  And lose that
crap in ->release() instances - leaving it there is still valid, but we
don't have to bother anymore.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-01 09:49:46 -07:00
Linus Torvalds e06f42d6c1 Merge branch 'for-2.6.28' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.28' of git://linux-nfs.org/~bfields/linux:
  NLM: Set address family before calling nlm_host_rebooted()
  nfsd: fix failure to set eof in readdir in some situations
2008-10-31 15:44:08 -07:00
David Woodhouse b27cf88e95 [JFFS2] Fix lack of locking in thread_should_wake()
The thread_should_wake() function trawls through the list of 'very
dirty' eraseblocks, determining whether the background GC thread should
wake. Doing this without holding the appropriate locks is a bad idea.

OLPC Trac #8615

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@kernel.org
2008-10-31 14:52:24 +00:00
Linus Torvalds eff2502801 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  delay capable() check in ext4_has_free_blocks()
  merge ext4_claim_free_blocks & ext4_has_free_blocks
  jbd2: Call the commit callback before the transaction could get dropped
  ext4: fix a bug accessing freed memory in ext4_abort
  ext3: fix a bug accessing freed memory in ext3_abort
2008-10-31 07:52:12 -07:00
Harvey Harrison be85940548 fs: replace NIPQUAD()
Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u
can be replaced with %pI4

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-31 00:56:28 -07:00
David S. Miller a1744d3bee Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/wireless/p54/p54common.c
2008-10-31 00:17:34 -07:00
David Howells 91b7771251 CRED: Wrap task credential accesses in the XFS filesystem
Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
2008-10-31 15:50:04 +11:00
Chuck Lever d7dc61d0a7 NLM: Set address family before calling nlm_host_rebooted()
The nlm_host_rebooted() function uses nlm_cmp_addr() to find an
nsm_handle that matches the rebooted peer.  In order for this to work,
the passed-in address must have a proper address family.

This fixes a post-2.6.28 regression introduced by commit 781b61a6, which
added AF_INET6 support to nlm_cmp_addr().  Before that commit,
nlm_cmp_addr() didn't care about the address family; it compared only
the sin_addr.s_addr field for equality.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-30 17:19:30 -04:00
J. Bruce Fields 8d7c4203c6 nfsd: fix failure to set eof in readdir in some situations
Before 14f7dd6320 "[PATCH] Copy XFS
readdir hack into nfsd code", readdir_cd->err was reset to eof before
each call to vfs_readdir; afterwards, it is set only once.  Similarly,
c002a6c797 "[PATCH] Optimise NFS readdir
hack slightly", can cause us to exit without nfserr_eof set.  Fix this.

This ensures the "eof" bit is set when needed in readdir replies.  (The
particular case I saw was an nfsv4 readdir of an empty directory, which
returned with no entries (the protocol requires "." and ".." to be
filtered out), but with eof unset.)

Cc: David Woodhouse <David.Woodhouse@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-30 17:16:49 -04:00
Steve French 61de800d33 [CIFS] fix error in smb_send2
smb_send2 exit logic was strange, and with the previous change
could cause us to fail large
smb writes when all of the smb was not sent as one chunk.

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-10-30 20:15:22 +00:00
Linus Torvalds 1b2d3d94ec Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  SUNRPC: Fix potential race in put_rpccred()
  SUNRPC: Fix rpcauth_prune_expired
  NFS: Convert nfs_attr_generation_counter into an atomic_long
  SUNRPC: Respond promptly to server TCP resets
2008-10-30 12:51:42 -07:00
Randy Dunlap e74481e232 fs: remove excess kernel-doc
Delete excess kernel-doc notation in fs/ subdirectory:

Warning(linux-2.6.27-git10//fs/jbd/transaction.c:886): Excess function parameter or struct member 'credits' description in 'journal_get_undo_access'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-30 11:38:46 -07:00
Eric Sandeen 87b811c3f9 ecryptfs: fix memory corruption when storing crypto info in xattrs
When ecryptfs allocates space to write crypto headers into, before copying
it out to file headers or to xattrs, it looks at the value of
crypt_stat->num_header_bytes_at_front to determine how much space it
needs.  This is also used as the file offset to the actual encrypted data,
so for xattr-stored crypto info, the value was zero.

So, we kzalloc'd 0 bytes, and then ran off to write to that memory.
(Which returned as ZERO_SIZE_PTR, so we explode quickly).

The right answer is to always allocate a page to write into; the current
code won't ever write more than that (this is enforced by the
(PAGE_CACHE_SIZE - offset) length in the call to
ecryptfs_generate_key_packet_set).  To be explicit about this, we now send
in a "max" parameter, rather than magically using PAGE_CACHE_SIZE there.

Also, since the pointer we pass down the callchain eventually gets the
virt_to_page() treatment, we should be using a alloc_page variant, not
kzalloc (see also 7fcba05437)

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Acked-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-30 11:38:46 -07:00
Nick Piggin 4e02ed4b4a fs: remove prepare_write/commit_write
Nothing uses prepare_write or commit_write. Remove them from the tree
completely.

[akpm@linux-foundation.org: schedule simple_prepare_write() for unexporting]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-30 11:38:45 -07:00
David Chinner 6bfb3d065f [XFS] Fix race when looking up reclaimable inodes
If we get a race looking up a reclaimable inode, we can end up with the
winner proceeding to use the inode before it has been completely
re-initialised. This is a Bad Thing.

Fix the race by checking whether we are still initialising the inod eonce
we have a reference to it, and if so wait for the initialisation to
complete before continuing.

While there, fix a leaked reference count in the same code when
encountering an unlinked inode and we are not doing a lookup for a create
operation.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32429a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 18:32:43 +11:00
Tim Shimmin e0b8e8b65d [XFS] remove restricted chown parameter from xfs linux
On Linux all filesystems are supposed to be operating under Posix'
restricted chown. Restricted chown means it restricts chown to the owner
unless you have CAP_FOWNER.

NOTE: that 2 files outside of fs/xfs have been modified too for this
change.

Reviewed-by: Dave Chinner <david@fromorbit.com>

SGI-PV: 988919

SGI-Modid: xfs-linux-melb:xfs-kern:32413a

Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 18:30:48 +11:00
Christoph Hellwig ea5a3dc835 [XFS] kill sys_cred
capable_cred has been unused for a while so we can kill it and sys_cred.
That also means the cred argument to xfs_setattr and xfs_change_file_space
can be removed now.

SGI-PV: 988918

SGI-Modid: xfs-linux-melb:xfs-kern:32412a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 18:27:48 +11:00
David Chinner 7ee49acfe5 [XFS] correctly select first log item to push
Under heavy metadata load we are seeing log hangs. The AIL has items in it
ready to be pushed, and they are within the push target window. However,
we are not pushing them when the last pushed LSN is less than the LSN of
the first log item on the AIL. This is a regression introduced by the AIL
push cursor modifications.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32409a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2008-10-30 18:26:51 +11:00
Christoph Hellwig 9ed0451ee0 [XFS] free partially initialized inodes using destroy_inode
To make sure we free the security data inodes need to be freed using the
proper VFS helper (which we also need to export for this). We mark these
inodes bad so we can skip the flush path for them.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32398a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 18:26:04 +11:00
Christoph Hellwig 087e3b0460 Inode: export symbol destroy_inode
To make sure we free the security data inodes need to be freed using
the proper VFS helper (which we also need to export for this). We mark
these inodes bad so we can skip the flush path for them.

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 18:24:37 +11:00
Christoph Hellwig c679eef052 [XFS] stop using xfs_itobp in xfs_bulkstat
xfs_bulkstat only wants the dinode, offset and buffer from a given inode
number. Instead of using xfs_itobp on a fake inode which is complicated
and currently leads to leaks of the security data just use xfs_inotobp
which is designed to do exactly the kind of lookup xfs_bulkstat wants. The
only thing that's missing in xfs_inotobp is a flags paramter that let's us
pass down XFS_IMAP_BULKSTAT, but that can easily added.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32397a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 18:04:13 +11:00
David Chinner 455486b9cc [XFS] avoid all reclaimable inodes in xfs_sync_inodes_ag
If we are syncing data in xfs_sync_inodes_ag(), the VFS inode must still
be referencable as the dirty data state is carried on the VFS inode. hence
if we can't get a reference via igrab(), the inode must be in reclaim
which implies that it has no dirty data attached.

Leave such inodes to the reclaim code to flush the dirty inode state to
disk and so avoid attempting to access the VFS inode when it may not exist
in xfs_sync_inodes_ag().

Version 4:
o don't reference linux inode until after igrab() succeeds

Version 3:
o converted unlock/rele to an xfs_iput() call.

Version 2:
o change igrab logic to be more linear
o remove initial reclaimable inode check now that we are using
  igrab() failure to find reclaimable inodes
o assert that igrab failure occurs only on reclaimable inodes
o clean up inode locking - only grab the iolock if we are doing
  a SYNC_DELWRI call and we have a dirty inode.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32391a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Peter Leckie <pleckie@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 18:03:14 +11:00
David Chinner 56e73ec47d [XFS] Can't lock inodes in radix tree preload region
When we are inside a radix tree preload region, we cannot sleep. Recently
we moved the inode locking inside the preload region for the inode radix
tree. Fix that, and fix a missed unlock in another error path in the same
code at the same time.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32385a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:55:27 +11:00
Christoph Hellwig 2b7035fd74 [XFS] Trivial xfs_remove comment fixup
The dp to ip comment should be for the unconditional xfs_droplink call,
and the "." link obviously only exists for directories, so it should be in
the is_dir conditional.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32374a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:55:18 +11:00
Christoph Hellwig 1ec7944beb [XFS] fix biosize option
iosizelog shouldn't be the same as iosize but the logarithm of it. Then
again the current biosize option doesn't make much sense to me as it
doesn't set the preferred I/O size as mentioned in the comment next to it
but rather the allocation size and thus is identical to the allocsize
option (except for the missing logarithm). It's also not documented in
Documentation/filesystems/xfs.txt or the mount manpage.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32373a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:55:08 +11:00
Christoph Hellwig 469fc23d5d [XFS] fix the noquota mount option
Noquota should clear all mount options, and not just user and group quota.
Probably doesn't matter very much in real life.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32372a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:54:57 +11:00
Christoph Hellwig 9d565ffa33 [XFS] kill struct xfs_mount_args
No need to parse the mount option into a structure before applying it to
struct xfs_mount.

The content of xfs_start_flags gets merged into xfs_parseargs. Calls
inbetween don't care and can use mount members instead of the args struct.

This patch uncovered that the mount option for shared filesystems wasn't
ever exposed on Linux. The code to handle it is #if 0'ed in this patch
pending a decision on this feature. I'll send a writeup about it to the
list soon.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32371a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:53:24 +11:00
David Chinner 5a792c4579 [XFS] XFS: Check for valid transaction headers in recovery
When we are about to add a new item to a transaction in recovery, we need
to check that it is valid first. Currently we just assert that header
magic number matches, but in production systems that is not present and we
add a corrupted transaction to the list to be processed. This results in a
kernel oops later when processing the corrupted transaction.

Instead, if we detect a corrupted transaction, abort recovery and leave
the user to clean up the mess that has occurred.

SGI-PV: 988145

SGI-Modid: xfs-linux-melb:xfs-kern:32356a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:40:09 +11:00
David Chinner 783a2f656f [XFS] Finish removing the mount pointer from the AIL API
Change all the remaining AIL API functions that are passed struct
xfs_mount pointers to pass pointers directly to the struct xfs_ail being
used. With this conversion, all external access to the AIL is via the
struct xfs_ail. Hence the operation and referencing of the AIL is almost
entirely independent of the xfs_mount that is using it - it is now much
more tightly tied to the log and the items it is tracking in the log than
it is tied to the xfs_mount.

SGI-PV: 988143

SGI-Modid: xfs-linux-melb:xfs-kern:32353a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:39:58 +11:00
David Chinner fc1829f34d [XFS] Add ail pointer into log items
Add an xfs_ail pointer to log items so that the log items can reference
the AIL directly during callbacks without needed a struct xfs_mount.

SGI-PV: 988143

SGI-Modid: xfs-linux-melb:xfs-kern:32352a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:39:46 +11:00
David Chinner a9c21c1b9d [XFS] Given the log a pointer to the AIL
When we need to go from the log to the AIL, we have to go via the
xfs_mount. Add a xfs_ail pointer to the log so we can go directly to the
AIL associated with the log.

SGI-PV: 988143

SGI-Modid: xfs-linux-melb:xfs-kern:32351a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:39:35 +11:00
David Chinner c7e8f26827 [XFS] Move the AIL lock into the struct xfs_ail
Bring the ail lock inside the struct xfs_ail. This means the AIL can be
entirely manipulated via the struct xfs_ail rather than needing both the
struct xfs_mount and the struct xfs_ail.

SGI-PV: 988143

SGI-Modid: xfs-linux-melb:xfs-kern:32350a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:39:23 +11:00
David Chinner 7b2e2a31f5 [XFS] Allow 64 bit machines to avoid the AIL lock during flushes
When copying lsn's from the log item to the inode or dquot flush lsn, we
currently grab the AIL lock. We do this because the LSN is a 64 bit
quantity and it needs to be read atomically. The lock is used to guarantee
atomicity for 32 bit platforms.

Make the LSN copying a small function, and make the function used
conditional on BITS_PER_LONG so that 64 bit machines don't need to take
the AIL lock in these places.

SGI-PV: 988143

SGI-Modid: xfs-linux-melb:xfs-kern:32349a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:39:12 +11:00
David Chinner 5b00f14fbd [XFS] move the AIl traversal over to a consistent interface
With the new cursor interface, it makes sense to make all the traversing
code use the cursor interface and make the old one go away. This means
more of the AIL interfacing is done by passing struct xfs_ail pointers
around the place instead of struct xfs_mount pointers.

We can replace the use of xfs_trans_first_ail() in xfs_log_need_covered()
as it is only checking if the AIL is empty. We can do that with a call to
xfs_trans_ail_tail() instead, where a zero LSN returned indicates and
empty AIL...

SGI-PV: 988143

SGI-Modid: xfs-linux-melb:xfs-kern:32348a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:39:00 +11:00
David Chinner 27d8d5fe0e [XFS] Use a cursor for AIL traversal.
To replace the current generation number ensuring sanity of the AIL
traversal, replace it with an external cursor that is linked to the AIL.

Basically, we store the next item in the cursor whenever we want to drop
the AIL lock to do something to the current item. When we regain the lock.
the current item may already be free, so we can't reference it, but the
next item in the traversal is already held in the cursor.

When we move or delete an object, we search all the active cursors and if
there is an item match we clear the cursor(s) that point to the object.
This forces the traversal to restart transparently.

We don't invalidate the cursor on insert because the cursor still points
to a valid item. If the intem is inserted between the current item and the
cursor it does not matter; the traversal is considered to be past the
insertion point so it will be picked up in the next traversal.

Hence traversal restarts pretty much disappear altogether with this method
of traversal, which should substantially reduce the overhead of pushing on
a busy AIL.

Version 2 o add restart logic o comment cursor interface o minor cleanups

SGI-PV: 988143

SGI-Modid: xfs-linux-melb:xfs-kern:32347a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:38:39 +11:00
David Chinner 82fa901245 [XFS] Allocate the struct xfs_ail
Rather than embedding the struct xfs_ail in the struct xfs_mount, allocate
it during AIL initialisation. Add a back pointer to the struct xfs_ail so
that we can pass around the xfs_ail and still be able to access the
xfs_mount if need be. This is th first step involved in isolating the AIL
implementation from the surrounding filesystem code.

SGI-PV: 988143

SGI-Modid: xfs-linux-melb:xfs-kern:32346a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:38:26 +11:00
David Chinner a7444053fb [XFS] Account for allocated blocks when expanding directories
When we create a directory, we reserve a number of blocks for the maximum
possible expansion of of the directory due to various btree splits,
freespace allocation, etc. Unfortunately, each allocation is not reflected
in the total number of blocks still available to the transaction, so the
maximal reservation is used over and over again.

This leads to problems where an allocation group has only enough blocks
for *some* of the allocations required for the directory modification.
After the first N allocations, the remaining blocks in the allocation
group drops below the total reservation, and subsequent allocations fail
because the allocator will not allow the allocation to proceed if the AG
does not have the enough blocks available for the entire allocation total.

This results in an ENOSPC occurring after an allocation has already
occurred. This results in aborting the directory operation (leaving the
directory in an inconsistent state) and cancelling a dirty transaction,
which results in a filesystem shutdown.

Avoid the problem by reflecting the number of blocks allocated in any
directory expansion in the total number of blocks available to the
modification in progress. This prevents a directory modification from
being aborted part way through with an ENOSPC.

SGI-PV: 988144

SGI-Modid: xfs-linux-melb:xfs-kern:32340a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:38:12 +11:00
David Chinner 8c38ab0320 [XFS] Prevent looping in xfs_sync_inodes_ag
If the last block of the AG has inodes in it and the AG is an exactly
power-of-2 size then the last inode in the AG points to the last block in
the AG. If we try to find the next inode in the AG by adding one to the
inode number, we increment the inode number past the size of the AG. The
result is that the macro XFS_INO_TO_AGINO() will strip the AG portion of
the inode number and return an inode number of zero.

That is, instead of terminating the lookup loop because we hit the inode
number went outside the valid range for the AG, the search index returns
to zero and we start traversing the radix tree from the start again. This
results in an endless loop in xfs_sync_inodes_ag().

Fix it be detecting if the new search index decreases as a result of
incrementing the current inode number. That indicate an overflow and hence
that we have finished processing the AG so we can terminate the loop.

SGI-PV: 988142

SGI-Modid: xfs-linux-melb:xfs-kern:32335a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:38:00 +11:00
David Chinner 116545130c [XFS] kill deleted inodes list
Now that the deleted inodes list is unused, kill it. This also removes the
i_reclaim list head from the xfs_inode, shrinking it by two pointers.

SGI-PV: 988142

SGI-Modid: xfs-linux-melb:xfs-kern:32334a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:37:49 +11:00
David Chinner 7a3be02bae [XFS] use the inode radix tree for reclaiming inodes
Use the reclaim tag to walk the radix tree and find the inodes under
reclaim. This was the only user of the deleted inode list.

SGI-PV: 988142

SGI-Modid: xfs-linux-melb:xfs-kern:32333a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:37:37 +11:00
David Chinner 396beb8531 [XFS] mark inodes for reclaim via a tag in the inode radix tree
Prepare for removing the deleted inode list by marking inodes for reclaim
in the inode radix trees so that we can use the radix trees to find
reclaimable inodes.

SGI-PV: 988142

SGI-Modid: xfs-linux-melb:xfs-kern:32331a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:37:26 +11:00
David Chinner 1dc3318ae1 [XFS] rename inode reclaim functions
The function names xfs_finish_reclaim and xfs_finish_reclaim_all are not
very descriptive of what they are reclaiming. Rename to
xfs_reclaim_inode[s] to match the xfs_sync_inodes() function.

SGI-PV: 988142

SGI-Modid: xfs-linux-melb:xfs-kern:32330a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:37:15 +11:00
David Chinner fce08f2f3b [XFS] move inode reclaim functions to xfs_sync.c
Background inode reclaim is run by the xfssyncd. Move the reclaim worker
functions to be close to the sync code as the are very similar in
structure and are both run from the same background thread.

SGI-PV: 988142

SGI-Modid: xfs-linux-melb:xfs-kern:32329a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:37:03 +11:00
Lachlan McIlroy 493dca6178 [XFS] Fix build warning - xfs_fs_alloc_inode() needs a return statement
SGI-PV: 988141

SGI-Modid: xfs-linux-melb:xfs-kern:32325a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:36:52 +11:00
David Chinner 99fa8cb3c5 [XFS] Prevent use-after-free caused by synchronous inode reclaim
With the combined linux and XFS inode, we need to ensure that the combined
structure is not freed before the generic code is finished with the inode.
As it turns out, there is a case where the XFS inode is freed before the
linux inode - when xfs_reclaim() is called from ->clear_inode() on a clean
inode, the xfs inode is freed during that call. The generic code
references the inode after the ->clear_inode() call, so this is a use
after free situation.

Fix the problem by moving the xfs_reclaim() call to ->destroy_inode()
instead of in ->clear_inode(). This ensures the combined inode structure
is not freed until after the generic code has finished with it.

SGI-PV: 988141

SGI-Modid: xfs-linux-melb:xfs-kern:32324a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:36:40 +11:00
David Chinner bf904248a2 [XFS] Combine the XFS and Linux inodes
To avoid issues with different lifecycles of XFS and Linux inodes, embedd
the linux inode inside the XFS inode. This means that the linux inode has
the same lifecycle as the XFS inode, even when it has been released by the
OS. XFS inodes don't live much longer than this (a short stint in reclaim
at most), so there isn't significant memory usage penalties here.

Version 3 o kill xfs_icount()

Version 2 o remove unused commented out code from xfs_iget(). o kill
useless cast in VFS_I()

SGI-PV: 988141

SGI-Modid: xfs-linux-melb:xfs-kern:32323a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:36:14 +11:00
David Chinner 8290c35f87 Inode: Allow external list initialisation
To allow XFS to combine the XFS and linux inodes into a single
structure, we need to drive inode lookup from the XFS inode cache,
not the generic inode cache. This means that we need initialise a
struct inode from a context outside alloc_inode() as it is no longer
used by XFS.

After inode allocation and initialisation, we need to add the inode
to the superblock list, the in-use list, hash it and do some
accounting. This all needs to be done with the inode_lock held and
there are already several places in fs/inode.c that do this list
manipulation.  Factor out the common code, add a locking wrapper and
export the function so ti can be called from XFS.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:35:24 +11:00
David Chinner 2cb1599f9b Inode: Allow external initialisers
To allow XFS to combine the XFS and linux inodes into a single
structure, we need to drive inode lookup from the XFS inode cache,
not the generic inode cache. This means that we need initialise a
struct inode from a context outside alloc_inode() as it is no longer
used by XFS.

Factor and export the struct inode initialisation code from
alloc_inode() to inode_init_always() as a counterpart to
inode_init_once().  i.e. we have to call this init function for each
inode instantiation (always), as opposed inode_init_once() which is
only called on slab object instantiation (once).

Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:32:23 +11:00
David Chinner 94b97e39b0 [XFS] Never call mark_inode_dirty_sync() directly
Once the Linux inode and the XFS inode are combined, we cannot rely on
just check if the linux inode exists as a method of determining if it is
valid or not. Hence we should always call xfs_mark_inode_dirty_sync()
instead as it does the correct checks to determine if the liinux inode is
in a valid state or not.

SGI-PV: 988141

SGI-Modid: xfs-linux-melb:xfs-kern:32318a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:21:30 +11:00
David Chinner 6441e54915 [XFS] factor xfs_iget_core() into hit and miss cases
There are really two cases in xfs_iget_core(). The first is the cache hit
case, the second is the miss case. They share very little code, and hence
can easily be factored out into separate functions. This makes the code
much easier to understand and subsequently modify.

SGI-PV: 988141

SGI-Modid: xfs-linux-melb:xfs-kern:32317a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:21:19 +11:00
Christoph Hellwig 3471394ba5 [XFS] fix instant oops with tracing enabled
We can only read inode->i_count if the inode is actually there and not a
NULL pointer. This was introduced in one of the recent sync patches.

SGI-PV: 988255

SGI-Modid: xfs-linux-melb:xfs-kern:32315a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:21:10 +11:00
David Chinner 76bf105cb1 [XFS] Move remaining quiesce code.
With all the other filesystem sync code it in xfs_sync.c including the
data quiesce code, it makes sense to move the remaining quiesce code to
the same place.

SGI-PV: 988140

SGI-Modid: xfs-linux-melb:xfs-kern:32312a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:16:21 +11:00
David Chinner a4e4c4f4a8 [XFS] Kill xfs_sync()
There are no more callers to xfs_sync() now, so remove the function
altogther.

SGI-PV: 988140

SGI-Modid: xfs-linux-melb:xfs-kern:32311a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:16:11 +11:00
David Chinner cb56a4b995 [XFS] Kill SYNC_CLOSE
SYNC_CLOSE is only ever used and checked in conjunction with SYNC_WAIT,
and this only done in one spot. The only thing this does is make
XFS_bflush() calls to the data buftargs.

This will happen very shortly afterwards the xfs_sync() call anyway in the
unmount path via the xfs_close_devices(), so this code is redundant and
can be removed. That only user of SYNC_CLOSE is now gone, so kill the flag
completely.

SGI-PV: 988140

SGI-Modid: xfs-linux-melb:xfs-kern:32310a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:16:00 +11:00
David Chinner e9f1c6ee12 [XFS] make SYNC_DELWRI no longer use xfs_sync
Continue to de-multiplex xfs_sync be replacing all SYNC_DELWRI callers
with direct calls functions that do the work. Isolate the data quiesce
case to a function in xfs_sync.c. Isolate the FSDATA case with explicit
calls to xfs_sync_fsdata().

Version 2: o Push delwri related log forces into xfs_sync_inodes().

SGI-PV: 988140

SGI-Modid: xfs-linux-melb:xfs-kern:32309a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:15:50 +11:00
David Chinner be97d9d557 [XFS] make SYNC_ATTR no longer use xfs_sync
Continue to de-multiplex xfs_sync be replacing all SYNC_ATTR callers with
direct calls xfs_sync_inodes(). Add an assert into xfs_sync() to ensure we
caught all the SYNC_ATTR callers.

SGI-PV: 988140

SGI-Modid: xfs-linux-melb:xfs-kern:32308a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:15:38 +11:00
David Chinner aacaa880bf [XFS] xfssyncd: don't call xfs_sync
Start de-multiplexing xfs_sync() by making xfs_sync_worker() call the
specific sync functions it needs. This is only a small, unique subset of
the entire xfs_sync() code so is easier to follow.

SGI-PV: 988140

SGI-Modid: xfs-linux-melb:xfs-kern:32307a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:15:29 +11:00
David Chinner dfd837a9eb [XFS] kill xfs_syncsub
Now that the only caller is xfs_sync(), merge the two together as it makes
no sense to keep them separate.

SGI-PV: 988140

SGI-Modid: xfs-linux-melb:xfs-kern:32306a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:15:21 +11:00
David Chinner 2030b5aba8 [XFS] use xfs_sync_inodes rather than xfs_syncsub
Kill the unused arg in xfs_syncsub() and xfs_sync_inodes(). For callers of
xfs_syncsub() that only want to flush inodes, replace xfs_syncsub() with
direct calls to xfs_sync_inodes() as that is all that is being done with
the specific flags being passed in.

SGI-PV: 988140

SGI-Modid: xfs-linux-melb:xfs-kern:32305a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:15:12 +11:00
David Chinner bc60a99323 [XFS] Use struct inodes instead of vnodes to kill vn_grab
With the sync code relocated to the linux-2.6 directory we can use struct
inodes directly. If we do the same thing for the quota release code, we
can remove vn_grab altogether. While here, convert the VN_BAD() checks to
is_bad_inode() so we can remove vnodes entirely from this code.

SGI-PV: 988140

SGI-Modid: xfs-linux-melb:xfs-kern:32304a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:15:03 +11:00
Christoph Hellwig 2af75df7be [XFS] split out two helpers from xfs_syncsub
Split out two helpers from xfs_syncsub for the dummy log commit and the
superblock writeout.

SGI-PV: 988140

SGI-Modid: xfs-linux-melb:xfs-kern:32303a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:14:53 +11:00
Christoph Hellwig 4e8938feba [XFS] Move XFS_BMAP_SANITY_CHECK out of line.
Move the XFS_BMAP_SANITY_CHECK macro out of line and make it a properly
typed function. Also pass the xfs_buf for the btree block instead of just
the btree block header, as we will need some additional information for it
to implement CRC checking of btree blocks.

SGI-PV: 988146

SGI-Modid: xfs-linux-melb:xfs-kern:32301a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:14:43 +11:00
Christoph Hellwig 7cc95a821d [XFS] Always use struct xfs_btree_block instead of short / longform
structures.

Always use the generic xfs_btree_block type instead of the short / long
structures. Add XFS_BTREE_SBLOCK_LEN / XFS_BTREE_LBLOCK_LEN defines for
the length of a short / long form block. The rationale for this is that we
will grow more btree block header variants to support CRCs and other RAS
information, and always accessing them through the same datatype with
unions for the short / long form pointers makes implementing this much
easier.

SGI-PV: 988146

SGI-Modid: xfs-linux-melb:xfs-kern:32300a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:14:34 +11:00
Christoph Hellwig 136341b41a [XFS] cleanup btree record / key / ptr addressing macros.
Replace the generic record / key / ptr addressing macros that use cpp
token pasting with simpler macros that do the job for just one given btree
type. The new macros lose the cur argument and thus can be used outside
the core btree code, but also gain an xfs_mount * argument to allow for
checking the CRC flag in the near future. Note that many of these macros
aren't actually used in the kernel code, but only in userspace (mostly in
xfs_repair).

SGI-PV: 988146

SGI-Modid: xfs-linux-melb:xfs-kern:32295a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:11:40 +11:00
David Chinner 6c7699c047 [XFS] remove the mount inode list
Now we've removed all users of the mount inode list, we can kill it. This
reduces the size of the xfs_inode by 2 pointers.

SGI-PV: 988139

SGI-Modid: xfs-linux-melb:xfs-kern:32293a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:11:29 +11:00
Christoph Hellwig 60197e8df3 [XFS] Cleanup maxrecs calculation.
Clean up the way the maximum and minimum records for the btree blocks are
calculated. For the alloc and inobt btrees all the values are
pre-calculated in xfs_mount_common, and we switch the current loop around
the ugly generic macros that use cpp token pasting to generate type names
to two small helpers in normal C code. For the bmbt and bmdr trees these
helpers also exist, but can be called during runtime, too. Here we also
kill various macros dealing with them and inline the logic into the
get_minrecs / get_maxrecs / get_dmaxrecs methods in xfs_bmap_btree.c.

Note that all these new helpers take an xfs_mount * argument which will be
needed to determine the size of a btree block once we add support for
extended btree blocks with CRCs and other RAS information.

SGI-PV: 988146

SGI-Modid: xfs-linux-melb:xfs-kern:32292a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:11:19 +11:00
David Chinner 5b4d89ae0f [XFS] Traverse inode trees when releasing dquots
Make releasing all inode dquots traverse the per-ag inode radix trees
rather than the mount inode list. This removes another user of the mount
inode list.

Version 3 o fix comment relating to avoiding trying to release the

quota inodes and those in reclaim.

Version 2 o add comment explaining use of gang lookups for a single inode
o use IRELE, not VN_RELE o move check for ag initialisation to caller.

SGI-PV: 988139

SGI-Modid: xfs-linux-melb:xfs-kern:32291a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:08:03 +11:00
David Chinner 683a897080 [XFS] Use the inode tree for finding dirty inodes
Update xfs_sync_inodes to walk the inode radix tree cache to find dirty
inodes. This removes a huge bunch of nasty, messy code for traversing the
mount inode list safely and removes another user of the mount inode list.

Version 3 o rediff against new linux-2.6/xfs_sync.c code

Version 2 o add comment explaining use of gang lookups for a single inode
o use IRELE, not VN_RELE o move check for ag initialisation to caller.

SGI-PV: 988139

SGI-Modid: xfs-linux-melb:xfs-kern:32290a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:07:29 +11:00
David Chinner 2f8a3ce1c2 [XFS] don't block in xfs_qm_dqflush() during async writeback.
Normally dquots are written back via delayed write mechanisms. They are
flushed to their backing buffer by xfssyncd, which is then pushed out by
either AIL or xfsbufd flushing. The flush from the xfssyncd is supposed to
be non-blocking, but xfs_qm_dqflush() always waits for pinned duots, which
means that it will block for the length of time it takes to do a
synchronous log force. This causes unnecessary extra log I/O to be issued
whenever we try to flush a busy dquot.

Avoid the log forces and blocking xfssyncd by making xfs_qm_dqflush() pay
attention to what type of sync it is doing when it sees a pinned dquot and
not waiting when doing non-blocking flushes.

SGI-PV: 988147

SGI-Modid: xfs-linux-melb:xfs-kern:32287a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Peter Leckie <pleckie@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:07:20 +11:00
David Chinner 75c68f411b [XFS] Remove xfs_iflush_all and clean up xfs_finish_reclaim_all()
xfs_iflush_all() walks the m_inodes list to find inodes that need
reclaiming. We already have such a list - the m_del_inodes list. Replace
xfs_iflush_all() with a call to xfs_finish_reclaim_all() and clean up
xfs_finish_reclaim_all() to handle the different flush modes now needed.

Originally based on a patch from Christoph Hellwig.

Version 3 o rediff against new linux-2.6/xfs_sync.c code

Version 2 o revert xfs_syncsub() inode reclaim behaviour back to original

code o xfs_quiesce_fs() should use XFS_IFLUSH_DELWRI_ELSE_ASYNC, not

XFS_IFLUSH_ASYNC, to prevent change of behaviour.

SGI-PV: 988139

SGI-Modid: xfs-linux-melb:xfs-kern:32284a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:06:28 +11:00
David Chinner a167b17e89 [XFS] move xfssyncd code to xfs_sync.c
Move all the xfssyncd code to the new xfs_sync.c file. This places it
closer to the actual code that it interacts with, rather than just being
associated with high level VFS code.

SGI-PV: 988139

SGI-Modid: xfs-linux-melb:xfs-kern:32283a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:06:18 +11:00
David Chinner fe4fa4b8e4 [XFS] move sync code to its own file
The sync code in XFS is spread around several files. While it used to make
sense to have such a distribution, the code is about to be cleaned up and
so centralising it in one spot as the first step makes sense.

SGI-PV: 988139

SGI-Modid: xfs-linux-melb:xfs-kern:32282a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:06:08 +11:00
Barry Naujok 34519daae6 [XFS] Show buffer address with debug hexdump on corruption
SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32233a

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:05:58 +11:00
Barry Naujok 89b2839319 [XFS] Check agf_btreeblks is valid when reading in the AGF
SGI-PV: 987683

SGI-Modid: xfs-linux-melb:xfs-kern:32232a

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:05:49 +11:00
Barry Naujok 847fff5ca8 [XFS] Sync up kernel and user-space headers
SGI-PV: 986558

SGI-Modid: xfs-linux-melb:xfs-kern:32231a

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:05:38 +11:00
Lachlan McIlroy 24ee0e49c9 [XFS] Make xfs_btree_check_ptr() debug-only code.
SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32224a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 17:05:26 +11:00
Peter Leckie d1de802155 [XFS] Fix build brakage from patch "Clean up dquot pincount code"
This is a fix for patch " Clean up dquot pincount code" which introduced a
build breakage due to a missing & in xfs_qm_dquot_logitem_pin.

SGI-PV: 986789

SGI-Modid: xfs-linux-melb:xfs-kern:32221a

Signed-off-by: Peter Leckie <pleckie@sgi.com>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:05:18 +11:00
Peter Leckie bc3048e3cd [XFS] Clean up dquot pincount code.
This is a code cleanup and optimization that removes a per mount point
spinlock from the quota code and cleans up the code.

The patch changes the pincount from being an int protected by a spinlock
to an atomic_t allowing the pincount to be manipulated without holding the
spinlock.

This cleanup also protects against random wakup's of both the aild and
xfssyncd by reevaluating the pincount after been woken. Two latter patches
will address the Spurious wakeups.

SGI-PV: 986789

SGI-Modid: xfs-linux-melb:xfs-kern:32215a

Signed-off-by: Peter Leckie <pleckie@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 17:05:04 +11:00
Lachlan McIlroy d112f29845 [XFS] Wait for all I/O on truncate to zero file size
It's possible to have outstanding xfs_ioend_t's queued when the file size
is zero. This can happen in the direct I/O path when a direct I/O write
fails due to ENOSPC. In this case the xfs_ioend_t will still be queued (ie
xfs_end_io_direct() does not know that the I/O failed so can't force the
xfs_ioend_t to be flushed synchronously).

When we truncate a file on unlink we don't know to wait for these
xfs_ioend_ts and we can have a use-after-free situation if the inode is
reclaimed before the xfs_ioend_t is finally processed.

As was suggested by Dave Chinner lets wait for all I/Os to complete when
truncating the file size to zero.

SGI-PV: 981668

SGI-Modid: xfs-linux-melb:xfs-kern:32216a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 16:59:06 +11:00
Christoph Hellwig 7f7c39ccb6 [XFS] make btree tracing generic
Make the existing bmap btree tracing generic so that it applies to all
btree types.

Some fragments lifted from a patch by Dave Chinner.

This adds two files that were missed from the previous btree tracing
checkin.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32210a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:58:50 +11:00
Christoph Hellwig 3cc7524c84 [XFS] mark various functions in xfs_btree.c static
Lots of functionality in xfs_btree.c isn't needed by callers outside of
this file anymore, so mark these functions static.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32209a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:58:41 +11:00
Christoph Hellwig 4a26e66e77 [XFS] add keys_inorder and recs_inorder btree methods
Add methods to check whether two keys/records are in the righ order. This
replaces the xfs_btree_check_key and xfs_btree_check_rec methods. For the
callers from xfs_bmap.c just opencode the bmbt-specific asserts.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32208a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:58:32 +11:00
Christoph Hellwig fd6bcc5b63 [XFS] kill xfs_bmbt_log_block and xfs_bmbt_log_recs
These are equivalent to the xfs_btree_* versions, and the only remaining
caller can be switched to the generic one after they are exported. Also
remove some now dead infrastructure in xfs_bmap_btree.c.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32207a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:58:21 +11:00
Christoph Hellwig 8cc938fe42 [XFS] implement generic xfs_btree_get_rec
Not really much reason to make it generic given that it's so small, but
this is the last non-method in xfs_alloc_btree.c and xfs_ialloc_btree.c,
so it makes the whole btree implementation more structured.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32206a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:58:11 +11:00
Christoph Hellwig 91cca5df9b [XFS] implement generic xfs_btree_delete/delrec
Make the btree delete code generic. Based on a patch from David Chinner
with lots of changes to follow the original btree implementations more
closely. While this loses some of the generic helper routines for
inserting/moving/removing records it also solves some of the one off bugs
in the original code and makes it easier to verify.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32205a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:58:01 +11:00
Christoph Hellwig d4b3a4b7dd [XFS] move xfs_bmbt_killroot to common code
xfs_bmbt_killroot is a mostly generic implementation of moving from a real
block based root to an inode based root. So move it to xfs_btree.c where
it can use all the nice infrastructure there and make it pointer size
agnostic

The new name for it is xfs_btree_kill_iroot, following the old naming but
making it clear we're dealing with the root in inode case here, and to
avoid confusion with xfs_btree_new_root which is used for the not inode
rooted case. I've also added a comment describing what it does and why
it's named the way it is.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32203a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:57:51 +11:00
Christoph Hellwig 4b22a57188 [XFS] implement generic xfs_btree_insert/insrec
Make the btree insert code generic. Based on a patch from David Chinner
with lots of changes to follow the original btree implementations more
closely. While this loses some of the generic helper routines for
inserting/moving/removing records it also solves some of the one off bugs
in the original code and makes it easier to verify.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32202a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:57:40 +11:00
Christoph Hellwig ea77b0a66e [XFS] move xfs_bmbt_newroot to common code
xfs_bmbt_newroot is a mostly generic implementation of moving from an
inode root to a real block based root. So move it to xfs_btree.c where it
can use all the nice infrastructure there and make it pointer size
agnostic

The new name for it is xfs_btree_new_iroot, following the old naming but
making it clear we're dealing with the root in inode case here, and to
avoid confusion with xfs_btree_new_root which is used for the not inode
rooted case.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32201a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:57:28 +11:00
Christoph Hellwig 344207ce84 [XFS] implement semi-generic xfs_btree_new_root
From: Dave Chinner <dgc@sgi.com>

Add a xfs_btree_new_root helper for the alloc and ialloc btrees. The bmap
btree needs it's own version and is not converted.

[hch: split out from bigger patch and minor adaptions]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32200a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:57:16 +11:00
Christoph Hellwig f5eb8e7ca5 [XFS] implement generic xfs_btree_split
Make the btree split code generic. Based on a patch from David Chinner
with lots of changes to follow the original btree implementations more
closely. While this loses some of the generic helper routines for
inserting/moving/removing records it also solves some of the one off bugs
in the original code and makes it easier to verify.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32198a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:57:03 +11:00
Christoph Hellwig 687b890a18 [XFS] implement generic xfs_btree_lshift
Make the btree left shift code generic. Based on a patch from David
Chinner with lots of changes to follow the original btree implementations
more closely. While this loses some of the generic helper routines for
inserting/moving/removing records it also solves some of the one off bugs
in the original code and makes it easier to verify.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32197a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:56:53 +11:00
Christoph Hellwig 9eaead51be [XFS] implement generic xfs_btree_rshift
Make the btree right shift code generic. Based on a patch from David
Chinner with lots of changes to follow the original btree implementations
more closely. While this loses some of the generic helper routines for
inserting/moving/removing records it also solves some of the one off bugs
in the original code and makes it easier to verify.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32196a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:56:43 +11:00
Christoph Hellwig 278d0ca14e [XFS] implement generic xfs_btree_update
From: Dave Chinner <dgc@sgi.com>

The most complicated part here is the lastrec tracking for the alloc
btree. Most logic is in the update_lastrec method which has to do some
hopefully good enough dirty magic to maintain it.

[hch: split out from bigger patch and a rework of the lastrec

logic]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32194a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:56:32 +11:00
Christoph Hellwig 38bb74237d [XFS] implement generic xfs_btree_updkey
From: Dave Chinner <dgc@sgi.com>

Note that there are many > 80 char lines introduced due to the
xfs_btree_key casts. But the places where this happens is throw-away code
once the whole btree code gets merged into a common implementation.

The same is true for the temporary xfs_alloc_log_keys define to the new
name. All old users will be gone after a few patches.

[hch: split out from bigger patch and minor adaptions]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32193a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:56:22 +11:00
Christoph Hellwig fe033cc848 [XFS] implement generic xfs_btree_lookup
From: Dave Chinner <dgc@sgi.com>

[hch: split out from bigger patch and minor adaptions]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32192a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:56:09 +11:00
Christoph Hellwig 8df4da4a0a [XFS] implement generic xfs_btree_decrement
From: Dave Chinner <dgc@sgi.com>

[hch: split out from bigger patch and minor adaptions]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32191a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:55:58 +11:00
Christoph Hellwig 637aa50f46 [XFS] implement generic xfs_btree_increment
From: Dave Chinner <dgc@sgi.com>

Because this is the first major generic btree routine this patch includes
some infrastrucure, first a few routines to deal with a btree block that
can be either in short or long form, second xfs_btree_read_buf_block,
which is the new central routine to read a btree block given a cursor, and
third the new xfs_btree_ptr_addr routine to calculate the address for a
given btree pointer record.

[hch: split out from bigger patch and minor adaptions]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32190a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:55:45 +11:00
Christoph Hellwig 65f1eaeac0 [XFS] add helpers for addressing entities inside a btree block
Add new helpers in xfs_btree.c to find the record, key and block pointer
entries inside a btree block. To implement this genericly the
->get_maxrecs methods and two new xfs_btree_ops entries for the key and
record sizes are used. Also add a big comment describing how the
addressing inside a btree block works.

Note that these helpers are unused until users are introduced in the next
patches and this patch will thus cause some harmless compiler warnings.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32189a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:55:34 +11:00
Christoph Hellwig ce5e42db42 [XFS] add get_maxrecs btree operation
Factor xfs_btree_maxrecs into a per-btree operation.

The get_maxrecs method is based on a patch from Dave Chinner.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32188a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:55:23 +11:00
Christoph Hellwig 8c4ed633e6 [XFS] make btree tracing generic
Make the existing bmap btree tracing generic so that it applies to all
btree types.

Some fragments lifted from a patch by Dave Chinner.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32187a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:55:13 +11:00
David Chinner 854929f058 [XFS] add new btree statistics
From: Dave Chinner <dgc@sgi.com>

Introduce statistics coverage of all the btrees and cover all the btree
operations, not just some.

Invaluable for determining test code coverage of all the btree
operations....

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32184a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
2008-10-30 16:55:03 +11:00
Christoph Hellwig a23f6ef8ce [XFS] refactor btree validation helpers
Move the various btree validation helpers around in xfs_btree.c so that
they are close to each other and in common #ifdef DEBUG sections.

Also add a new xfs_btree_check_ptr helper to check a btree ptr that can be
either long or short form.

Split out from a bigger patch from Dave Chinner with various small changes
applied by me.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32183a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:54:53 +11:00
Christoph Hellwig b524bfeee2 [XFS] refactor xfs_btree_readahead
From: Dave Chinner <dgc@sgi.com>

Refactor xfs_btree_readahead to make it more readable:

(a) remove the inline xfs_btree_readahead wrapper and move all checks out

of line into the main routine.

(b) factor out helpers for short/long form btrees

(c) move check for root in inodes from the callers into
xfs_btree_readahead

[hch: split out from a big patch and minor cleanups]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32182a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:54:43 +11:00
Christoph Hellwig e99ab90d6a [XFS] add a long pointers flag to xfs_btree_cur
Add a flag to the xfs btree cursor when using long (64bit) block pointers
instead of checking btnum == XFS_BTNUM_BMAP.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32181a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:54:33 +11:00
Christoph Hellwig 8186e517fa [XFS] make btree root in inode support generic
The bmap btree is rooted in the inode and not in a disk block. Make the
support for this feature more generic by adding a btree flag to for this
feature instead of relying on the XFS_BTNUM_BMAP btnum check.

Also clean up xfs_btree_get_block where this new flag is used.

Based upon a patch from Dave Chinner.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32180a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:54:22 +11:00
Christoph Hellwig de227dd960 [XFS] add generic btree types
Add generic union types for btree pointers, keys and records. The generic
btree pointer contains either a 32 and 64bit big endian scalar for short
and long form btrees, and the key and record contain the relevant type for
each possible btree.

Split out from a bigger patch from Dave Chinner and simplified a little
further.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32178a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:54:12 +11:00
Christoph Hellwig 561f7d1739 [XFS] split up xfs_btree_init_cursor
xfs_btree_init_cursor contains close to little shared code for the
different btrees and will get even more non-common code in the future.
Split it up into one routine per btree type.

Because xfs_btree_dup_cursor needs to call the init routine for a generic
btree cursor add a new btree operation vector that contains a dup_cursor
method that initializes a new cursor based on an existing one.

The btree operations vector is based on an idea and code from Dave Chinner
and will grow more entries later during this series.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32176a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:53:59 +11:00
Christoph Hellwig f2277f06e6 [XFS] kill struct xfs_btree_hdr
This type is only embedded in struct xfs_btree_block and never used
directly. By moving the fields directly into struct xfs_btree_block a lot
of the macros for struct xfs_btree_sblock and struct xfs_btree_lblock can
be used for struct xfs_btree_block too now which helps greatly with some
of the migrations during implementing the generic btree code.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32174a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:53:47 +11:00
Lachlan McIlroy f338f90364 [XFS] Unlock inode before calling xfs_idestroy()
Lock debugging reported the ilock was being destroyed without being
unlocked. We don't need to lock the inode until we are going to insert it
into the radix tree.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32159a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 16:53:38 +11:00
Lachlan McIlroy a357a12156 [XFS] Fix use-after-free with log and quotas
Destroying the quota stuff on unmount can access the log - ie
XFS_QM_DONE() ends up in xfs_dqunlock() which calls
xfs_trans_unlocked_item() and then xfs_log_move_tail(). By this time the
log has already been destroyed. Just move the cleanup of the quota code
earlier in xfs_unmountfs() before the call to xfs_log_unmount(). Moving
XFS_QM_DONE() up near XFS_QM_DQPURGEALL() seems like a good spot.

SGI-PV: 987086

SGI-Modid: xfs-linux-melb:xfs-kern:32148a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Peter Leckie <pleckie@sgi.com>
2008-10-30 16:53:25 +11:00
Barry Naujok 46039928c9 [XFS] Remove final remnants of dirv1 macros and other stuff
SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:32002a

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 16:52:35 +11:00
Lachlan McIlroy d07c60e54f [XFS] Use xfs_idestroy() to cleanup an inode.
SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31927a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:50:35 +11:00
Lachlan McIlroy be8b78a626 [XFS] Remove kmem_zone_t argument from xfs_inode_init_once()
kmem cache constructor no longer takes a kmem_zone_t argument.

SGI-PV: 957103

SGI-Modid: xfs-linux-melb:xfs-kern:32254a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-10-30 16:42:34 +11:00
David Chinner 07c8f67587 [XFS] Make use of the init-once slab optimisation.
To avoid having to initialise some fields of the XFS inode on every
allocation, we can use the slab init-once feature to initialise them. All
we have to guarantee is that when we free the inode, all it's entries are
in the initial state. Add asserts where possible to ensure debug kernels
check this initial state before freeing and after allocation.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31925a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 16:11:59 +11:00
Harvey Harrison 5b095d9892 net: replace %p6 with %pI6
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-29 12:52:50 -07:00
Harvey Harrison 4b7a4274ca net: replace %#p6 format specifier with %pi6
gcc warns when using the # modifier with the %p format specifier,
so we can't use this to omit the colons when needed, introduces
%pi6 instead.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-29 12:50:24 -07:00
Steve French edf1ae4038 [CIFS] Reduce number of socket retries in large write path
CIFS in some heavy stress conditions cifs could get EAGAIN
repeatedly in smb_send2 which led to repeated retries and eventually
failure of large writes which could lead to data corruption.

There are three changes that were suggested by various network
developers:

1) convert cifs from non-blocking to blocking tcp sendmsg
(we left in the retry on failure)
2) change cifs to not set sendbuf and rcvbuf size for the socket
(let tcp autotune the buffer sizes since that works much better
in the TCP stack now)
3) if we have a partial frame sent in smb_send2, mark the tcp
session as invalid (close the socket and reconnect) so we do
not corrupt the remaining part of the SMB with the beginning
of the next SMB.

This does not appear to hurt performance measurably and has
been run in various scenarios, but it definately removes
a corruption that we were seeing in some high stress
test cases.

Acked-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-10-29 00:47:57 +00:00
Harvey Harrison 1afa67f5e7 misc: replace NIP6_FMT with %p6 format specifier
The iscsi_ibft.c changes are almost certainly a bugfix as the
pointer 'ip' is a u8 *, so they never print the last 8 bytes
of the IPv6 address, and the eight bytes they do print have
a zero byte with them in each 16-bit word.

Other than that, this should cause no difference in functionality.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-28 16:06:44 -07:00
Harvey Harrison b071195deb net: replace all current users of NIP6_SEQFMT with %#p6
The define in kernel.h can be done away with at a later time.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-28 16:05:40 -07:00
Trond Myklebust ae05f26940 NFS: Convert nfs_attr_generation_counter into an atomic_long
The most important property we need from nfs_attr_generation_counter is
monotonicity, which is not guaranteed by the current system of smp memory
barriers. We should convert it to an atomic_long_t, and drop the memory
barriers.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-28 15:21:40 -04:00
Eric Sandeen a996031c87 delay capable() check in ext4_has_free_blocks()
As reported by Eric Paris, the capable() check in ext4_has_free_blocks()
sometimes causes SELinux denials.

We can rearrange the logic so that we only try to use the root-reserved
blocks when necessary, and even then we can move the capable() test
to last, to avoid the check most of the time.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-28 00:08:17 -04:00
Eric Sandeen 8c3bf8a01c merge ext4_claim_free_blocks & ext4_has_free_blocks
Mingming pointed out that ext4_claim_free_blocks & ext4_has_free_blocks
are largely cut & pasted; they can be collapsed/merged as follows.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-28 00:08:12 -04:00
Theodore Ts'o 6c20ec8503 jbd2: Call the commit callback before the transaction could get dropped
The transaction can potentially get dropped if there are no buffers
that need to be written.  Make sure we call the commit callback before
potentially deciding to drop the transaction.  Also avoid
dereferencing the commit_transaction pointer in the marker for the
same reason.

This patch fixes the bug reported by Eric Paris at:
http://bugzilla.kernel.org/show_bug.cgi?id=11838

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Eric Sandeen <sandeen@redhat.com>
Tested-by: Eric Paris <eparis@redhat.com>
2008-10-28 21:08:20 -04:00
Hidehiro Kawai ef2cabf7c6 ext4: fix a bug accessing freed memory in ext4_abort
Vegard Nossum reported a bug which accesses freed memory (found via
kmemcheck).  When journal has been aborted, ext4_put_super() calls
ext4_abort() after freeing the journal_t object, and then ext4_abort()
accesses it.  This patch fix it.

Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-27 22:53:05 -04:00
Hidehiro Kawai 44d6f78756 ext3: fix a bug accessing freed memory in ext3_abort
Vegard Nossum reported a bug which accesses freed memory (found via
kmemcheck).  When journal has been aborted, ext3_put_super() calls
ext3_abort() after freeing the journal_t object, and then ext3_abort()
accesses it.  This patch fix it.

Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-27 22:51:46 -04:00
Alexey Dobriyan 6c87df37dc proc: revert /proc/uptime to ->read_proc hook
Turned out some VMware userspace does pread(2) on /proc/uptime, but
seqfiles currently don't allow pread() resulting in -ESPIPE.

Seqfiles in theory can do pread(), but this can be a long story,
so revert to ->read_proc until then.

http://bugzilla.kernel.org/show_bug.cgi?id=11856

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-27 22:56:56 +03:00
Alan Cox 526719ba51 Switch to a valid email address...
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-27 08:40:17 -07:00
Davide Libenzi 9ce209d64d epoll: avoid double-inserts in case of EFAULT
In commit f337b9c583 ("epoll: drop
unnecessary test") Thomas found that there is an unnecessary (always
true) test in ep_send_events().  The callback never inserts into
->rdllink while the send loop is performed, and also does the
~EP_PRIVATE_BITS test.  Given we're holding the mutex during this time,
the conditions tested inside the loop are always true.

HOWEVER.

The test "!ep_is_linked(&epi->rdllink)" wasn't there because we insert
into ->rdllink, but because the send-events loop might terminate before
the whole list is scanned (-EFAULT).

In such cases, when the loop terminates early, and when a (leftover)
file received an event while we're performing the lockless loop, we need
such test to avoid to double insert the epoll items.  The list_splice()
done a few steps below, will correctly re-insert the ones that were left
on "txlist".

This should fix the kenrel.org bugzilla entry 11831.

Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-26 12:09:49 -07:00
Arjan van de Ven 4d36a9e65d select: deal with math overflow from borderline valid userland data
Some userland apps seem to pass in a "0" for the seconds, and several
seconds worth of usecs to select().  The old kernels accepted this just
fine, so the new kernels must too.

However, due to the upscaling of the microseconds to nanoseconds we had
some cases where we got math overflow, and depending on the GCC version
(due to inlining decisions) that actually resulted in an -EINVAL return.

This patch fixes this by adding the excess microseconds to the seconds
field.

Also with thanks to Marcin Slusarz for spotting some implementation bugs
in the diagnostics patches.

Reported-by: Carlos R. Mafra <crmafra2@gmail.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-26 11:22:08 -07:00
Theodore Ts'o 3c37fc86d2 ext4: Fix duplicate entries returned from getdents() system call
Fix a regression caused by commit d0156417, "ext4: fix ext4_dx_readdir
hash collision handling", where deleting files in a large directory
(requiring more than one getdents system call), results in some
filenames being returned twice.  This was caused by a failure to
update info->curr_hash and info->curr_minor_hash, so that if the
directory had gotten modified since the last getdents() system call
(as would be the case if the user is running "rm -r" or "git clean"),
a directory entry would get returned twice to the userspace.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

This patch fixes the bug reported by Markus Trippelsdorf at:
http://bugzilla.kernel.org/show_bug.cgi?id=11844

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de>
2008-10-25 22:37:55 -04:00
Theodore Ts'o 8c9fa93d51 ext3: Fix duplicate entries returned from getdents() system call
Fix a regression caused by commit 6a897cf4, "ext3: fix ext3_dx_readdir
hash collision handling", where deleting files in a large directory
(requiring more than one getdents system call), results in some
filenames being returned twice.  This was caused by a failure to
update info->curr_hash and info->curr_minor_hash, so that if the
directory had gotten modified since the last getdents() system call
(as would be the case if the user is running "rm -r" or "git clean"),
a directory entry would get returned twice to the userspace.

This patch fixes the bug reported by Markus Trippelsdorf at:
http://bugzilla.kernel.org/show_bug.cgi?id=11844

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de>
2008-10-25 22:37:44 -04:00
Linus Torvalds 88ed86fee6 Merge branch 'proc' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc
* 'proc' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc: (35 commits)
  proc: remove fs/proc/proc_misc.c
  proc: move /proc/vmcore creation to fs/proc/vmcore.c
  proc: move pagecount stuff to fs/proc/page.c
  proc: move all /proc/kcore stuff to fs/proc/kcore.c
  proc: move /proc/schedstat boilerplate to kernel/sched_stats.h
  proc: move /proc/modules boilerplate to kernel/module.c
  proc: move /proc/diskstats boilerplate to block/genhd.c
  proc: move /proc/zoneinfo boilerplate to mm/vmstat.c
  proc: move /proc/vmstat boilerplate to mm/vmstat.c
  proc: move /proc/pagetypeinfo boilerplate to mm/vmstat.c
  proc: move /proc/buddyinfo boilerplate to mm/vmstat.c
  proc: move /proc/vmallocinfo to mm/vmalloc.c
  proc: move /proc/slabinfo boilerplate to mm/slub.c, mm/slab.c
  proc: move /proc/slab_allocators boilerplate to mm/slab.c
  proc: move /proc/interrupts boilerplate code to fs/proc/interrupts.c
  proc: move /proc/stat to fs/proc/stat.c
  proc: move rest of /proc/partitions code to block/genhd.c
  proc: move /proc/cpuinfo code to fs/proc/cpuinfo.c
  proc: move /proc/devices code to fs/proc/devices.c
  proc: move rest of /proc/locks to fs/locks.c
  ...
2008-10-23 12:04:37 -07:00
Christoph Hellwig 3856d30ded ext4: remove unused variable in ext4_get_parent
Signed-off-by: Christoph Hellwig <hch@lst.de>
[ All users removed in "switch all filesystems over to d_obtain_alias",
  aka commit 440037287c ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-23 12:03:23 -07:00
Linus Torvalds 12e1ec9ff3 ext3 quota support: fix compile failure
This one was due to a merge error: we added a use of nd.path in commit
2d7c820e56 ("ext3: add checks for errors
from jbd"), and concurrently we got rid of 'nd' and used a naked 'path'
in commit 8264613def ("[PATCH] switch
quota_on-related stuff to kern_path()").

That all merged cleanly, but it didn't actually _work_.  This should fix
it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-23 11:48:56 -07:00
Linus Torvalds 1f6d6e8ebe Merge branch 'v28-range-hrtimers-for-linus-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'v28-range-hrtimers-for-linus-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (37 commits)
  hrtimers: add missing docbook comments to struct hrtimer
  hrtimers: simplify hrtimer_peek_ahead_timers()
  hrtimers: fix docbook comments
  DECLARE_PER_CPU needs linux/percpu.h
  hrtimers: fix typo
  rangetimers: fix the bug reported by Ingo for real
  rangetimer: fix BUG_ON reported by Ingo
  rangetimer: fix x86 build failure for the !HRTIMERS case
  select: fix alpha OSF wrapper
  select: fix alpha OSF wrapper
  hrtimer: peek at the timer queue just before going idle
  hrtimer: make the futex() system call use the per process slack value
  hrtimer: make the nanosleep() syscall use the per process slack
  hrtimer: fix signed/unsigned bug in slack estimator
  hrtimer: show the timer ranges in /proc/timer_list
  hrtimer: incorporate feedback from Peter Zijlstra
  hrtimer: add a hrtimer_start_range() function
  hrtimer: another build fix
  hrtimer: fix build bug found by Ingo
  hrtimer: make select() and poll() use the hrtimer range feature
  ...
2008-10-23 10:53:02 -07:00
Linus Torvalds db563fc2e8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  cifs: handle the TCP_Server_Info->tsk field more carefully
  cifs: fix unlinking of rename target when server doesn't support open file renames
  [CIFS] improve setlease handling
  [CIFS] fix saving of resume key before CIFSFindNext
  cifs: make cifs_rename handle -EACCES errors
  [CIFS] fix build error
  [CIFS] undo changes in cifs_rename_pending_delete if it errors out
  cifs: track DeletePending flag in cifsInodeInfo
  cifs: don't use CREATE_DELETE_ON_CLOSE in cifs_rename_pending_delete
  [CIFS] eliminate usage of kthread_stop for cifsd
  [CIFS] Add nodfs mount option
2008-10-23 10:43:36 -07:00
Linus Torvalds 2248485640 Merge git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdev
* git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdev: (66 commits)
  [PATCH] kill the rest of struct file propagation in block ioctls
  [PATCH] get rid of struct file use in blkdev_ioctl() BLKBSZSET
  [PATCH] get rid of blkdev_locked_ioctl()
  [PATCH] get rid of blkdev_driver_ioctl()
  [PATCH] sanitize blkdev_get() and friends
  [PATCH] remember mode of reiserfs journal
  [PATCH] propagate mode through swsusp_close()
  [PATCH] propagate mode through open_bdev_excl/close_bdev_excl
  [PATCH] pass fmode_t to blkdev_put()
  [PATCH] kill the unused bsize on the send side of /dev/loop
  [PATCH] trim file propagation in block/compat_ioctl.c
  [PATCH] end of methods switch: remove the old ones
  [PATCH] switch sr
  [PATCH] switch sd
  [PATCH] switch ide-scsi
  [PATCH] switch tape_block
  [PATCH] switch dcssblk
  [PATCH] switch dasd
  [PATCH] switch mtd_blkdevs
  [PATCH] switch mmc
  ...
2008-10-23 10:23:07 -07:00
Linus Torvalds 5ed487bc2c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (46 commits)
  [PATCH] fs: add a sanity check in d_free
  [PATCH] i_version: remount support
  [patch] vfs: make security_inode_setattr() calling consistent
  [patch 1/3] FS_MBCACHE: don't needlessly make it built-in
  [PATCH] move executable checking into ->permission()
  [PATCH] fs/dcache.c: update comment of d_validate()
  [RFC PATCH] touch_mnt_namespace when the mount flags change
  [PATCH] reiserfs: add missing llseek method
  [PATCH] fix ->llseek for more directories
  [PATCH vfs-2.6 6/6] vfs: add LOOKUP_RENAME_TARGET intent
  [PATCH vfs-2.6 5/6] vfs: remove LOOKUP_PARENT from non LOOKUP_PARENT lookup
  [PATCH vfs-2.6 4/6] vfs: remove unnecessary fsnotify_d_instantiate()
  [PATCH vfs-2.6 3/6] vfs: add __d_instantiate() helper
  [PATCH vfs-2.6 2/6] vfs: add d_ancestor()
  [PATCH vfs-2.6 1/6] vfs: replace parent == dentry->d_parent by IS_ROOT()
  [PATCH] get rid of on-stack dentry in udf
  [PATCH 2/2] anondev: switch to IDA
  [PATCH 1/2] anondev: init IDR statically
  [JFFS2] Use d_splice_alias() not d_add() in jffs2_lookup()
  [PATCH] Optimise NFS readdir hack slightly.
  ...
2008-10-23 10:22:40 -07:00
Linus Torvalds b4d0b08a4c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  9p: fix sparse warnings
  9p: rdma: RDMA Transport Support for 9P
  9p: fix format warning
  9p: fix debug build error
2008-10-23 10:14:45 -07:00
Linus Torvalds 33217379be Merge branch 'for-2.6.28' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.28' of git://linux-nfs.org/~bfields/linux:
  nfsd: clean up expkey_parse error cases
  nfsd: Drop reference in expkey_parse error cases
  nfsd: Fix memory leak in nfsd_getxattr
  NFSD: Fix BUG during NFSD shutdown processing
2008-10-23 10:14:23 -07:00
Duane Griffin be07c4ed40 jbd: abort instead of waiting for nonexistent transactions
The __log_wait_for_space function sits in a loop checkpointing
transactions until there is sufficient space free in the journal.
However, if there are no transactions to be processed (e.g.  because the
free space calculation is wrong due to a corrupted filesystem) it will
never progress.

Check for space being required when no transactions are outstanding and
abort the journal instead of endlessly looping.

This patch fixes the bug reported by Sami Liedes at:
http://bugzilla.kernel.org/show_bug.cgi?id=10976

Signed-off-by: Duane Griffin <duaneg@dghda.com>
Tested-by: Sami Liedes <sliedes@cc.hut.fi>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-23 08:55:02 -07:00
Hidehiro Kawai 9f818b4ac0 jbd: test BH_Write_EIO to detect errors on metadata buffers
__try_to_free_cp_buf(), __process_buffer(), and __wait_cp_io() test
BH_Uptodate flag to detect write I/O errors on metadata buffers.  But by
commit 95450f5a7e "ext3: don't read inode
block if the buffer has a write error"(*), BH_Uptodate flag can be set to
inode buffers with BH_Write_EIO in order to avoid reading old inode data.
So now, we have to test BH_Write_EIO flag of checkpointing inode buffers
instead of BH_Uptodate.  This patch does it.

Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Acked-by: Jan Kara <jack@suse.cz>
Acked-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-23 08:55:02 -07:00
Hidehiro Kawai 2d7c820e56 ext3: add checks for errors from jbd
If the journal has aborted due to a checkpointing failure, we have to
keep the contents of the journal space.  Otherwise, the filesystem will
lose uncheckpointed metadata completely and become inconsistent.  To
avoid this, we need to keep needs_recovery flag if checkpoint has
failed.

With this patch, ext3_put_super() detects a checkpointing failure from
the return value of journal_destroy(), then it invokes ext3_abort() to
make the filesystem read only and keep needs_recovery flag.  Errors
from journal_flush() are also handled by this patch in some places.

Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Jan Kara <jack@ucw.cz>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-23 08:55:01 -07:00
Hidehiro Kawai 4afe978530 jbd: fix error handling for checkpoint io
When a checkpointing IO fails, current JBD code doesn't check the error
and continue journaling.  This means latest metadata can be lost from both
the journal and filesystem.

This patch leaves the failed metadata blocks in the journal space and
aborts journaling in the case of log_do_checkpoint().  To achieve this, we
need to do:

1. don't remove the failed buffer from the checkpoint list where in
   the case of __try_to_free_cp_buf() because it may be released or
   overwritten by a later transaction
2. log_do_checkpoint() is the last chance, remove the failed buffer
   from the checkpoint list and abort the journal
3. when checkpointing fails, don't update the journal super block to
   prevent the journaled contents from being cleaned.  For safety,
   don't update j_tail and j_tail_sequence either
4. when checkpointing fails, notify this error to the ext3 layer so
   that ext3 don't clear the needs_recovery flag, otherwise the
   journaled contents are ignored and cleaned in the recovery phase
5. if the recovery fails, keep the needs_recovery flag
6. prevent cleanup_journal_tail() from being called between
   __journal_drop_transaction() and journal_abort() (a race issue
   between journal_flush() and __log_wait_for_space()

Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Acked-by: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-23 08:55:01 -07:00
Alexey Dobriyan 59c7572e82 proc: remove fs/proc/proc_misc.c
Now that everything was moved to their more or less expected places,
apply rm(1).

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 18:54:05 +04:00
Alexey Dobriyan 5aa140c2de proc: move /proc/vmcore creation to fs/proc/vmcore.c
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 18:51:22 +04:00
Alexey Dobriyan 6d80e53f00 proc: move pagecount stuff to fs/proc/page.c
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 18:48:31 +04:00
Alexey Dobriyan 97ce5d6dcb proc: move all /proc/kcore stuff to fs/proc/kcore.c
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 18:32:38 +04:00
Alexey Dobriyan b5aadf7f14 proc: move /proc/schedstat boilerplate to kernel/sched_stats.h
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 18:06:12 +04:00
Alexey Dobriyan 3b5d5c6b0c proc: move /proc/modules boilerplate to kernel/module.c
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 18:03:13 +04:00
Alexey Dobriyan 31d85ab28e proc: move /proc/diskstats boilerplate to block/genhd.c
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-23 17:57:37 +04:00
Alexey Dobriyan 5c9fe6281b proc: move /proc/zoneinfo boilerplate to mm/vmstat.c
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
2008-10-23 17:35:04 +04:00
Alexey Dobriyan b6aa44ab69 proc: move /proc/vmstat boilerplate to mm/vmstat.c
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
2008-10-23 17:12:51 +04:00
Alexey Dobriyan 74e2e8e8ce proc: move /proc/pagetypeinfo boilerplate to mm/vmstat.c
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 16:33:29 +04:00
Alexey Dobriyan 8f32f7e5ac proc: move /proc/buddyinfo boilerplate to mm/vmstat.c
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 16:12:04 +04:00
Alexey Dobriyan 5f6a6a9c4e proc: move /proc/vmallocinfo to mm/vmalloc.c
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
2008-10-23 15:48:28 +04:00
Alexey Dobriyan 7b3c3a50a3 proc: move /proc/slabinfo boilerplate to mm/slub.c, mm/slab.c
Lose dummy ->write hook in case of SLUB, it's possible now.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
2008-10-23 15:20:06 +04:00
Alexey Dobriyan a0ec95a8e6 proc: move /proc/slab_allocators boilerplate to mm/slab.c
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
2008-10-23 15:17:27 +04:00
Alexey Dobriyan d6917e19f3 proc: move /proc/interrupts boilerplate code to fs/proc/interrupts.c
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 15:15:46 +04:00
Alexey Dobriyan df8106dbb5 proc: move /proc/stat to fs/proc/stat.c
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 15:14:05 +04:00
Alexey Dobriyan f500975a3f proc: move rest of /proc/partitions code to block/genhd.c
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
2008-10-23 15:07:31 +04:00
Alexey Dobriyan 8591cf4322 proc: move /proc/cpuinfo code to fs/proc/cpuinfo.c
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 15:05:11 +04:00
Alexey Dobriyan fe2510426a proc: move /proc/devices code to fs/proc/devices.c
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 15:02:18 +04:00
Alexey Dobriyan d8ba7a3633 proc: move rest of /proc/locks to fs/locks.c
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 14:37:00 +04:00
Alexey Dobriyan ae048112c0 proc: move /proc/kmsg creation to fs/proc/kmsg.c
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 14:35:08 +04:00
Alexey Dobriyan 659689280a proc: remove remnants of ->read_proc in proc_misc.c
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 14:32:59 +04:00
Alexey Dobriyan 6e62775ece proc: move /proc/execdomains to kernel/exec_domain.c
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 14:30:41 +04:00
Alexey Dobriyan cf9887f102 proc: switch /proc/cmdline to seq_file
and move it to fs/proc/cmdline.c while I'm at it.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 14:29:04 +04:00
Alexey Dobriyan 6827400713 proc: move /proc/filesystems to fs/filesystems.c
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 14:27:09 +04:00
Alexey Dobriyan 4c150f6c30 proc: move /proc/stram to m68k-specific code
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 14:25:35 +04:00
Alexey Dobriyan 813dcf7a6e proc: move /proc/hardware to m68k-specific code
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 14:24:03 +04:00
Alexey Dobriyan b457d15161 proc: switch /proc/version to seq_file
and move it to fs/proc/version.c while I'm at it.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 14:19:58 +04:00
Alexey Dobriyan e1759c215b proc: switch /proc/meminfo to seq_file
and move it to fs/proc/meminfo.c while I'm at it.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 13:52:40 +04:00
Alexey Dobriyan 9617760287 proc: switch /proc/uptime to seq_file
and move it to fs/proc/uptime.c while I'm at it.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 13:48:01 +04:00
Alexey Dobriyan 5b3acc8de8 proc: switch /proc/loadavg to seq_file
and move it to fs/proc/loadavg.c while I'm at it.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 13:45:28 +04:00
Arjan van de Ven 6c2f91e077 proc: use WARN() rather than printk+backtrace
Use WARN() rather than a printk() + backtrace();
this gives a more standard format message as well as complete
information (including line numbers etc) that will be collected
by kerneloops.org

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 13:34:38 +04:00
Alexey Dobriyan 1e0edd3f67 proc: spread __init
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 13:32:31 +04:00
Alexey Dobriyan 5bcd7ff9e1 proc: proc_init_inodecache() can't fail
kmem_cache creation code will panic, don't return anything.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 13:27:50 +04:00
Joe Korty 7c88db0cb5 proc: fix vma display mismatch between /proc/pid/{maps,smaps}
Commit 4752c36978 aka
"maps4: simplify interdependence of maps and smaps" broke /proc/pid/smaps,
causing it to display some vmas twice and other vmas not at all.  For example:

    grep .- /proc/1/smaps >/tmp/smaps; diff /proc/1/maps /tmp/smaps

    1  25d24
    2  < 7fd7e23aa000-7fd7e23ac000 rw-p 7fd7e23aa000 00:00 0
    3  28a28
    4  > ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0  [vsyscall]

The bug has something to do with setting m->version before all the
seq_printf's have been performed.  show_map was doing this correctly,
but show_smap was doing this in the middle of its seq_printf sequence.
This patch arranges things so that the setting of m->version in show_smap
is also done at the end of its seq_printf sequence.

Testing: in addition to the above grep test, for each process I summed
up the 'Rss' fields of /proc/pid/smaps and compared that to the 'VmRSS'
field of /proc/pid/status.  All matched except for Xorg (which has a
/dev/mem mapping which Rss accounts for but VmRSS does not).  This result
gives us some confidence that neither /proc/pid/maps nor /proc/pid/smaps
are any longer skipping or double-counting vmas.

Signed-off-by: Joe Korty <joe.korty@ccur.com>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 13:21:29 +04:00
Arjan van de Ven fd217f4d70 [PATCH] fs: add a sanity check in d_free
Hi Al,

remember that debug session we did at KS? You suggested this patch back
then....

From 7751eaf30474b8cbfaea64795805a17eab05ac53 Mon Sep 17 00:00:00 2001
From: Arjan van de Ven <arjan@linux.intel.com>
Date: Tue, 16 Sep 2008 16:51:17 -0700
Subject: [PATCH] fs: add a sanity check in d_free

we're seeing some corruption in the dentry->d_alias list that
appears like a free of an entry still on the list; this patch
adds a WARN_ON() to catch this scenario, as suggested by Al Viro

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2008-10-23 05:17:12 -04:00
Miklos Szeredi a77b72da24 [patch] vfs: make security_inode_setattr() calling consistent
Call security_inode_setattr() consistetly before inode_change_ok().
It doesn't make sense to try to "optimize" the i_op->setattr == NULL
case, as most filesystem do define their own setattr function.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2008-10-23 05:13:27 -04:00
Adrian Bunk 2c512397ca [patch 1/3] FS_MBCACHE: don't needlessly make it built-in
Assume you have:
- one or more of ext2/3/4 statically built into your kernel
- none of these with extended attributes enabled and
- want to add onother one of ext2/3/4 modular and with
  extended attributes enabled

then you currently have to reboot to use it since this results in
CONFIG_FS_MBCACHE=y.

That's not a common issue, but I just ran into it and since there's no
reason to get a built-in mbcache in this case this patch fixes it.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Andreas Gruenbacher <agruen@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2008-10-23 05:13:26 -04:00
Miklos Szeredi f696a3659f [PATCH] move executable checking into ->permission()
For execute permission on a regular files we need to check if file has
any execute bits at all, regardless of capabilites.

This check is normally performed by generic_permission() but was also
added to the case when the filesystem defines its own ->permission()
method.  In the latter case the filesystem should be responsible for
performing this check.

Move the check from inode_permission() inside filesystems which are
not calling generic_permission().

Create a helper function execute_ok() that returns true if the inode
is a directory or if any execute bits are present in i_mode.

Also fix up the following code:

 - coda control file is never executable
 - sysctl files are never executable
 - hfs_permission seems broken on MAY_EXEC, remove
 - hfsplus_permission is eqivalent to generic_permission(), remove

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2008-10-23 05:13:25 -04:00
Qinghuang Feng 5cec56deb6 [PATCH] fs/dcache.c: update comment of d_validate()
Parameters @hash and @len have been removed since 2.4.3,
now just to delete them.

Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com>
2008-10-23 05:13:24 -04:00
Dan Williams 0e55a7cca4 [RFC PATCH] touch_mnt_namespace when the mount flags change
Daemons that need to be launched while the rootfs is read-only can now
poll /proc/mounts to be notified when their O_RDWR requests may no
longer end in EROFS.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-10-23 05:13:23 -04:00
Christoph Hellwig 91efc167d0 [PATCH] reiserfs: add missing llseek method
Reiserfs currently doesn't set a llseek method for regular files, which
means it will fall back to default_llseek.  This means no one can seek
beyond 2 Gigabytes on reiserfs, and that there's not protection vs
the i_size updates from writers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2008-10-23 05:13:22 -04:00
Christoph Hellwig 3222a3e55f [PATCH] fix ->llseek for more directories
With this patch all directory fops instances that have a readdir
that doesn't take the BKL are switched to generic_file_llseek.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2008-10-23 05:13:21 -04:00
OGAWA Hirofumi 4e9ed2f85a [PATCH vfs-2.6 6/6] vfs: add LOOKUP_RENAME_TARGET intent
This adds LOOKUP_RENAME_TARGET intent for lookup of rename destination.

LOOKUP_RENAME_TARGET is going to be used like LOOKUP_CREATE. But since
the destination of rename() can be existing directory entry, so it has a
difference. Although that difference doesn't matter in my usage, this
tells it to user of this intent.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
2008-10-23 05:13:20 -04:00
OGAWA Hirofumi 0612d9fb27 [PATCH vfs-2.6 5/6] vfs: remove LOOKUP_PARENT from non LOOKUP_PARENT lookup
lookup_hash() with LOOKUP_PARENT is bogus. And this prepares to add
new intent on those path.

The user of LOOKUP_PARENT intent is nfs only, and it checks whether
nd->flags has LOOKUP_CREATE or LOOKUP_OPEN, so the result is same.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
2008-10-23 05:13:19 -04:00
OGAWA Hirofumi 8f3dfaa5ba [PATCH vfs-2.6 4/6] vfs: remove unnecessary fsnotify_d_instantiate()
This calls d_move(), so fsnotify_d_instantiate() is unnecessary like
rename path.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
2008-10-23 05:13:18 -04:00
OGAWA Hirofumi 360da90029 [PATCH vfs-2.6 3/6] vfs: add __d_instantiate() helper
This adds __d_instantiate() for users which is already taking
dcache_lock, and replace with it.

The part of d_add_ci() isn't equivalent. But it should be needed
fsnotify_d_instantiate() actually, because the path is to add the
inode to negative dentry.  fsnotify_d_instantiate() should be called
after change from negative to positive.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
2008-10-23 05:13:17 -04:00
OGAWA Hirofumi e2761a1167 [PATCH vfs-2.6 2/6] vfs: add d_ancestor()
This adds d_ancestor() instead of d_isparent(), then use it.

If new_dentry == old_dentry, is_subdir() returns 1, looks strange.
"new_dentry == old_dentry" is not subdir obviously. But I'm not
checking callers for now, so this keeps current behavior.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
2008-10-23 05:13:16 -04:00
OGAWA Hirofumi 871c0067d5 [PATCH vfs-2.6 1/6] vfs: replace parent == dentry->d_parent by IS_ROOT()
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
2008-10-23 05:13:16 -04:00
Al Viro 9fbb76ce0f [PATCH] get rid of on-stack dentry in udf
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:13:15 -04:00
Alexey Dobriyan ad76cbc63b [PATCH 2/2] anondev: switch to IDA
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 05:13:14 -04:00
Alexey Dobriyan 6de24f0ed0 [PATCH 1/2] anondev: init IDR statically
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-10-23 05:13:13 -04:00
David Woodhouse 8966c5e0fc [JFFS2] Use d_splice_alias() not d_add() in jffs2_lookup()
Now that JFFS2 can be exported by NFS, we need to get this right.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:13:12 -04:00
David Woodhouse c002a6c797 [PATCH] Optimise NFS readdir hack slightly.
Avoid calling the underlying ->readdir() again when we reached the end
already; keep going round the loop only if we stopped due to our own
buffer being full.

[AV: tidy the things up a bit, while we are there]

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:13:11 -04:00
Al Viro 53c9c5c0e3 [PATCH] prepare vfs_readdir() callers to returning filldir result
It's not the final state, but it allows moving ->readdir() instances
to passing filldir return value to caller of vfs_readdir().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:13:10 -04:00
Al Viro a9885444f7 [PATCH] get rid of on-stack dentry in ext2_get_parent()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:13:09 -04:00
Al Viro 734711abac [PATCH] get rid of on-stack fake dentry in ext3_get_parent()
Better pass parent and qstr to ext3_find_entry() explicitly than
use such kludges, especially since the stack footprint is nasty
enough and we have every chance to be deep in call chain.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:13:08 -04:00
David Woodhouse 5f556aab90 [JFFS2] Reinstate NFS exportability
Now that the readdir/lookup deadlock issues have been dealt with, we can
export JFFS2 file systems again.

(For now, you have to specify fsid manually; we should add a method to
the export_ops to handle that too.)

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:13:07 -04:00
David Woodhouse d88f1833fc [PATCH] Remove XFS buffered readdir hack
Now that we've moved the readdir hack to the nfsd code, we can
remove the local version from the XFS code.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:13:06 -04:00
David Woodhouse 14f7dd6320 [PATCH] Copy XFS readdir hack into nfsd code.
Some file systems with their own internal locking have problems with the
way that nfsd calls the ->lookup() method from within a filldir function
called from their ->readdir() method. The recursion back into the file
system code can cause deadlock.

XFS has a fairly hackish solution to this which involves doing the
readdir() into a locally-allocated buffer, then going back through it
calling the filldir function afterwards. It's not ideal, but it works.

It's particularly suboptimal because XFS does this for local file
systems too, where it's completely unnecessary.

Copy this hack into the NFS code where it can be used only for NFS
export. In response to feedback, use it unconditionally rather than only
for the affected file systems.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:13:05 -04:00
David Woodhouse 2628b76636 [PATCH] Factor out nfsd_do_readdir() into its own function
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:13:04 -04:00
Al Viro f3f8e17571 [PATCH] reduce the stack footprint of exportfs_decode_fh()
no need to have _two_ 256-byte arrays on stack...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:13:03 -04:00
Christoph Hellwig 9308a6128d [PATCH] kill d_alloc_anon
Remove d_alloc_anon now that no users are left.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:13:02 -04:00
Christoph Hellwig 440037287c [PATCH] switch all filesystems over to d_obtain_alias
Switch all users of d_alloc_anon to d_obtain_alias.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:13:01 -04:00
Christoph Hellwig 4ea3ada295 [PATCH] new helper: d_obtain_alias
The calling conventions of d_alloc_anon are rather unfortunate for all
users, and it's name is not very descriptive either.

Add d_obtain_alias as a new exported helper that drops the inode
reference in the failure case, too and allows to pass-through NULL
pointers and inodes to allow for tail-calls in the export operations.

Incidentally this helper already existed as a private function in
libfs.c as exportfs_d_alloc so kill that one and switch the callers
to d_obtain_alias.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:13:00 -04:00
Christoph Hellwig 3a8cff4f02 [PATCH] generic_file_llseek tidyups
Add kerneldoc for generic_file_llseek and generic_file_llseek_unlocked,
use sane variable names and unclutter the code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:12:59 -04:00
Christoph Hellwig a518ab9329 [PATCH] tidy up chrdev_open
Use a single goto label for chrdev_put + return error cases.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:12:59 -04:00
Christoph Hellwig ca30bc9952 [PATCH] hpfs: cleanup ->setattr
Reformat hpfs_notify_change to standard kernel style to make it readable
and rename it to hpfs_setattr as that's what the method is called.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:12:58 -04:00
Al Viro 3516586a42 [PATCH] make O_EXCL in nd->intent.flags visible in nd->flags
New flag: LOOKUP_EXCL.  Set before doing the final step of pathname
resolution on the paths that have LOOKUP_CREATE and O_EXCL.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:12:56 -04:00
Al Viro 8737f3a1b3 [PATCH] get rid of path_lookup_create()
... and don't pass bogus flags when we are just looking for parent.
Fold __path_lookup_intent_open() into path_lookup_open() while we
are at it; that's the only remaining caller.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:12:54 -04:00
Al Viro 421748ecde [PATCH] assorted path_lookup() -> kern_path() conversions
more nameidata eviction

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:12:52 -04:00
Al Viro a63bb99660 [PATCH] switch nfsd to kern_path()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:12:51 -04:00
Al Viro c1a2a4756d [PATCH] sanitize svc_export_parse()
clean up the exit paths, get rid of nameidata

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:12:50 -04:00
Al Viro 8264613def [PATCH] switch quota_on-related stuff to kern_path()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 05:12:44 -04:00
Al Viro 0a0d8a4675 [PATCH] no need for noinline stuff in fs/namespace.c anymore
Stack footprint from hell had been due to many struct nameidata in there.
No more.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 03:34:22 -04:00
Al Viro 2d92ab3c62 [PATCH] finally get rid of nameidata in namespace.c
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 03:34:20 -04:00
Al Viro d181146572 [PATCH] new helper - kern_path()
Analog of lookup_path(), takes struct path *.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23 03:34:19 -04:00
Jeff Layton b1c8d2b421 cifs: handle the TCP_Server_Info->tsk field more carefully
cifs: handle the TCP_Server_Info->tsk field more carefully

We currently handle the TCP_Server_Info->tsk field without any locking,
but with some half-measures to try and prevent races. These aren't
really sufficient though. When taking down cifsd, use xchg() to swap
the contents of the tsk field with NULL so we don't end up trying
to send it more than one signal. Also, don't allow cifsd to exit until
the signal is received if we expect one.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-10-23 05:06:20 +00:00
Jeff Layton 8d281efb67 cifs: fix unlinking of rename target when server doesn't support open file renames
cifs: fix unlinking of rename target when server doesn't support open file renames

The patch to make cifs_rename undoable broke renaming one file on top of
another when the server doesn't support busy file renames. Remove the
code that uses busy file renames to unlink the target file, and just
have it call cifs_unlink. If the rename of the source file fails, then
the unlink won't be undoable, but hopefully that's rare enough that it
won't be a problem.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-10-23 04:50:17 +00:00
Steve French 84210e9120 [CIFS] improve setlease handling
fcntl(F_SETLEASE) currently is not exported by cifs (nor by local file
systems) so cifs grants leases based on how other local processes have
opened the file not by whether the file is cacheable (oplocked).  This
adds the check to make sure that the file is cacheable on the client
before checking whether we can grant the lease locally
(generic_setlease).  It also adds a mount option for cifs (locallease)
if the user wants to override this and try to grant leases even
if the server did not grant oplock.

Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-10-23 04:42:37 +00:00
Eric Van Hensbergen ea2e7996fc 9p: fix format warning
This patch fixes a format warning which appears on 64-bit builds.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-10-22 18:48:45 -05:00
J. Bruce Fields 30bc4dfd3b nfsd: clean up expkey_parse error cases
We might as well do all of these at the end.  Fix up a couple minor
style nits while we're there.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-22 14:05:30 -04:00
Krishna Kumar 6dfcde98a2 nfsd: Drop reference in expkey_parse error cases
Drop reference to export key on error. Compile tested.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-22 14:04:34 -04:00