linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Trond Myklebust	2a3f5fd459	NFS: nfs_refresh_inode should clear cache_validity flags on success If the cached attributes match the ones supplied in the fattr, then assume we've revalidated the inode. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:20:50 -04:00
Trond Myklebust	40d2470409	NFS: Fix a connectathon regression in NFSv3 and NFSv4 We're failing basic test6 against Linux servers because they lack a correct change attribute. The fix is to assume that we always want to invalidate the readdir caches when we call update_changeattr and/or nfs_post_op_update_inode on a directory. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:20:47 -04:00
Trond Myklebust	9e08a3c5ae	NFS: Use nfs_refresh_inode() in ops that aren't expected to change the inode nfs_post_op_update_inode() is really only meant to be used if we expect the inode and its attributes to have changed in some way. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:20:45 -04:00
Trond Myklebust	c7c209730d	NFS: Get rid of some obsolete macros - NFS_READTIME, NFS_CHANGE_ATTR are completely unused. - Inline the few remaining uses of NFS_ATTRTIMEO, and remove. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:20:23 -04:00
Trond Myklebust	4f48af4584	NFS: Simplify filehandle revalidation Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:20:20 -04:00
Trond Myklebust	9697d2342e	NFS: Ensure that nfs_link() returns a hashed dentry Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:20:18 -04:00
Trond Myklebust	a12802cab8	NFS: Be strict about dentry revalidation when doing exclusive create Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:20:16 -04:00
Trond Myklebust	b050aa791f	NFS: Don't zap the readdir caches upon error If necessary, the caches will get zapped under normal revalidation. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:20:13 -04:00
Trond Myklebust	efbb06b7f9	NFS: Remove the redundant nfs_reval_fsid() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:20:11 -04:00
Trond Myklebust	81c768808c	NFSv3: Always use directory post-op attributes in nfs3_proc_lookup LOOKUP returns the directory post-op attributes whether or not the operation was successful. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:20:08 -04:00
Trond Myklebust	d75340cc4d	NFSv4: Fix nfs_atomic_open() to set the verifier on negative dentries too Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:20:06 -04:00
Trond Myklebust	216d5d0688	NFSv4: Use NFSv2/v3 rules for negative dentries in nfs_open_revalidate Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:20:03 -04:00
Trond Myklebust	0a5ebc1488	NFSv4: Don't revalidate the directory in nfs_atomic_lookup() Why bother, since the call to nfs4_atomic_open() will do it for us. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:20:01 -04:00
Trond Myklebust	f2c77f4e62	NFS: Optimise nfs_lookup_revalidate() We don't need to call nfs_revalidate_inode() on the directory if we already know that the verifiers don't match. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:19:58 -04:00
Trond Myklebust	6d2b296686	NFS: Reset nfsi->last_updated only if the attribute changed Otherwise set it to nfsi->read_cache_jiffies in order to prevent jiffy wraparound issues. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:19:55 -04:00
Trond Myklebust	60ccd4ec41	NFS: Remove nfs_begin_data_update/nfs_end_data_update The lower level routines in fs/nfs/proc.c, fs/nfs/nfs3proc.c and fs/nfs/nfs4proc.c should already be dealing with the revalidation issues. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:19:53 -04:00
Trond Myklebust	80eb209def	NFS: Remove NFS_I(inode)->data_updates We have no more users... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:19:50 -04:00
Trond Myklebust	a1643a92f6	NFS: NFS_CACHEINV() should not test for nfs_caches_unstable() The fact that we're in the process of modifying the inode does not mean that we should not invalidate the attribute and data caches. The defensive thing is to always invalidate when we're confronted with inode mtime/ctime or change_attribute updates that we do not immediately recognise. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:19:48 -04:00
Trond Myklebust	3258b4fa55	NFS: Remove bogus nfs_mark_for_revalidate() in nfs_lookup The parent of the newly materialised dentry has just been revalidated... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:19:45 -04:00
Trond Myklebust	cf8ba45e05	NFS: don't cache the verifer across ->lookup() calls If the ->lookup() call causes the directory verifier to change, then there is still no need to use the old verifier, since our dentry has been verified. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:19:42 -04:00
Trond Myklebust	7668fdbe9a	NFS: nfs_post_op_update_inode don't update cache_change_attribute If nfs_post_op_update_inode fails because the server didn't return any attributes, then we let the subsequent inode revalidation update cache_change_attribute. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:19:37 -04:00
Trond Myklebust	12b373ebf0	NFS: Don't revalidate dentries on directory size or ctime changes We only need to look at the mtime changes... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:19:35 -04:00
Trond Myklebust	2f78e4313a	NFS: Don't set cache_change_attribute in nfs_revalidate_mapping The attribute revalidation code will already have taken care of resetting nfsi->cache_change_attribute. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:19:32 -04:00
Trond Myklebust	446e534985	NFS: Fix a bug in nfs_open_revalidate() We want to set the verifier when the call to nfs4_open_revalidate() _succeeds_. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:19:30 -04:00
Trond Myklebust	d4d9cdcb47	NFS: Don't hash the negative dentry when optimising for an O_EXCL open We don't want to leave an unverified hashed negative dentry if the exclusive create fails to complete. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:19:27 -04:00
Trond Myklebust	5724ab3787	NFS: nfs_instantiate() should set the dentry verifier That will also allow us to remove the calls in mknod and mkdir. In addition it will ensure that symlinks set it correctly. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:19:25 -04:00
Trond Myklebust	fab728e156	NFS: Ensure nfs_instantiate() invalidates the parent dir on error Also ensure that it drops the dentry in this case. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:19:22 -04:00
Trond Myklebust	4b841736bc	NFS: Fix nfs_verify_change_attribute() We don't care about whether or not some other process on our client is changing the directory while we're in nfs_lookup_revalidate(), because the dcache will take care of ensuring local atomicity. We can therefore remove the test for nfs_caches_unstable(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:19:20 -04:00
Trond Myklebust	70ca88521f	NFS: Fake up 'wcc' attributes to prevent cache invalidation after write NFSv2 and v4 don't offer weak cache consistency attributes on WRITE calls. In NFSv3, returning wcc data is optional. In all cases, we want to prevent the client from invalidating our cached data whenever ->write_done() attempts to update the inode attributes. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:19:15 -04:00
Trond Myklebust	b64e8a5ef7	NFS: Remove bogus check of cache_change_attribute in nfs_update_inode Remove the bogus 'data_stable' check in nfs_update_inode. The cache_change_attribute tells you if the directory changed on the server, and should have nothing to do with the file length. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:19:11 -04:00
Trond Myklebust	7fdc49c4e4	NFS: Fix the ESTALE "revalidation" in _nfs_revalidate_inode() For one thing, the test NFS_ATTRTIMEO() == 0 makes no sense: we're testing whether or not the cache timeout length is zero, which is totally unrelated to the issue of whether or not we trust the file staleness. Secondly, we do not want to retry the GETATTR once a file has been declared stale by the server: we rather want to discard that inode as soon as possible, since there are broken servers still in use out there that reuse filehandles on new files. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:19:08 -04:00
Trond Myklebust	8850df999c	NFS: Fix atime revalidation in read() NFSv3 will correctly update atime on a read() call, so there is no need to set the NFS_INO_INVALID_ATIME flag unless the call to nfs_refresh_inode() fails. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:19:06 -04:00
Trond Myklebust	c481299839	NFS: Fix atime revalidation in readdir() NFSv3 will correctly update atime on a readdir call, so there is no need to set the NFS_INO_INVALID_ATIME flag unless the call to nfs_refresh_inode() fails. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:19:03 -04:00
Trond Myklebust	57fa76f2da	NFS: Don't use readdirplus data if the page cache is invalid Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:19:00 -04:00
Trond Myklebust	47aabaa7e4	NFSv4: Don't use ctime/mtime for determining when to invalidate the caches In NFSv4 we should only be looking at the change attribute. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:18:57 -04:00
Trond Myklebust	17cadc9537	NFS: Don't force a dcache revalidation if nfs_wcc_update_inode succeeds The reason is that if the weak cache consistency update was successful, then we know that our client must be the only one that changed the directory, and we've already updated the dcache to reflect the change. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:18:55 -04:00
Trond Myklebust	e323ea46d9	NFS: nfs_wcc_update_inode: directory caches are always invalidated We must ensure that the readdir data is always invalidated whether or not the weak cache consistency data update succeeds. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:18:51 -04:00
Trond Myklebust	6ecc5e8fca	NFS: Fix dcache revalidation bugs We don't need to force a dentry lookup just because we're making changes to the directory. Don't update nfsi->cache_change_attribute in nfs_end_data_update: that overrides the NFSv3/v4 weak consistency checking that tells us our update was the only one, and that tells us the dcache is still valid. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:18:49 -04:00
Trond Myklebust	7957c1418f	NFS: fix nfs_verify_change_attribute We always want to check that the verifier and directory cache_change_attribute match. This also allows us to remove the 'wraparound hack' for the cache_change_attribute. If we're only checking for equality, then we don't care about wraparound issues. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:18:46 -04:00
Trond Myklebust	68e8a70d3c	NFS: nfs_post_op_update_inode() should call nfs_refresh_inode() Ensure that we don't clobber the results from a more recent getattr call... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:18:43 -04:00
Trond Myklebust	f2115dc987	NFS: Fix over-conservative attribute invalidation in nfs_update_inode() We should always be declaring the attribute cache as valid after having updated it. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:18:40 -04:00
Trond Myklebust	76b32999df	NFSv4: Make NFSv4 ACCESS calls return attributes too... It doesn't really make sense to cache an access call without also revalidating the attributes. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:18:38 -04:00
Trond Myklebust	af22f94ae0	NFSv4: Simplify _nfs4_do_access() Currently, _nfs4_do_access() is just a copy of nfs_do_access() with added conversion of the open flags into an access mask. This patch merges the duplicate functionality. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:18:34 -04:00
Trond Myklebust	cd3758e37d	NFS: Replace file->private_data with calls to nfs_file_open_context() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:18:31 -04:00
Chuck Lever	8fb559f87f	NFS: Eliminate nfs_refresh_verifier() nfs_set_verifier() and nfs_refresh_verifier() do exactly the same thing, so replace one with the other. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:18:26 -04:00
Chuck Lever	77a55a1fe8	NFS: Eliminate nfs_renew_times() The nfs_renew_times() function plants the current time in jiffies in dentry->d_time. But a call to nfs_renew_times() is always followed by another call that overwrites dentry->d_time. Get rid of the nfs_renew_times() calls. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:18:24 -04:00
Chuck Lever	92f6c17825	NFS: Don't call nfs_renew_times() in nfs_dentry_iput() Negative dentries need to be reverified after an asynchronous unlink. Quoth Trond: "Unfortunately I don't think that we can avoid revalidating the resulting negative dentry since the UNLINK call is asynchronous, and so the new verifier on the directory will only be known a posteriori." Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:18:22 -04:00
Chuck Lever	bcf35617a7	NFS: Show "nointr" mount option The default "intr" setting is different for NFS and NFSv4. To avoid confusion on this issue, don't hide the "nointr" option in /proc/mounts. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:18:17 -04:00
Chuck Lever	6e88e0618c	NFS: Verify server address before invoking in-kernel mount client Re-order mount option sanity checking slightly to ensure we have a valid server address before trying to do the mountd RPC call. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:18:14 -04:00
$\"Talpey, Thomas\$ \"Talpey, Thomas\	113632d00a	SUNRPC: Add RDMA dependency to SUNRPC_XPRT_RDMA Add a dependency on RDMA before enabling SUNRPC_XPRT_RDMA Yes, "INFINIBAND" also turns on iWARP and other RDMA support. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:18:11 -04:00
$\"Talpey, Thomas\$ \"Talpey, Thomas\	2cf7ff7a37	NFS: support RDMA mounts Adds hooks to the string-based NFS mount to support an "rdma" protocol option. Signed-off-by: Tom Talpey <tmt@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:18:00 -04:00
$\"Talpey, Thomas\$ \"Talpey, Thomas\	c3a57ed747	RPCRDMA: Kconfig and header file with rpcrdma protocol definitions This file implements the configuration target, protocol template and constants for the rpcrdma transport framing, for use by the xprtrdma rpc transport implementation. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:57 -04:00
$\"Talpey, Thomas\$ \"Talpey, Thomas\	56928edd5a	NFS - print accurate transport protocol Use the per-transport strings to display the transport protocol accurately. Signed-off-by: Tom Talpey <tmt@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:55 -04:00
$\"Talpey, Thomas\$ \"Talpey, Thomas\	0896a725a1	NFS/SUNRPC: use transport protocol naming Instead of an { address family, raw IP protocol number }-tuple, use the newly-defined RPC identifier when creating clients in the upper layers. Signed-off-by: Tom Talpey <tmt@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:53 -04:00
$\"Talpey, Thomas\$ \"Talpey, Thomas\	4f22ccc346	SUNRPC: mark bulk read/write data in xdrbuf Adds a flag word to the xdrbuf struct which indicates any bulk disposition of the data. This enables RPC transport providers to marshal it efficiently/appropriately, and may enable other optimizations. Signed-off-by: Tom Talpey <tmt@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:34 -04:00
Trond Myklebust	20c71f5e0f	NFSv4: Fix a bug in nfs4_validate_mount_data() The previous patch introduced a bug when copying the server address. Also clarify a copy into the auth_flavours array: currently the two size calculations are equivalent, but we may decide to change the size of auth_flavors[] at some point. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:31 -04:00
$\"Talpey, Thomas\$ \"Talpey, Thomas\	91ea40b9c6	NFS: use in-kernel mount argument structure for nfsv4 mounts The user-visible nfs4_mount_data does not contain sufficient data to describe new mount options, and also is now a legacy structure. Replace it with the internal nfs_parsed_mount_data for nfsv4 in-kernel use. Signed-off-by: Tom Talpey <tmt@netapp.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:28 -04:00
$\"Talpey, Thomas\$ \"Talpey, Thomas\	2283f8d6ed	NFS: use in-kernel mount argument structure for nfsv[23] mounts The user-visible nfs_mount_data does not contain sufficient data to describe new mount options, and also is now a legacy structure. Replace it with the internal nfs_parsed_mount_data for nfsv[23] in-kernel use. Signed-off-by: Tom Talpey <tmt@netapp.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:26 -04:00
$\"Talpey, Thomas\$ \"Talpey, Thomas\	6b18eaa082	NFS: move nfs_parsed_mount_data structure definition In preparation for rearranging the nfs mount argument passing, make the nfs_parsed_mount_data struct visible across nfs kernel files. Signed-off-by: Tom Talpey <tmt@netapp.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:23 -04:00
Chuck Lever	817cb9d43d	NFSD: Convert printk's to dprintk's in NFSD's nfs4xdr Due to recent edict to remove or replace printk's that can flood the system log. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:14 -04:00
Chuck Lever	e159a08b6a	LOCKD: Convert printk's to dprintk's in lockd XDR routines Due to recent edict to remove or replace printk's that might flood the system log. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:12 -04:00
Chuck Lever	fe82a183ca	NFS: Convert printk's to dprintk's in fs/nfs/nfs?xdr.c Due to recent edict to replace or remove printk's that can be triggered en masse by remote misbehavior. Left a few that only occur just before a BUG. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:09 -04:00
Chuck Lever	0ac83779fa	NFS: Add new 'mountaddr=' mount option I got the 'mounthost=' option wrong - it shouldn't look for an address value, but rather a hostname value. However, the in-kernel mount client and NFS client cannot resolve a hostname by themselves; they rely on user-land to pass in the resolved address. Create a new mount option that does take an address so that the mount program's address can be passed in. The mount hostname is now ignored by the kernel. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:06 -04:00
James Lentini	aad7000735	[NFS] [PATCH] NFS: initialize default port in kernel mount client If no mount server port number is specified, the previous change to the kernel mount client inadvertently allows the NFS server's port number to be the used as the mount server's port number. If the user specifies an NFS server port (-o port=x), the mount will fail. The fix below sets the mount server's port to 0 if no mount server port is specified by the user. Signed-off-by: James Lentini <jlentini@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:04 -04:00
Chuck Lever	efd8340bb1	NFS: Kernel mount client should use async bind Simplify the in-kernel mount client by using autobind instead of an explicit call to rpc_getport_sync. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:01 -04:00
Jeff Layton	ddc01c0813	[NFS] [PATCH] NFS: show addr=ipaddr in /proc/mounts rather than A minor thing, but useful when working with a server with multiple addrs. This looks like it might also be necessary if Miklos' effort to eliminate /etc/mtab ever comes to fruition. When displaying mount options in /proc/mounts, the kernel prints "addr=hostname". This info is redundant since we already have the hostname displayed as part of the "device" section of the mount. This patch changes it to display the IP address to which the socket is connected. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:15:39 -04:00
Christoph Hellwig	f8cf3678f4	[NFS] [PATCH] nfs: tiny makefile cleanup no need to set up foo-objs these days. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:15:36 -04:00
Fabio Olive Leite	c7e1596111	Re: [NFS] [PATCH] Attribute timeout handling and wrapping u32 jiffies I would like to discuss the idea that the current checks for attribute timeout using time_after are inadequate for 32bit architectures, since time_after works correctly only when the two timestamps being compared are within 2^31 jiffies of each other. The signed overflow caused by comparing values more than 2^31 jiffies apart will flip the result, causing incorrect assumptions of validity. 2^31 jiffies is a fairly large period of time (~25 days) when compared to the lifetime of most kernel data structures, but for long lived NFS mounts that can sit idle for months (think that for some reason autofs cannot be used), it is easy to compare inode attribute timestamps with very disparate or even bogus values (as in when jiffies have wrapped many times, where the comparison doesn't even make sense). Currently the code tests for attribute timeout by simply adding the desired amount of jiffies to the stored timestamp and comparing that with the current timestamp of obtained attribute data with time_after. This is incorrect, as it returns true for the desired timeout period and another full 2^31 range of jiffies. In testing with artificial jumps (several small jumps, not one big crank) of the jiffies I was able to reproduce a problem found in a server with very long lived NFS mounts, where attributes would not be refreshed even after touching files and directories in the server: Initial uptime: 03:42:01 up 6 min, 0 users, load average: 0.01, 0.12, 0.07 NFS volume is mounted and time is advanced: 03:38:09 up 25 days, 2 min, 0 users, load average: 1.22, 1.05, 1.08 # ls -l /local/A/foo/bar /nfs/A/foo/bar -rw-r--r-- 1 root root 0 Dec 17 03:38 /local/A/foo/bar -rw-r--r-- 1 root root 0 Nov 22 00:36 /nfs/A/foo/bar # touch /local/A/foo/bar # ls -l /local/A/foo/bar /nfs/A/foo/bar -rw-r--r-- 1 root root 0 Dec 17 03:47 /local/A/foo/bar -rw-r--r-- 1 root root 0 Nov 22 00:36 /nfs/A/foo/bar We can see the local mtime is updated, but the NFS mount still shows the old value. The patch below makes it work: Initial setup... 07:11:02 up 25 days, 1 min, 0 users, load average: 0.15, 0.03, 0.04 # ls -l /local/A/foo/bar /nfs/A/foo/bar -rw-r--r-- 1 root root 0 Jan 11 07:11 /local/A/foo/bar -rw-r--r-- 1 root root 0 Jan 11 07:11 /nfs/A/foo/bar # touch /local/A/foo/bar # ls -l /local/A/foo/bar /nfs/A/foo/bar -rw-r--r-- 1 root root 0 Jan 11 07:14 /local/A/foo/bar -rw-r--r-- 1 root root 0 Jan 11 07:14 /nfs/A/foo/bar Signed-off-by: Fabio Olive Leite <fleite@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:15:33 -04:00
Peter Staubach	4e769b934e	64 bit ino support for NFS client Hi. Attached is a patch to modify the NFS client code to support 64 bit ino's, as appropriate for the system and the NFS protocol version. The code basically just expand the NFS interfaces for routines which handle ino's from using ino_t to u64 and then uses the fileid in the nfs_inode instead of i_ino in the inode. The code paths that were updated are in the getattr method and the readdir methods. This should be no real change on 64 bit platforms. Since the ino_t is an unsigned long, it would already be 64 bits wide. Thanx... ps Signed-off-by: Peter Staubach <staubach@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:15:29 -04:00
Trond Myklebust	7b159fc18d	NFS: Fall back to synchronous writes when a background write errors... This helps prevent huge queues of background writes from building up whenever the server runs out of disk or quota space, or if someone changes the file access modes behind our backs. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:15:23 -04:00
Trond Myklebust	34901f70d1	NFS: Writeback optimisation Schedule writes using WB_SYNC_NONE first, then come back for a second pass using WB_SYNC_ALL. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:15:21 -04:00
Trond Myklebust	ed90ef51a3	NFS: Clean up NFS writeback flush code The only user of nfs_sync_mapping_range() is nfs_getattr(), which uses it to flush out the entire inode without sending a commit. We therefore replace nfs_sync_mapping_range with a more appropriate helper. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:15:18 -04:00
Trond Myklebust	f758c88519	NFS: Clean up nfs_writepages() Just call write_cache_pages directly instead of hacking the writeback control structure in order to find out if we were called from writepages() or directly from the VM. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:15:13 -04:00
Trond Myklebust	9cccef9505	NFS: Clean up write code... The addition of nfs_page_mkwrite means that We should no longer need to create requests inside nfs_writepage() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:15:11 -04:00
Trond Myklebust	94387fb1aa	NFS: Add the helper nfs_vm_page_mkwrite This is needed in order to set up a proper nfs_page request for mmapped files. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:15:08 -04:00
Trond Myklebust	a6d8543042	NLM: Fix a memory leak in nlmsvc_testlock The recent fix for a circular lock dependency unfortunately introduced a potential memory leak in the event where the call to nlmsvc_lookup_host fails for some reason. Thanks to Roel Kluin for spotting this. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-09 12:38:26 -07:00
Yan Zheng	87e2831c3f	AIO: fix cleanup in io_submit_one(...) When IOCB_FLAG_RESFD flag is set and iocb->aio_resfd is incorrect, statement 'goto out_put_req' is executed. At label 'out_put_req', aio_put_req(..) is called, which requires 'req->ki_filp' set. Signed-off-by: Yan Zheng<yanzheng@21cn.com> Cc: Zach Brown <zach.brown@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-08 12:58:14 -07:00
David Woodhouse	8fb870df5a	[JFFS2] Trigger garbage collection when very_dirty_list size becomes excessive With huge amounts of free space, we weren't bothering to GC for while a while, and pathological numbers of obsolete nodes were accumulating, seriously affecting performance on NAND flash (OLPC trac #3978) Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2007-10-06 15:12:58 -04:00
Linus Torvalds	66b1f1a982	Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/cooloney/blackfin-2.6 * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/cooloney/blackfin-2.6: Blackfin arch: fix PORT_J BUG for BF537/6 EMAC driver reported by Kalle Pokki <kalle.pokki@iki.fi> Blackfin arch: gpio pinmux and resource allocation API required by BF537 on chip ethernet mac driver Blackfin arch: add some missing syscall binfmt_flat: checkpatch fixing minimum support for the blackfin relocations Binfmt_flat: Add minimum support for the Blackfin relocations	2007-10-03 15:34:07 -07:00
Sunil Mushran	bda0233b89	ocfs2: Unlock mutex in local alloc failure case The fs was not unlocking the local alloc inode mutex in the code path in which it failed to find a window of free bits in the global bitmap. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-10-03 11:14:45 -07:00
Paul Mackerras	70f227d884	Merge branch 'linux-2.6' into for-2.6.24	2007-10-03 15:33:17 +10:00
Linus Torvalds	7572395767	Fix possible splice() mmap_sem deadlock Nick Piggin points out that splice isn't being good about the mmap semaphore: while two readers can nest inside each others, it does leave a possible deadlock if a writer (ie a new mmap()) comes in during that nesting. Original "just move the locking" patch by Nick, replaced by one by me based on an optimistic pagefault_disable(). And then Jens tested and updated that patch. Reported-by: Nick Piggin <npiggin@suse.de> Tested-by: Jens Axboe <jens.axboe@oracle.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-01 13:17:28 -07:00
Tim Shimmin	564256c9e0	Revert "[XFS] Avoid replaying inode buffer initialisation log items if on-disk version is newer." This reverts commit `b394e43e99`. Lachlan McIlroy says: It tried to fix an issue where log replay is replaying an inode cluster initialisation transaction that should not be replayed because the inode cluster on disk is more up to date. Since we don't log file sizes (we rely on inode flushing to get them to disk) then we can't just replay all the transations in the log and expect the inode to be completely restored. We lose file size updates. Unfortunately this fix is causing more (serious) problems than it is fixing. SGI-PV: 969656 SGI-Modid: xfs-linux-melb:xfs-kern:29804a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2007-10-01 07:59:03 -07:00
Trond Myklebust	54af3bb543	NFS: Fix an Oops in encode_lookup() It doesn't look as if the NFS file name limit is being initialised correctly in the struct nfs_server. Make sure that we limit whatever is being set in nfs_probe_fsinfo() and nfs_init_server(). Also ensure that readdirplus and nfs4_path_walk respect our file name limits. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-28 15:36:42 -07:00
Trond Myklebust	255129d1e9	NLM: Fix a circular lock dependency in lockd The problem is that the garbage collector for the 'host' structures nlm_gc_hosts(), holds nlm_host_mutex while calling down to nlmsvc_mark_resources, which, eventually takes the file->f_mutex. We cannot therefore call nlmsvc_lookup_host() from within nlmsvc_create_block, since the caller will already hold file->f_mutex, so the attempt to grab nlm_host_mutex may deadlock. Fix the problem by calling nlmsvc_lookup_host() outside the file->f_mutex. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-26 09:22:04 -07:00
Linus Torvalds	e4b42be77e	Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: [PATCH] WE : Add missing auth compat-ioctl [PATCH] softmac: Fix inability to associate with WEP networks	2007-09-26 08:55:54 -07:00
Jeff Garzik	2aee619865	Merge branch 'fixes-jgarzik' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into upstream-fixes	2007-09-25 15:47:12 -04:00
Evgeniy Dushistov	f9b7cba1b8	ufs: fix sun state Different types of ufs hold state in different places, to hide complexity of this, there is ufs_get_fs_state, it returns state according to "UFS_SB(sb)->s_flags", but during mount ufs_get_fs_state is called, before setting s_flags, this cause message for ufs types like sun ufs: "fs need fsck", and remount in readonly state. Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-25 08:51:04 -07:00
Andy Lowe	59d8235be2	[JFFS2] Fix unpoint length Fix a couple of instances in JFFS2 where the unpoint() routine is being called with the wrong length in cases where the point() routine truncated a request. Signed-off-by: Andy Lowe <alowe@mvista.com> Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2007-09-23 18:41:17 +01:00
Linus Torvalds	6110e02b97	Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6 * 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6: [XFS] fix valid but harmless sparse warning [XFS] fix filestreams on 32-bit boxes	2007-09-22 12:56:13 -07:00
Andrew Morton	576bb9ced2	binfmt_flat: checkpatch fixing minimum support for the blackfin relocations Cc: Bernd Schmidt <bernd.schmidt@analog.com> Cc: David McCullough <davidm@snapgear.com> Cc: Greg Ungerer <gerg@snapgear.com> Cc: Miles Bader <miles.bader@necel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Bryan Wu <bryan.wu@analog.com>	2007-10-03 23:43:57 +08:00
Bernd Schmidt	f9720205d1	Binfmt_flat: Add minimum support for the Blackfin relocations Add minimum support for the Blackfin relocations, since we don't have enough space in each reloc. The idea is to store a value with one relocation so that subsequent ones can access it. Actually, this patch is required for Blackfin. Currently if BINFMT_FLAT is enabled, git-tree kernel will fail to compile. Signed-off-by: Bernd Schmidt <bernd.schmidt@analog.com> Signed-off-by: Bryan Wu <bryan.wu@analog.com> Cc: David McCullough <davidm@snapgear.com> Cc: Greg Ungerer <gerg@snapgear.com> Cc: Miles Bader <miles.bader@necel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2007-10-03 23:41:43 +08:00
Linus Torvalds	73e83dc300	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: ocfs2: Pack vote message and response structures ocfs2: Don't double set write parameters ocfs2: Fix pos/len passed to ocfs2_write_cluster ocfs2: Allow smaller allocations during large writes	2007-09-21 09:52:20 -07:00
Jean Tourrilhes	d59952d532	[PATCH] WE : Add missing auth compat-ioctl Johannes just found that we are missing a compat-ioctl declaration. The fix is trivial. As previous patches for compat-ioctl, this should also go to stable. More info : http://marc.info/?l=linux-wireless&m=119029667902588&w=2 Signed-off-by: Jean Tourrilhes <jt@hpl.hp.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-09-21 11:26:33 -04:00
Sunil Mushran	813d974c53	ocfs2: Pack vote message and response structures The ocfs2_vote_msg and ocfs2_response_msg structs needed to be packed to ensure similar sizeofs in 32-bit and 64-bit arches. Without this, we had inadvertantly broken 32/64 bit cross mounts. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-09-20 15:06:10 -07:00
Mark Fasheh	5c26a7b70f	ocfs2: Don't double set write parameters The target page offsets were being incorrectly set a second time in ocfs2_prepare_page_for_write(), which was causing problems on a 16k page size kernel. Additionally, ocfs2_write_failure() was incorrectly using those parameters instead of the parameters for the individual page being cleaned up. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-09-20 15:06:10 -07:00
Mark Fasheh	db56246c69	ocfs2: Fix pos/len passed to ocfs2_write_cluster This was broken for file systems whose cluster size is greater than page size. Pos needs to be incremented as we loop through the descriptors, and len needs to be capped to the size of a single cluster. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-09-20 15:06:09 -07:00
Mark Fasheh	415cb80037	ocfs2: Allow smaller allocations during large writes The ocfs2 write code loops through a page much like the block code, except that ocfs2 allocation units can be any size, including larger than page size. Typically it's equal to or larger than page size - most kernels run 4k pages, the minimum ocfs2 allocation (cluster) size. Some changes introduced during 2.6.23 changed the way writes to pages are handled, and inadvertantly broke support for > 4k page size. Instead of just writing one cluster at a time, we now handle the whole page in one pass. This means that multiple (small) seperate allocations might happen in the same pass. The allocation code howver typically optimizes by getting the maximum which was reserved. This triggered a BUG_ON in the extend code where it'd ask for a single bit (for one part of a > 4k page) and get back more than it asked for. Fix this by providing a variant of the high level allocation function which allows the caller to specify a maximum. The traditional function remains and just calls the new one with a maximum determined from the initial reservation. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-09-20 15:06:09 -07:00
Davide Libenzi	b8fceee17a	signalfd simplification This simplifies signalfd code, by avoiding it to remain attached to the sighand during its lifetime. In this way, the signalfd remain attached to the sighand only during poll(2) (and select and epoll) and read(2). This also allows to remove all the custom "tsk == current" checks in kernel/signal.c, since dequeue_signal() will only be called by "current". I think this is also what Ben was suggesting time ago. The external effect of this, is that a thread can extract only its own private signals and the group ones. I think this is an acceptable behaviour, in that those are the signals the thread would be able to fetch w/out signalfd. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-20 13:19:59 -07:00
Christoph Hellwig	1bc5858d0d	[XFS] fix valid but harmless sparse warning The new xlog_recover_do_reg_buffer checks call be16_to_cpu on di_gen which is a 32bit value so sparse rightly complains. Fortunately the warning is harmless because we don't care for the value, but only whether it's non-NULL. Due to that fact we can simply kill the endian swaps on this and the previous di_mode check entirely. SGI-PV: 969656 SGI-Modid: xfs-linux-melb:xfs-kern:29709a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2007-09-20 19:40:40 +10:00
Eric Sandeen	bcc7b445ef	[XFS] fix filestreams on 32-bit boxes xfs_filestream_mount() sets up an mru cache with: err = xfs_mru_cache_create(&mp->m_filestream, lifetime, grp_count, (xfs_mru_cache_free_func_t)xfs_fstrm_free_func); but that cast is causing problems... typedef void (xfs_mru_cache_free_func_t)(unsigned long, void); but: void xfs_fstrm_free_func( xfs_ino_t ino, fstrm_item_t item) so on a 32-bit box, it's casting (32, 32) args into (64, 32) and I assume it's getting garbage for item, which subsequently causes an explosion. With this change the filestreams xfsqa tests don't oops on my 32-bit box. SGI-PV: 967795 SGI-Modid: xfs-linux-melb:xfs-kern:29510a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2007-09-20 19:40:19 +10:00
Paul Mackerras	0ce49a3945	Merge branch 'linux-2.6'	2007-09-20 10:09:27 +10:00
Linus Torvalds	a78feb7c8a	Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6 * 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6: [XFS] Avoid replaying inode buffer initialisation log items if on-disk version is newer. [XFS] Ensure file size updates have been completed before writing inode to disk. [XFS] On-demand reaping of the MRU cache	2007-09-19 11:40:13 -07:00
Eric Sandeen	ef2b02d3e6	ext34: ensure do_split leaves enough free space in both blocks The do_split() function for htree dir blocks is intended to split a leaf block to make room for a new entry. It sorts the entries in the original block by hash value, then moves the last half of the entries to the new block - without accounting for how much space this actually moves. (IOW, it moves half of the entry count not half of the entry space). If by chance we have both large & small entries, and we move only the smallest entries, and we have a large new entry to insert, we may not have created enough space for it. The patch below stores each record size when calculating the dx_map, and then walks the hash-sorted dx_map, calculating how many entries must be moved to more evenly split the existing entries between the old block and the new block, guaranteeing enough space for the new entry. The dx_map "offs" member is reduced to u16 so that the overall map size does not change - it is temporarily stored at the end of the new block, and if it grows too large it may be overwritten. By making offs and size both u16, we won't grow the map size. Also add a few comments to the functions involved. This fixes the testcase reported by hooanon05@yahoo.co.jp on the linux-ext4 list, "ext3 dir_index causes an error" Thanks to Andreas Dilger for discussing the problem & solution with me. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Andreas Dilger <adilger@clusterfs.com> Tested-by: Junjiro Okajima <hooanon05@yahoo.co.jp> Cc: Theodore Ts'o <tytso@mit.edu> Cc: <linux-ext4@vger.kernel.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-19 11:24:18 -07:00
Alexey Dobriyan	49af7ee181	nfs: fix oops re sysctls and V4 support NFS unregisters sysctls only if V4 support is compiled in. However, sysctl table is not V4 specific, so unregister it always. Steps to reproduce: [build nfs.ko with CONFIG_NFS_V4=n] modrobe nfs rmmod nfs ls /proc/sys Unable to handle kernel paging request at ffffffff880661c0 RIP: [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350 PGD 203067 PUD 207063 PMD 7e216067 PTE 0 Oops: 0000 [1] SMP CPU 1 Modules linked in: lockd nfs_acl sunrpc Pid: 3335, comm: ls Not tainted 2.6.23-rc3-bloat #2 RIP: 0010:[<ffffffff802af8e3>] [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350 RSP: 0018:ffff81007fd93e78 EFLAGS: 00010286 RAX: ffffffff880661c0 RBX: ffffffff80466370 RCX: ffffffff880661c0 RDX: 00000000000014c0 RSI: ffff81007f3ad020 RDI: ffff81007efd8b40 RBP: 0000000000000018 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: ffffffff802a8570 R12: ffffffff880661c0 R13: ffff81007e219640 R14: ffff81007efd8b40 R15: ffff81007ded7280 FS: 00002ba25ef03060(0000) GS:ffff81007ff81258(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: ffffffff880661c0 CR3: 000000007dfaf000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process ls (pid: 3335, threadinfo ffff81007fd92000, task ffff81007d8a0000) Stack: ffff81007f3ad150 ffffffff80283f30 ffff81007fd93f48 ffff81007efd8b40 ffff81007ee00440 0000000422222222 0000000200035593 ffffffff88037e9a 2222222222222222 ffffffff80466500 ffff81007e416400 ffff81007e219640 Call Trace: [<ffffffff80283f30>] filldir+0x0/0xf0 [<ffffffff80283f30>] filldir+0x0/0xf0 [<ffffffff802840c7>] vfs_readdir+0xa7/0xc0 [<ffffffff80284376>] sys_getdents+0x96/0xe0 [<ffffffff8020bb3e>] system_call+0x7e/0x83 Code: 41 8b 14 24 85 d2 74 dc 49 8b 44 24 08 48 85 c0 74 e7 49 3b RIP [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350 RSP <ffff81007fd93e78> CR2: ffffffff880661c0 Kernel panic - not syncing: Fatal exception Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-19 11:24:18 -07:00
Eric Sandeen	3d82abae95	dir_index: error out instead of BUG on corrupt dx dirs Convert asserts (BUGs) in dx_probe from bad on-disk data to recoverable errors with helpful warnings. With help catching other asserts from Duane Griffin <duaneg@dghda.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Acked-by: Duane Griffin <duaneg@dghda.com> Acked-by: Theodore Ts'o <tytso@mit.edu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-19 11:24:18 -07:00
Michael Ellerman	e55014923e	[POWERPC] spufs: Cleanup ELF coredump extra notes logic To start with, arch_notes_size() etc. is a little too ambiguous a name for my liking, so change the function names to be more explicit. Calling through macros is ugly, especially with hidden parameters, so don't do that, call the routines directly. Use ARCH_HAVE_EXTRA_ELF_NOTES as the only flag, and based on it decide whether we want the extern declarations or the empty versions. Since we have empty routines, actually use them in the coredump code to save a few #ifdefs. We want to change the handling of foffset so that the write routine updates foffset as it goes, instead of using file->f_pos (so that writing to a pipe works). So pass foffset to the write routine, and for now just set it to file->f_pos at the end of writing. It should also be possible for the write routine to fail, so change it to return int and treat a non-zero return as failure. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-09-19 15:12:19 +10:00
Lachlan McIlroy	b394e43e99	[XFS] Avoid replaying inode buffer initialisation log items if on-disk version is newer. SGI-PV: 969656 SGI-Modid: xfs-linux-melb:xfs-kern:29676a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2007-09-18 20:16:00 +10:00
Lachlan McIlroy	776a75fa5c	[XFS] Ensure file size updates have been completed before writing inode to disk. SGI-PV: 968767 SGI-Modid: xfs-linux-melb:xfs-kern:29675a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2007-09-18 20:12:51 +10:00
David Chinner	65de556756	[XFS] On-demand reaping of the MRU cache Instead of running the mru cache reaper all the time based on a timeout, we should only run it when the cache has active objects. This allows CPUs to sleep when there is no activity rather than be woken repeatedly just to check if there is anything to do. SGI-PV: 968554 SGI-Modid: xfs-linux-melb:xfs-kern:29305a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2007-09-17 16:42:02 +10:00
Jeff Garzik	a2ca44c30d	Merge branch 'fixes-jgarzik' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into upstream-fixes	2007-09-15 19:29:07 -04:00
Masakazu Mokuno	53c5725581	As struct iw_point is bi-directional payload, we should copy back the content on return from ioctl calls Signed-off-by: Masakazu Mokuno <mokuno@sm.sony.co.jp> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-09-14 14:35:38 -04:00
Linus Torvalds	577107e8e4	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: ocfs2: Fix calculation of i_blocks during truncate [PATCH] ocfs2: Fix a wrong cluster calculation. [PATCH] ocfs2: fix mount option parsing ocfs2: update docs for new features	2007-09-11 17:23:16 -07:00
Pavel Emelyanov	0e2f6db88a	Leases can be hidden by flocks The inode->i_flock list contains the leases, flocks and posix locks in the specified order. However, the flocks are added in the head of this list thus hiding the leases from F_GETLEASE command, from time_out_leases() and other code that expects the leases to come first. The following example will demonstrate this: #define _GNU_SOURCE #include <unistd.h> #include <fcntl.h> #include <stdio.h> #include <sys/file.h> static void show_lease(int fd) { int res; res = fcntl(fd, F_GETLEASE); switch (res) { case F_RDLCK: printf("Read lease\n"); break; case F_WRLCK: printf("Write lease\n"); break; case F_UNLCK: printf("No leases\n"); break; default: printf("Some shit\n"); break; } } int main(int argc, char **argv) { int fd, res; fd = open(argv[1], O_RDONLY); if (fd == -1) { perror("Can't open file"); return 1; } res = fcntl(fd, F_SETLEASE, F_WRLCK); if (res == -1) { perror("Can't set lease"); return 1; } show_lease(fd); if (flock(fd, LOCK_SH) == -1) { perror("Can't flock shared"); return 1; } show_lease(fd); return 0; } The first call to show_lease() will show the write lease set, but the second will show no leases. Fix the flock adding so that the leases always stay in the head of this list. Found during making the flocks pid-namespaces aware. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-11 17:21:27 -07:00
Alexey Dobriyan	dd23aae4f5	Fix select on /proc files without ->poll Taneli Vähäkangas <vahakang@cs.helsinki.fi> reported that commit `786d7e1612` aka "Fix rmmod/read/write races in /proc entries" broke SBCL + SLIME combo. The old code in do_select() used DEFAULT_POLLMASK, if couldn't find ->poll handler. The new code makes ->poll always there and returns 0 by default, which is not correct. Return DEFAULT_POLLMASK instead. Steps to reproduce: install emacs, SBCL, SLIME emacs M-x slime in inferior-lisp buffer [watch it doing "Connecting to Swank on port X.."] Please, apply before 2.6.23. P.S.: why SBCL can't just read(2) /proc/cpuinfo is a mystery. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: T Taneli Vahakangas <vahakang@cs.helsinki.fi> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-11 17:21:20 -07:00
Andreas Gruenbacher	1a1a1a758b	afs: mntput called before dput dput must be called before mntput here. Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Acked-By: David Howells <dhowells@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-11 17:21:19 -07:00
Jan Kara	9c3013e9b9	quota: fix infinite loop If we fail to start a transaction when releasing dquot, we have to call dquot_release() anyway to mark dquot structure as inactive. Otherwise we end in an infinite loop inside dqput(). Signed-off-by: Jan Kara <jack@suse.cz> Cc: xb <xavier.bru@bull.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-11 17:21:19 -07:00
Mark Fasheh	e535e2efd2	ocfs2: Fix calculation of i_blocks during truncate We were setting i_blocks too early - before truncating any allocation. Correct things to set i_blocks after the allocation change. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-09-11 11:39:46 -07:00
tao.ma@oracle.com	30b8548f2c	[PATCH] ocfs2: Fix a wrong cluster calculation. In ocfs2_alloc_write_write_ctxt, the written clusters length is calculated by the byte length only. This may cause some problems if we start to write at some position in the end of one cluster and last to a second cluster while the "len" is smaller than a cluster size. In that case, we have to write 2 clusters actually. So we have to take the start position into consideration also. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-09-11 11:39:05 -07:00
Tiger Yang	c0123adef6	[PATCH] ocfs2: fix mount option parsing For some mount option types, ocfs2_parse_options() will try to access sb->s_fs_info to get at the ocfs2 private superblock. Unfortunately, that hasn't been allocated yet and will cause a kernel crash. Fix this by storing options in a struct which can then get pushed into the ocfs2_super once it's been allocated later. If we need more options which store to the ocfs2_super in the future, we can just fields to this struct. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-09-11 11:38:48 -07:00
Mark Fasheh	10b0845bed	ocfs2: update docs for new features Update documentation listing ocfs2 features to reflect the current state of the file system. Add missing descriptions for some mount options which ocfs2 supports. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-09-11 11:38:25 -07:00
Neil Brown	b8da0d1c27	knfsd: Validate filehandle type in fsid_source fsid_source decided where to get the 'fsid' number to return for a GETATTR based on the type of filehandle. It can be from the device, from the fsid, or from the UUID. It is possible for the filehandle to be inconsistent with the export information, so make sure the export information actually has the info implied by the value returned by fsid_source. Signed-off-by: Neil Brown <neilb@suse.de> Cc: "Luiz Fernando N. Capitulino" <lcapitulino@gmail.com> Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-10 18:57:47 -07:00
Neil Brown	a1033be72c	knfsd: Fixed problem with NFS exporting directories which are mounted on. Recent changes in NFSd cause a directory which is mounted-on to not appear properly when the filesystem containing it is exported. exp_get now returns -ENOENT rather than NULL and when commit `5d3dbbeaf5` removed the NULL checks, it didn't add a check for -ENOENT. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-10 18:57:47 -07:00
Eric Sandeen	5995cb7d80	[XFS] fix nasty quota hashtable allocation bug This git mod: `77e4635ae1` converted to a "greedy" allocation interface, but for the quota hashtables it switched from allocating XFS_QM_HASHSIZE (nr of elements) xfs_dqhash_t's to allocating only XFS_QM_HASHSIZE bytes - quite a lot smaller! Then when we converted hsize "back" to nr of elements (the division line) hsize went to 0. This was leading to oopses when running any quota tests on the Fedora 8 test kernel, but the problem has been there for almost a year. SGI-PV: 968837 SGI-Modid: xfs-linux-melb:xfs-kern:29354a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2007-09-05 14:51:04 +10:00
Christoph Hellwig	265c1fac38	[XFS] fix sparse shadowed variable warnings - in xfs_probe_cluster rename the inner len to pg_len. There's no harm here because the outer len isn't used after the inner len comes into existence but it keeps the code clean. - in xfs_da_do_buf remove the inner i because they don't overlap and they are both the same type. SGI-PV: 968555 SGI-Modid: xfs-linux-melb:xfs-kern:29311a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2007-09-05 14:50:26 +10:00
Christoph Hellwig	ee5c80239d	[XFS] fix ASSERT and ASSERT_ALWAYS - remove the != 0 inside the unlikely in ASSERT_ALWAYS because sparse now complains about comparisons between pointers and 0 - add a standalone ASSERT implementation because defining it to ASSERT_ALWAYS means the string is expanded before the token passing stringification. This way we get the actual content of the assertion in the assfail message and don't overflow sparse's stringification buffer leading to sparse error messages. SGI-PV: 968555 SGI-Modid: xfs-linux-melb:xfs-kern:29310a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2007-09-05 14:49:30 +10:00
Christoph Hellwig	34521c5e49	[XFS] Fix sparse warning in kmem_shake_allow We can't return a masked result of a __bitwise type. Compare it to 0 first to keep the behaviour without the warning. SGI-PV: 968555 SGI-Modid: xfs-linux-melb:xfs-kern:29309a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2007-09-05 14:48:00 +10:00
Christoph Hellwig	4b80916b29	[XFS] Fix sparse NULL vs 0 warnings Sparse now warns about comparing pointers to 0, so change all instance where that happens to NULL instead. SGI-PV: 968555 SGI-Modid: xfs-linux-melb:xfs-kern:29308a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2007-09-05 14:47:33 +10:00
David Chinner	8da22d7a36	[XFS] Set filestreams object timeout to something sane. SGI-PV: 968554 SGI-Modid: xfs-linux-melb:xfs-kern:29303a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com>	2007-09-05 14:47:10 +10:00
Linus Torvalds	b1330031b7	Merge branch 'for_linus' of git://git.linux-nfs.org/pub/linux/nfs-2.6	2007-09-04 00:45:54 -07:00
Jason Lunz	fc0e01974c	[JFFS2] fix write deadlock regression I've bisected the deadlock when many small appends are done on jffs2 down to this commit: commit `6fe6900e1e` Author: Nick Piggin <npiggin@suse.de> Date: Sun May 6 14:49:04 2007 -0700 mm: make read_cache_page synchronous Ensure pages are uptodate after returning from read_cache_page, which allows us to cut out most of the filesystem-internal PageUptodate calls. I didn't have a great look down the call chains, but this appears to fixes 7 possible use-before uptodate in hfs, 2 in hfsplus, 1 in jfs, a few in ecryptfs, 1 in jffs2, and a possible cleared data overwritten with readpage in block2mtd. All depending on whether the filler is async and/or can return with a !uptodate page. It introduced a wait to read_cache_page, as well as a read_cache_page_async function equivalent to the old read_cache_page without any callers. Switching jffs2_gc_fetch_page to read_cache_page_async for the old behavior makes the deadlocks go away, but maybe reintroduces the use-before-uptodate problem? I don't understand the mm/fs interaction well enough to say. [It's fine. dwmw2.] Signed-off-by: Jason Lunz <lunz@falooley.org> Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2007-09-02 18:18:38 +01:00
Trond Myklebust	1b3b4a1a2d	NFS: Fix a write request leak in nfs_invalidate_page() Ryusuke Konishi says: The recent truncate_complete_page() clears the dirty flag from a page before calling a_ops->invalidatepage(), ^^^^^^ static void truncate_complete_page(struct address_space mapping, struct page page) { ... cancel_dirty_page(page, PAGE_CACHE_SIZE); <--- Inserted here at kernel 2.6.20 if (PagePrivate(page)) do_invalidatepage(page, 0); ---> will call a_ops->invalidatepage() ... } and this is disturbing nfs_wb_page_priority() from calling nfs_writepage_locked() that is expected to handle the pending request (=nfs_page) associated with the page. int nfs_wb_page_priority(struct inode inode, struct page page, int how) { ... if (clear_page_dirty_for_io(page)) { ret = nfs_writepage_locked(page, &wbc); if (ret < 0) goto out; } ... } Since truncate_complete_page() will get rid of the page after a_ops->invalidatepage() returns, the request (=nfs_page) associated with the page becomes a garbage in nfs_inode->nfs_page_tree. ------------------------ Fix this by ensuring that nfs_wb_page_priority() recognises that it may also need to clear out non-dirty pages that have an nfs_page associated with them. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-09-01 10:14:54 -04:00
Chuck Lever	7d1cca7299	NFS: change NFS mount error return when hostname/pathname too long According to the mount(2) man page, the proper error return code for the mount(2) system call when the special device name or the mounted-on directory name is too long is ENAMETOOLONG. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-09-01 10:14:40 -04:00
Chuck Lever	350c73af6a	NFS: Off-by-one length error in string handling The hostname was getting truncated in the new text-based NFS mount API. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-09-01 10:14:40 -04:00
Chuck Lever	fdc6e2c8c0	NFS: Return a real error code from mount(2) Don't filter the return code from the in-kernel rpcbind or NFS mount clients. Return the real error code so that callers of the new NFS text-based mount API can apply a useful retry strategy. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-09-01 10:14:39 -04:00
Chuck Lever	fdb66ff4ac	NFS: mount option parser chokes on proto= The new text-based NFS mount option parsing logic doesn't recognize any valid transport protocols due to a silly mistake in the protocol token matching logic. This prevents basic mount requests such as: mount.nfs server:/export /mnt -o proto=tcp from working with the new text-based NFS mount API. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-09-01 10:14:38 -04:00
Trond Myklebust	deee9369b9	NFSv4: Ensure that we pass the correct dentry to nfs4_intent_set_file This patch fixes an Oops that was reported by Gabriel Barazer. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-09-01 10:14:38 -04:00
Trond Myklebust	65bbf6bdbb	NFSv4: Fix a typo in _nfs4_do_open_reclaim This should fix the following Oops reported by Jeff Garzik: kernel BUG at fs/nfs/nfs4xdr.c:1040! invalid opcode: 0000 [1] SMP CPU 0 Modules linked in: nfs lockd sunrpc af_packet ipv6 cpufreq_ondemand acpi_cpufreq battery floppy nvram sg snd_hda_intel ata_generic snd_pcm_oss snd_mixer_oss snd_pcm i2c_i801 snd_page_alloc e1000 firewire_ohci ata_piix i2c_core sr_mod cdrom sata_sil ahci libata sd_mod scsi_mod ext3 jbd ehci_hcd uhci_hcd Pid: 16353, comm: 10.10.10.1-recl Not tainted 2.6.23-rc3 #1 RIP: 0010:[<ffffffff88240980>] [<ffffffff88240980>] :nfs:encode_open+0x1c0/0x330 RSP: 0018:ffff8100467c5c60 EFLAGS: 00010202 RAX: ffff81000f89b8b8 RBX: 00000000697a6f6d RCX: ffff81000f89b8b8 RDX: 0000000000000004 RSI: 0000000000000004 RDI: ffff8100467c5c80 RBP: ffff8100467c5c80 R08: ffff81000f89bc30 R09: ffff81000f89b83f R10: 0000000000000001 R11: ffffffff881e79e0 R12: ffff81003cbd1808 R13: ffff81000f89b860 R14: ffff81005fc984e0 R15: ffffffff88240af0 FS: 0000000000000000(0000) GS:ffffffff8052a000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 00002adb9e51a030 CR3: 000000007ea7e000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process 10.10.10.1-recl (pid: 16353, threadinfo ffff8100467c4000, task ffff8100038ce780) Stack: ffff81004aeb6a40 ffff81003cbd1808 ffff81003cbd1808 ffffffff88240b5d ffff81000f89b8bc ffff81005fc984e8 ffff81000f89bc30 ffff81005fc984e8 0000000300000000 0000000000000000 0000000000000000 ffff81003cbd1800 Call Trace: [<ffffffff88240b5d>] :nfs:nfs4_xdr_enc_open_noattr+0x6d/0x90 [<ffffffff881e74b7>] :sunrpc:rpcauth_wrap_req+0x97/0xf0 [<ffffffff88240af0>] :nfs:nfs4_xdr_enc_open_noattr+0x0/0x90 [<ffffffff881df57a>] :sunrpc:call_transmit+0x18a/0x290 [<ffffffff881e5e7b>] :sunrpc:__rpc_execute+0x6b/0x290 [<ffffffff881dff76>] :sunrpc:rpc_do_run_task+0x76/0xd0 [<ffffffff882373f6>] :nfs:_nfs4_proc_open+0x76/0x230 [<ffffffff88237a2e>] :nfs:nfs4_open_recover_helper+0x5e/0xc0 [<ffffffff88237b74>] :nfs:nfs4_open_recover+0xe4/0x120 [<ffffffff88238e14>] :nfs:nfs4_open_reclaim+0xa4/0xf0 [<ffffffff882413c5>] :nfs:nfs4_reclaim_open_state+0x55/0x1b0 [<ffffffff882417ea>] :nfs:reclaimer+0x2ca/0x390 [<ffffffff88241520>] :nfs:reclaimer+0x0/0x390 [<ffffffff8024e59b>] kthread+0x4b/0x80 [<ffffffff8020cad8>] child_rip+0xa/0x12 [<ffffffff8024e550>] kthread+0x0/0x80 [<ffffffff8020cace>] child_rip+0x0/0x12 Code: 0f 0b eb fe 48 89 ef c7 00 00 00 00 02 be 08 00 00 00 e8 79 RIP [<ffffffff88240980>] :nfs:encode_open+0x1c0/0x330 RSP <ffff8100467c5c60> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-09-01 10:14:37 -04:00
Trond Myklebust	560aef7450	NFS: Fix use of cancel_delayed_work_sync in nfs_release_automount_timer Doh! We can't use cancel_delayed_work_sync because we may have been called from an unmount that was being performed by nfs_automount_task. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-09-01 10:14:36 -04:00
Trond Myklebust	e89a5a43b9	NFS: Fix the mount regression This avoids the recent NFS mount regression (returning EBUSY when mounting the same filesystem twice with different parameters). The best I can do given the constraints appears to be to have the kernel first look for a superblock that matches both the fsid and the user-specified mount options, and then spawn off a new superblock if that search fails. Note that this is not the same as specifying nosharecache everywhere since nosharecache will never attempt to match an existing superblock. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Hua Zhong <hzhong@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-08-31 20:26:45 -07:00
David Gibson	dec4ad86c2	hugepage: fix broken check for offset alignment in hugepage mappings For hugepage mappings, the file offset, like the address and size, needs to be aligned to the size of a hugepage. In commit `68589bc353`, the check for this was moved into prepare_hugepage_range() along with the address and size checks. But since BenH's rework of the get_unmapped_area() paths leading up to commit `4b1d89290b`, prepare_hugepage_range() is only called for MAP_FIXED mappings, not for other mappings. This means we're no longer ever checking for an aligned offset - I've confirmed that mmap() will (apparently) succeed with a misaligned offset on both powerpc and i386 at least. This patch restores the check, removing it from prepare_hugepage_range() and putting it back into hugetlbfs_file_mmap(). I'm putting it there, rather than in the get_unmapped_area() path so it only needs to go in one place, than separately in the half-dozen or so arch-specific implementations of hugetlb_get_unmapped_area(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Cc: Adam Litke <agl@us.ibm.com> Cc: Andi Kleen <ak@suse.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-08-31 01:42:23 -07:00
Ryusuke Konishi	2aeb3db17f	eCryptfs: fix possible fault in ecryptfs_sync_page This will avoid a possible fault in ecryptfs_sync_page(). In the function, eCryptfs calls sync_page() method of a lower filesystem without checking its existence. However, there are many filesystems that don't have this method including network filesystems such as NFS, AFS, and so forth. They may fail when an eCryptfs page is waiting for lock. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Acked-by: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-08-31 01:42:23 -07:00
Jan Kara	f5cc15dac5	Fix possible NULL pointer dereference in udf_table_free_blocks() Fix possible NULL pointer dereference when freeing blocks in case table of free space is used. Also fix handling of the case when we need to move extent from one block to another one to make space for indirect extent. BTW: Nobody seem to have ever used this code. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-08-31 01:42:22 -07:00
Jan Kara	bcec44770c	UDF: handle wrong superblock better If UDF superblock is incorrect, we can fail to find a table of free / allocated space and consequently Oops. Handle this situation more gracefully by ignoring the broken UDF partition. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-08-31 01:42:22 -07:00
Andrew Morton	060d11b0b3	revert "eCryptfs: fix lookup error for special files" This patch got appied twice. Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-08-31 01:42:22 -07:00
Linus Torvalds	d0797b39dc	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: sched: tweak the sched_runtime_limit tunable sched: skip updating rq's next_balance under null SD sched: fix broken SMT/MC optimizations sched: accounting regression since rc1 sched: fix sysctl directory permissions sched: sched_clock_idle_[sleep\|wakeup]_event()	2007-08-23 21:38:39 -07:00
Linus Torvalds	0542170dec	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: 9p: fix bad error path in conversion routines 9p: remove deprecated v9fs_fid_lookup_remove() 9p: update maintainers and documentation 9p: fix use after free	2007-08-23 21:38:21 -07:00
Linus Torvalds	de80af4cc9	Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: sysfs: don't warn on removal of a nonexistent binary file HOWTO: latest lxr url address changed HOWTO: korean translation of Documentation/HOWTO Fix Off-by-one in /sys/module/*/refcnt sysfs: fix locking in sysfs_lookup() and sysfs_rename_dir()	2007-08-23 21:34:43 -07:00
Eric Van Hensbergen	fbcb7599e4	9p: remove deprecated v9fs_fid_lookup_remove() This patch removes the v9fs_fid_lookup_remove which is no longer used. Based on original patch from Adrian Bunk <bunk@stusta.de> which used #if 0 to isolate the code. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2007-08-23 10:13:45 -05:00
Christian Borntraeger	efe567fc82	sched: accounting regression since rc1 Fix the accounting regression for CONFIG_VIRT_CPU_ACCOUNTING. It reverts parts of commit `b27f03d4bd` by converting fs/proc/array.c back to cputime_t. The new functions task_utime and task_stime now return cputime_t instead of clock_t. If CONFIG_VIRT_CPU_ACCOUTING is set, task->utime and task->stime are returned directly instead of using sum_exec_runtime. Patch is tested on s390x with and without VIRT_CPU_ACCOUTING as well as on i386. [ mingo@elte.hu: cleanups, comments. ] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-08-23 15:18:02 +02:00
David Woodhouse	ac0c955d50	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6	2007-08-23 10:43:14 +01:00
Oleg Nesterov	abd96ecb29	exec: kill unsafe BUG_ON(sig->count) checks de_thread: if (atomic_read(&oldsighand->count) <= 1) BUG_ON(atomic_read(&sig->count) != 1); This is not safe without the rmb() in between. The results of two correctly ordered __exit_signal()->atomic_dec_and_test()'s could be seen out of order on our CPU. The same is true for the "thread_group_empty()" case, __unhash_process()'s changes could be seen before atomic_dec_and_test(&sig->count). On some platforms (including i386) atomic_read() doesn't provide even the compiler barrier, in that case these checks are simply racy. Remove these BUG_ON()'s. Alternatively, we can do something like BUG_ON( ({ smp_rmb(); atomic_read(&sig->count) != 1; }) ); Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-08-22 19:52:47 -07:00
Ian Kent	1864f7bd58	autofs4: deadlock during create Due to inconsistent locking in the VFS between calls to lookup and revalidate deadlock can occur in the automounter. The inconsistency is that the directory inode mutex is held for both lookup and revalidate calls when called via lookup_hash whereas it is held only for lookup during a path walk. Consequently, if the mutex is held during a call to revalidate autofs4 can't release the mutex to callback the daemon as it can't know whether it owns the mutex. This situation happens when a process tries to create a directory within an automount and a second process also tries to create the same directory between the lookup and the mkdir. Since the first process has dropped the mutex for the daemon callback, the second process takes it during revalidate leading to deadlock between the autofs daemon and the second process when the daemon tries to create the mount point directory. After spending quite a bit of time trying to resolve this on more than one occassion, using rather complex and ulgy approaches, it turns out that just delaying the hashing of the dentry until the create operation works fine. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-08-22 19:52:46 -07:00
Oleg Nesterov	f9ee228bdc	signalfd: make it group-wide, fix posix-timers scheduling With this patch any thread can dequeue its own private signals via signalfd, even if it was created by another sub-thread. To do so, we pass "current" to dequeue_signal() if the caller is from the same thread group. This also fixes the scheduling of posix timers broken by the previous patch. If the caller doesn't belong to this thread group, we can't handle __SI_TIMER case properly anyway. Perhaps we should forbid the cross-process signalfd usage and convert ctx->tsk to ctx->sighand. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Cc: Roland McGrath <roland@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-08-22 19:52:46 -07:00
Ryusuke Konishi	df06846416	eCryptfs: fix lookup error for special files When ecryptfs_lookup() is called against special files, eCryptfs generates the following errors because it tries to treat them like regular eCryptfs files. Error opening lower file for lower_dentry [0xffff810233a6f150], lower_mnt [0xffff810235bb4c80], and flags [0x8000] Error opening lower_file to read header region Error attempting to read the [user.ecryptfs] xattr from the lower file; return value = [-95] Valid metadata not found in header region or xattr region; treating file as unencrypted For instance, the problem can be reproduced by the steps below. # mkdir /root/crypt /mnt/crypt # mount -t ecryptfs /root/crypt /mnt/crypt # mknod /mnt/crypt/c0 c 0 0 # umount /mnt/crypt # mount -t ecryptfs /root/crypt /mnt/crypt # ls -l /mnt/crypt This patch fixes it by adding a check similar to directories and symlinks. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Acked-by: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-08-22 19:52:44 -07:00
Alan Stern	5f1835da79	sysfs: don't warn on removal of a nonexistent binary file This patch (as960) removes the error message and stack dump logged by sysfs_remove_bin_file() when someone tries to remove a nonexistent file. The warning doesn't seem to be needed, since none of the other file-, symlink-, or directory-removal routines in sysfs complain in a comparable way. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Tejun Heo <htejun@gmail.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-08-22 14:35:36 -07:00
Tejun Heo	6cb52147b2	sysfs: fix locking in sysfs_lookup() and sysfs_rename_dir() sd children list walking in sysfs_lookup() and sd renaming in sysfs_rename_dir() were left out during i_mutex -> sysfs_mutex conversion. Fix them. Signed-off-by: Tejun Heo <htejun@gmail.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-08-22 14:35:34 -07:00
Andrew Morton	f4e35647f5	[JFFS2] fix printk warning in jffs2_block_check_erase() fs/jffs2/erase.c: In function 'jffs2_block_check_erase': fs/jffs2/erase.c:355: warning: format '%08x' expects type 'unsigned int', but argument 3 has type 'long unsigned int' and fs/jffs2/erase.c: In function 'jffs2_erase_pending_blocks': fs/jffs2/erase.c:404: warning: 'bad_offset' may be used uninitialized in this function Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2007-08-22 12:41:48 +01:00
David Woodhouse	9ed437c50d	[JFFS2] Fix ACL vs. mode handling. When POSIX ACL support was enabled, we weren't writing correct legacy modes to the medium on inode creation, or when the ACL was set. This meant that the permissions would be incorrect after the file system was remounted. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2007-08-22 12:39:19 +01:00
Zach Brown	848c4dd515	dio: zero struct dio with kzalloc instead of manually This patch uses kzalloc to zero all of struct dio rather than manually trying to track which fields we rely on being zero. It passed aio+dio stress testing and some bug regression testing on ext3. This patch was introduced by Linus in the conversation that lead up to Badari's minimal fix to manually zero .map_bh.b_state in commit: `6a648fa721` It makes the code a bit smaller. Maybe a couple fewer cachelines to load, if we're lucky: text data bss dec hex filename 3285925 568506 1304616 `5159047` 4eb887 vmlinux 3285797 568506 1304616 5158919 4eb807 vmlinux.patched I was unable to measure a stable difference in the number of cpu cycles spent in blockdev_direct_IO() when pushing aio+dio 256K reads at ~340MB/s. So the resulting intent of the patch isn't a performance gain but to avoid exposing ourselves to the risk of finding another field like .map_bh.b_state where we rely on zeroing but don't enforce it in the code. Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-08-20 22:50:25 -07:00
David Woodhouse	b574864333	JFFS2 locking regression fix. Commit `a491486a20` introduced a locking problem in JFFS2 -- we up() the alloc_sem when we weren't previously holding it. This leads to all kinds of fun behaviour later. There was a _reason_ for the if (1 /* alternative path needs testing */ \|\| which the above-mentioned commit removed :) Discovered and debugged by Giulio Fedel <giulio.fedel@andorsystems.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-08-20 22:44:27 -07:00
Linus Torvalds	edd5f25f74	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: [CIFS] Check return code on failed alloc [CIFS] Update CIFS project web site [CIFS] Fix hang in find_writable_file	2007-08-18 09:30:07 -07:00
Marcel Holtmann	d2d56c5f51	Reset current->pdeath_signal on SUID binary execution This fixes a vulnerability in the "parent process death signal" implementation discoverd by Wojciech Purczynski of COSEINC PTE Ltd. and iSEC Security Research. http://marc.info/?l=bugtraq&m=118711306802632&w=2 Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-08-18 09:29:07 -07:00
Cyrill Gorcunov	5e6e623275	[CIFS] Check return code on failed alloc Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-08-18 00:15:20 +00:00
Steven Whitehouse	d18c4d687d	[GFS2] Revert remounting w/o acl option leaves acls enabled This reverts commit `569a7b6c2e`. The code was correct originally. The default setting for ACLs after a remount should be to be the same as before the remount. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-08-14 10:34:40 +01:00
Steven Whitehouse	b9af7ca6d3	[GFS2] Fix setting of inherit jdata attr Due to a mix up between the jdata attribute and inherit jdata attribute it has not been possible to set the inherit jdata attribute on directories. This is now fixed and the ioctl will report the inherit jdata attribute for directories rather than the jdata attribute as it did previously. This stems from our need to have the one bit in the ioctl attr flags mean two different things according to whether the underlying inode is a directory or not. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-08-14 10:34:11 +01:00
Steven Whitehouse	a867bb28c1	[GFS2] Fix incorrect error path in prepare_write() The error path in prepare_write() was incorrect in the (very rare) event that the transaction fails to start. The following prevents a NULL pointer dereference, Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-08-14 10:33:44 +01:00
Steven Whitehouse	6eefaf61f6	[GFS2] Fix incorrect return code in rgrp.c The following patch fixes a bug where 0 was being used as a return code to indicate "nothing to do" when in fact 0 was a valid block location which might be returned by the function. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-08-14 10:33:15 +01:00
Bob Peterson	24c7387333	[GFS2] soft lockup in rgblk_search This patch seems to fix the problem described in bugzilla bug 246114. It was written by Steve Whitehouse with some tweaking by me. The code was looping in the relatively new section of code designed to search for and reuse unlinked inodes. In cases where it was finding an appropriate inode to reuse, it was looping around and finding the same block over and over because a "<=" check should have been a "<" when comparing the goal block to the last unlinked block found. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-08-14 10:32:43 +01:00
Bob Peterson	bdcb88562c	[GFS2] soft lockup detected in databuf_lo_before_commit This is part 2 of the patch for bug #245832, part 1 of which is already in the git tree. The problem was that sdp->sd_log_num_databuf was not always being protected by the gfs2_log_lock spinlock, but the sd_log_le_databuf (which it is supposed to reflect) was protected. That meant there was a timing window during which gfs2_log_flush called databuf_lo_before_commit and the count didn't match what was really on the linked list in that window. So when it ran out of items on the linked list, it decremented total_dbuf from 0 to -1 and thus never left the "while(total_dbuf)" loop. The solution is to protect the variable sdp->sd_log_num_databuf so that the value will always match the contents of the linked list, and therefore the number will never go negative, and therefore, the loop will be exited properly. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-08-14 10:32:04 +01:00
David Teigland	3650925893	[DLM] fix basts for granted PR waiting CW Fix a long standing bug where a blocking callback would be missed when there's a granted lock in PR mode and waiting locks in both PR and CW modes (and the PR lock was added to the waiting queue before the CW lock). The logic simply compared the numerical values of the modes to determine if a blocking callback was required, but in the one case of PR and CW, the lower valued CW mode blocks the higher valued PR mode. We just need to add a special check for this PR/CW case in the tests that decide when a blocking callback is needed. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-08-14 10:31:02 +01:00
Patrick Caulfield	9e5f2825a8	[DLM] More othercon fixes The last patch to clean out 'othercon' structures only fixed half the problem. The attached addresses the other situations too, and fixes bz#238490 Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-08-14 10:30:36 +01:00
Jesper Juhl	1a2bf2eefb	[DLM] Fix memory leak in dlm_add_member() when dlm_node_weight() returns less than zero There's a memory leak in fs/dlm/member.c::dlm_add_member(). If "dlm_node_weight(ls->ls_name, nodeid)" returns < 0, then we'll return without freeing the memory allocated to the (at that point yet unused) 'memb'. This patch frees the allocated memory in that case and thus avoids the leak. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-08-14 10:30:04 +01:00
Patrick Caulfield	01c8cab258	[DLM] zero unused parts of sockaddr_storage When we build a sockaddr_storage for an IP address, clear the unused parts as they could be used for node comparisons. I have seen this occasionally make sctp connections fail. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-08-14 10:29:27 +01:00
David Teigland	41684f9547	[DLM] fix NULL ls usage Fix regression in recent patch "[DLM] variable allocation" which attempts to dereference an "ls" struct when it's NULL. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-08-14 10:28:44 +01:00
Patrick Caulfield	25720c2d73	[DLM] Clear othercon pointers when a connection is closed This patch clears the othercon pointer and frees the memory when a connnection is closed. This could cause a small memory leak when nodes leave the cluster. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-08-14 10:28:05 +01:00
Linus Torvalds	886c818348	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: ocfs2: set non-default s_time_gran during mount ocfs2: Retry sendpage() if it returns EAGAIN ocfs2: Fix rename/extend race [2.6 patch] ocfs2_insert_extent(): remove dead code ocfs2: Fix max offset calculations ocfs2: check ia_size limits in setattr ocfs2: Fix some casting errors related to file writes ocfs2: use s_maxbytes directly in ocfs2_change_file_space() ocfs2: Restrict inode changes in ocfs2_update_inode_atime()	2007-08-11 16:01:34 -07:00
Ryusuke Konishi	a75de1b379	eCryptfs: fix error handling in ecryptfs_init ecryptfs_init() exits without doing any cleanup jobs if ecryptfs_init_messaging() fails. In that case, eCryptfs leaves sysfs entries, leaks memory, and causes an invalid page fault. This patch fixes the problem. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Acked-by: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-08-11 15:47:40 -07:00
Ryusuke Konishi	202a21d691	eCryptfs: fix lookup error for special files When ecryptfs_lookup() is called against special files, eCryptfs generates the following errors because it tries to treat them like regular eCryptfs files. Error opening lower file for lower_dentry [0xffff810233a6f150], lower_mnt [0xffff810235bb4c80], and flags [0x8000] Error opening lower_file to read header region Error attempting to read the [user.ecryptfs] xattr from the lower file; return value = [-95] Valid metadata not found in header region or xattr region; treating file as unencrypted For instance, the problem can be reproduced by the steps below. # mkdir /root/crypt /mnt/crypt # mount -t ecryptfs /root/crypt /mnt/crypt # mknod /mnt/crypt/c0 c 0 0 # umount /mnt/crypt # mount -t ecryptfs /root/crypt /mnt/crypt # ls -l /mnt/crypt This patch fixes it by adding a check similar to directories and symlinks. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Acked-by: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-08-11 15:47:40 -07:00
Badari Pulavarty	6a648fa721	direct-io: fix error-path crashes Need to initialize map_bh.b_state to zero. Otherwise, in case of a faulty user-buffer its possible to go into dio_zero_block() and submit a page by mistake - since it checks for buffer_new(). http://marc.info/?l=linux-kernel&m=118551339032528&w=2 akpm: Linus had a (better) patch to just do a kzalloc() in there, but it got lost. Probably this version is better for -stable anwyay. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Acked-by: Joe Jin <joe.jin@oracle.com> Acked-by: Zach Brown <zach.brown@oracle.com> Cc: gurudas pai <gurudas.pai@oracle.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-08-11 15:47:40 -07:00
Mark Fasheh	e0dceaf0a4	ocfs2: set non-default s_time_gran during mount We need to manually set this to '1' during mount, otherwise inode_setattr() will chop off the nanosecond portion of our timestamps. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:27:58 -07:00
Sunil Mushran	ce17204ae6	ocfs2: Retry sendpage() if it returns EAGAIN Instead of treating EAGAIN, returned from sendpage(), as an error, this patch retries the operation. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:27:38 -07:00
Sunil Mushran	480214d71f	ocfs2: Fix rename/extend race If one process is extending a file while another is renaming it, there exists a window when rename could flush the old inode's stale i_size to disk. This patch recognizes the fact that rename is only updating the old inode's ctime, so it ensures only that value is flushed to disk. Signed-off-by: Sunil Mushran <sunil.musran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:27:10 -07:00
Adrian Bunk	6a18380e7d	[2.6 patch] ocfs2_insert_extent(): remove dead code This patch removes some now dead code. Spotted by the Coverity checker. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:26:03 -07:00
Mark Fasheh	5a25403175	ocfs2: Fix max offset calculations ocfs2_max_file_offset() was over-estimating the largest file size for several cases. This wasn't really a problem before, but now that we support sparse files, it needs to be more accurate. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:25:49 -07:00
Mark Fasheh	ce76fd30ce	ocfs2: check ia_size limits in setattr We have to manually check the requested truncate size as the check in vmtruncate() comes too late for Ocfs2. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:25:38 -07:00
Mark Fasheh	7c08d70c69	ocfs2: Fix some casting errors related to file writes ocfs2_align_clusters_to_page_index() needs to cast the clusters shift to pgoff_t and ocfs2_file_buffered_write() needs loff_t when calculating destination start for memcpy. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:25:27 -07:00
Mark Fasheh	a00cce356b	ocfs2: use s_maxbytes directly in ocfs2_change_file_space() There's no need to recalculate things via ocfs2_max_file_offset() as we've already done that to fill s_maxbytes, so use that instead. We can also un-export ocfs2_max_file_offset() then. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:25:07 -07:00
Mark Fasheh	c11e9fafb3	ocfs2: Restrict inode changes in ocfs2_update_inode_atime() ocfs2_update_inode_atime() calls ocfs2_mark_inode_dirty() to push changes from the struct inode into the ocfs2 disk inode. The problem is, ocfs2_mark_inode_dirty() might change other fields, depending on what happened to the struct inode. Since we don't always have locking to serialize changes to other fields (like i_size, etc), just fix things up to only touch the atime field. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-08-09 17:23:50 -07:00
Linus Torvalds	8b80fc02b8	Merge git://git.linux-nfs.org/pub/linux/nfs-2.6 * git://git.linux-nfs.org/pub/linux/nfs-2.6: SUNRPC: Replace flush_workqueue() with cancel_work_sync() and friends NFS: Replace flush_scheduled_work with cancel_work_sync() and friends SUNRPC: Don't call gss_delete_sec_context() from an rcu context NFSv4: Don't call put_rpccred() from an rcu callback NFS: Fix NFSv4 open stateid regressions NFSv4: Fix a locking regression in nfs4_set_mode_locked() NFS: Fix put_nfs_open_context SUNRPC: Fix a race in rpciod_down()	2007-08-09 08:38:14 -07:00
David Woodhouse	09b3fba562	[JFFS2] Correct cleanmarker checks -- we should use only 8 bytes Commit `a7a6ace140` revamped the OOB handling but accidentally switched to 12-byte cleanmarkers, which is incompatible with what 'flash_eraseall -j' will do. So using flash_eraseall -j and then trying to mount the 'empty' flash will fail, because the cleanmarkers aren't recognised. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2007-08-09 17:28:20 +08:00
Trond Myklebust	3d39c691ff	NFS: Replace flush_scheduled_work with cancel_work_sync() and friends This will avoid deadlocks of the form: stack backtrace: [<c0104fda>] show_trace_log_lvl+0x1a/0x30 [<c0105c02>] show_trace+0x12/0x20 [<c0105d15>] dump_stack+0x15/0x20 [<c013ee42>] __lock_acquire+0xc22/0x1030 [<c013f2b1>] lock_acquire+0x61/0x80 [<c012edd9>] flush_workqueue+0x49/0x70 [<c012ee0d>] flush_scheduled_work+0xd/0x10 [<dcf55c0c>] nfs_release_automount_timer+0x2c/0x30 [nfs] [<dcf45d8e>] nfs_free_server+0x9e/0xd0 [nfs] [<dcf4e626>] nfs_kill_super+0x16/0x20 [nfs] [<c017b38d>] deactivate_super+0x7d/0xa0 [<c018f94b>] mntput_no_expire+0x4b/0x80 [<c018fd94>] expire_mount_list+0xe4/0x140 [<c0191219>] mark_mounts_for_expiry+0x99/0xb0 [<dcf55d1d>] nfs_expire_automounts+0xd/0x40 [nfs] [<c012e61b>] run_workqueue+0x12b/0x1e0 [<c012f05b>] worker_thread+0x9b/0x100 [<c0131c72>] kthread+0x42/0x70 [<c0104c0f>] kernel_thread_helper+0x7/0x18 ======================= Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-08-07 16:12:50 -04:00
Trond Myklebust	905f8d16e3	NFSv4: Don't call put_rpccred() from an rcu callback Doing so would require us to introduce bh-safe locks into put_rpccred(). This patch fixes the lockdep complaint reported by Marc Dietrich: inconsistent {softirq-on-W} -> {in-softirq-W} usage. swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes: (rpc_credcache_lock){-+..}, at: [<c01dc487>] _atomic_dec_and_lock+0x17/0x60 {softirq-on-W} state was registered at: [<c013e870>] __lock_acquire+0x650/0x1030 [<c013f2b1>] lock_acquire+0x61/0x80 [<c02db9ac>] _spin_lock+0x2c/0x40 [<c01dc487>] _atomic_dec_and_lock+0x17/0x60 [<dced55fd>] put_rpccred+0x5d/0x100 [sunrpc] [<dced56c1>] rpcauth_unbindcred+0x21/0x60 [sunrpc] [<dced3fd4>] a0 [sunrpc] [<dcecefe0>] rpc_call_sync+0x30/0x40 [sunrpc] [<dcedc73b>] rpcb_register+0xdb/0x180 [sunrpc] [<dced65b3>] svc_register+0x93/0x160 [sunrpc] [<dced6ebe>] __svc_create+0x1ee/0x220 [sunrpc] [<dced7053>] svc_create+0x13/0x20 [sunrpc] [<dcf6d722>] nfs_callback_up+0x82/0x120 [nfs] [<dcf48f36>] nfs_get_client+0x176/0x390 [nfs] [<dcf49181>] nfs4_set_client+0x31/0x190 [nfs] [<dcf49983>] nfs4_create_server+0x63/0x3b0 [nfs] [<dcf52426>] nfs4_get_sb+0x346/0x5b0 [nfs] [<c017b444>] vfs_kern_mount+0x94/0x110 [<c0190a62>] do_mount+0x1f2/0x7d0 [<c01910a6>] sys_mount+0x66/0xa0 [<c0104046>] syscall_call+0x7/0xb [<ffffffff>] 0xffffffff irq event stamp: 5277830 hardirqs last enabled at (5277830): [<c017530a>] kmem_cache_free+0x8a/0xc0 hardirqs last disabled at (5277829): [<c01752d2>] kmem_cache_free+0x52/0xc0 softirqs last enabled at (5277798): [<c0124173>] __do_softirq+0xa3/0xc0 softirqs last disabled at (5277817): [<c01241d7>] do_softirq+0x47/0x50 other info that might help us debug this: no locks held by swapper/0. stack backtrace: [<c0104fda>] show_trace_log_lvl+0x1a/0x30 [<c0105c02>] show_trace+0x12/0x20 [<c0105d15>] dump_stack+0x15/0x20 [<c013ccc3>] print_usage_bug+0x153/0x160 [<c013d8b9>] mark_lock+0x449/0x620 [<c013e824>] __lock_acquire+0x604/0x1030 [<c013f2b1>] lock_acquire+0x61/0x80 [<c02db9ac>] _spin_lock+0x2c/0x40 [<c01dc487>] _atomic_dec_and_lock+0x17/0x60 [<dced55fd>] put_rpccred+0x5d/0x100 [sunrpc] [<dcf6bf83>] nfs_free_delegation_callback+0x13/0x20 [nfs] [<c012f9ea>] __rcu_process_callbacks+0x6a/0x1c0 [<c012fb52>] rcu_process_callbacks+0x12/0x30 [<c0124218>] tasklet_action+0x38/0x80 [<c0124125>] __do_softirq+0x55/0xc0 [<c01241d7>] do_softirq+0x47/0x50 [<c0124605>] irq_exit+0x35/0x40 [<c0112463>] smp_apic_timer_interrupt+0x43/0x80 [<c0104a77>] apic_timer_interrupt+0x33/0x38 [<c02690df>] cpuidle_idle_call+0x6f/0x90 [<c01023c3>] cpu_idle+0x43/0x70 [<c02d8c27>] rest_init+0x47/0x50 [<c03bcb6a>] start_kernel+0x22a/0x2b0 [<00000000>] 0x0 ======================= Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-08-07 15:15:57 -04:00
Trond Myklebust	45328c354e	NFS: Fix NFSv4 open stateid regressions Do not allow cached open for O_RDONLY or O_WRONLY unless the file has been previously opened in these modes. Also Fix the calculation of the mode in nfs4_close_prepare. We should only issue an OPEN_DOWNGRADE if we're sure that we will still be holding the correct open modes. This may not be the case if we've been doing delegated opens. Finally, there is no need to adjust the open mode bit flags in nfs4_close_done(): that has already been done in nfs4_close_prepare(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-08-07 15:13:19 -04:00
Trond Myklebust	ba683031fa	NFSv4: Fix a locking regression in nfs4_set_mode_locked() We don't really need to clear &state->inode_states inside nfs4_set_mode_locked, and doing so without holding the inode->i_lock would in any case be a bug... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-08-07 15:13:18 -04:00
Trond Myklebust	5e11934d13	NFS: Fix put_nfs_open_context We need to grab the inode->i_lock atomically with the last reference put in order to remove the open context that is being freed from the nfsi->open_files list. Fix by converting the kref to a standard atomic counter and then using atomic_dec_and_lock()... Thanks to Arnd Bergmann for pointing out the problem. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-08-07 15:13:17 -04:00
Masakazu Mokuno	313b0d3d86	[PATCH] remove duplicated ioctl entries in compat_ioctl.c This patch removes some duplicated wireless ioctl entries in the array 'struct ioctl_trans ioctl_start[]' of fs/compat_ioctl.c These entries are registered twice like: COMPATIBLE_IOCTL(SIOCGIWPRIV) and HANDLE_IOCTL(SIOCGIWPRIV, do_wireless_ioctl) Signed-off-by: Masakazu Mokuno <mokuno@sm.sony.co.jp> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-08-06 15:06:03 -04:00
David Woodhouse	b8e3ec30c2	[JFFS2] Print correct node offset when complaining about broken data CRC Debugging the hardware problems in OLPC trac #1905 would be a whole lot easier if the correct node offsets were printed for the offending nodes. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2007-08-02 21:43:46 +01:00
David Woodhouse	7b687707d7	[JFFS2] Fix suspend failure with JFFS2 GC thread. The try_to_freeze() call was in the wrong place; we need it in the signal-pending loop now that a pending freeze also makes signal_pending() return true. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2007-08-02 21:43:03 +01:00
David Woodhouse	71c2339775	[JFFS2] Deletion dirents should be REF_NORMAL, not REF_PRISTINE. Otherwise they'll never actually get garbage-collected. Noted by Jonathan Larmour. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2007-08-02 21:39:50 +01:00
Joakim Tjernlund	5bd5c03c31	[JFFS2] Prevent oops after 'node added in wrong place' debug check jffs2_add_physical_node_ref() should never really return error -- it's an internal debugging check which triggered. We really need to work out why and stop it happening. But in the meantime, let's make the failure mode a little less nasty. Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se> Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2007-08-02 21:36:35 +01:00
David Woodhouse	3ca135e16a	[JFFS2] LZO compression should default off for compatibility. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2007-08-02 16:32:02 +01:00
David Woodhouse	440fdb53b4	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6	2007-08-01 11:23:57 +01:00
Cyrill Gorcunov	ca76d2d803	UDF: fix UID and GID mount option ignorance This patch fix weird behaviour of UDF mounting procedure. To get UID changed (for now) we have to type mount -t udf -o uid=some_user,uid=ignore /dev/device /mnt/moun_point and specifying two uid at once is strange a bit. So with the patch we are able to mount without additional 'uid=ignore' option. The same for GID option is done. This patch will not break current mount scheme (with two option). Btw this does fix (I hope) the following [BUG 6124] mount of UDF fs ignores UID and GID options http://bugzilla.kernel.org/show_bug.cgi?id=6124 Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Jan Kara <jack@ucw.cz> Cc: Michael <auslands-kv@gmx.de> Cc: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-31 15:39:43 -07:00
Christoph Hellwig	0af1a45046	rename setlease to generic_setlease Make it a little more clear that this is the default implementation for the setleast operation. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Steven Whitehouse <swhiteho@redhat.com> Acked-by: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-31 15:39:43 -07:00
david m. richter	9700382c3c	VFS: fix a race in lease-breaking during truncate It is possible that another process could acquire a new file lease right after break_lease() is called during a truncate, but before lease-granting is disabled by the subsequent get_write_access(). Merely switching the order of the break_lease() and get_write_access() calls prevents this race. Signed-off-by: David M. Richter <richterd@citi.umich.edu> Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-31 15:39:42 -07:00
Robert P. J. Day	d7ef970baf	NCP: delete test of long-deceased CONFIG_NCPFS_DEBUGDENTRY Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Acked-by: Petr Vandrovec <petr@vandrovec.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-31 15:39:41 -07:00
Kirill Kuvaldin	817794e0df	isofs: mounting to regular file may succeed It turned out that mounting a corrupted ISO image to a regular file may succeed, e.g. if an image was prepared as follows: $ dd if=correct.iso of=bad.iso bs=4k count=8 We then can mount it to a regular file: # mount -o loop -t iso9660 bad.iso /tmp/file But mounting it to a directory fails with -ENOTDIR, simply because the root directory inode doesn't have S_IFDIR set and the condition in graft_tree() is met: if (S_ISDIR(nd->dentry->d_inode->i_mode) != S_ISDIR(mnt->mnt_root->d_inode->i_mode)) return -ENOTDIR This is because the root directory inode was read from an incorrect block. It's supposed to be read from sbi->s_firstdatazone, which is an absolute value and gets messed up in the case of an incorrect image. In order to somehow circumvent this we have to check that the root directory inode is actually a directory after all. Signed-off-by: Kirill Kuvaldin <kuvkir@epsmu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-31 15:39:41 -07:00
Alexey Dobriyan	5ea473a1df	Fix leaks on /proc/{*/sched,sched_debug,timer_list,timer_stats} On every open/close one struct seq_operations leaks. Kudos to /proc/slab_allocators. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-31 15:39:40 -07:00
David Howells	ff8e210a95	AFS: fix file locking Fix file locking for AFS: () Start the lock manager thread under a mutex to avoid a race. () Made the locking non-fair: New readlocks will jump pending writelocks if there's a readlock currently granted on a file. This makes the behaviour similar to Linux's VFS locking. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-31 15:39:40 -07:00
J. Bruce Fields	4a4b88317a	knfsd: eliminate unnecessary -ENOENT returns on export downcalls A succesful downcall with a negative result (which indicates that the given filesystem is not exported to the given user) should not return an error. Currently mountd is depending on stdio to write these downcalls. With some versions of libc this appears to cause subsequent writes to attempt to write all accumulated data (for which writes previously failed) along with any new data. This can prevent the kernel from seeing responses to later downcalls. Symptoms will be that nfsd fails to respond to certain requests. Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-31 15:39:38 -07:00
J. Bruce Fields	0a725fc4d3	nfsd4: idmap upcalls should use unsigned uid and gid We shouldn't be using negative uid's and gid's in the idmap upcalls. Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-31 15:39:38 -07:00
Jeff Layton	749997e512	knfsd: set the response bitmask for NFS4_CREATE_EXCLUSIVE RFC 3530 says: If the server uses an attribute to store the exclusive create verifier, it will signify which attribute by setting the appropriate bit in the attribute mask that is returned in the results. Linux uses the atime and mtime to store the verifier, but sends a zeroed out bitmask back to the client. This patch makes sure that we set the correct bits in the bitmask in this situation. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-31 15:39:38 -07:00
Mingming Cao	dd54567a83	"ext4_ext_put_in_cache" uses __u32 to receive physical block number Yan Zheng wrote: > I think I found a bug in ext4/extents.c, "ext4_ext_put_in_cache" uses > "__u32" to receive physical block number. "ext4_ext_put_in_cache" is > used in "ext4_ext_get_blocks", it sets ext4 inode's extent cache > according most recently tree lookup (higher 16 bits of saved physical > block number are always zero). when serving a mapping request, > "ext4_ext_get_blocks" first check whether the logical block is in > inode's extent cache. if the logical block is in the cache and the > cached region isn't a gap, "ext4_ext_get_blocks" gets physical block > number by using cached region's physical block number and offset in > the cached region. as described above, "ext4_ext_get_blocks" may > return wrong result when there are physical block numbers bigger than > 0xffffffff. > You are right. Thanks for reporting this! Signed-off-by: Mingming Cao <cmm@us.ibm.com> Cc: Yan Zheng <yanzheng@21cn.com> Cc: <stable@kernel.org> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-31 15:39:37 -07:00
David Howells	2e92a3baee	NOMMU: Fix SYSV IPC SHM Fix the SYSV IPC SHM to work with the changes applied by the new fault handler patches when CONFIG_MMU=n. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-31 15:39:36 -07:00
David S. Miller	8163904e66	[SPARC]: Mark SBUS framebuffer ioctls as IGNORE in compat_ioctl.c They are handled in a ->compat_ioctl() handler, so it's just noise when compat_ioctl.c warns which occurs when they are used on non-SBUS framebuffer devices. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-30 00:27:36 -07:00
Mark Fortescue	3961bae0ac	[PARTITION]: Sun/Solaris VTOC table corrections Start doing VTOC validation before using its contents. The validation is adjusted so as not to break existing setups that do not set the VTOC version, sanity and partition count entries. VTOC tables with more than 8 partitions will NOT be used. Signed-off-by: Mark Fortescue <mark@mtfhpc.demon.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-30 00:27:31 -07:00
Mark Fortescue	b84d879639	[PARTITION] MSDOS: Fix Sun num_partitions handling. Correct the Solaris x86 number of partitions (slices) is a way that is backward compatible with the earlier size. This works without a new VTOC structure definition as the timestamp and v_asciilabel fields in the VTOC are not used by the kernel yet. Signed-off-by: Mark Fortescue <mark@mtfhpc.demon.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-30 00:27:28 -07:00
Alexey Dobriyan	4e950f6f01	Remove fs.h from mm.h Remove fs.h from mm.h. For this, 1) Uninline vma_wants_writenotify(). It's pretty huge anyway. 2) Add back fs.h or less bloated headers (err.h) to files that need it. As result, on x86_64 allyesconfig, fs.h dependencies cut down from 3929 files rebuilt down to 3444 (-12.3%). Cross-compile tested without regressions on my two usual configs and (sigh): alpha arm-mx1ads mips-bigsur powerpc-ebony alpha-allnoconfig arm-neponset mips-capcella powerpc-g5 alpha-defconfig arm-netwinder mips-cobalt powerpc-holly alpha-up arm-netx mips-db1000 powerpc-iseries arm arm-ns9xxx mips-db1100 powerpc-linkstation arm-assabet arm-omap_h2_1610 mips-db1200 powerpc-lite5200 arm-at91rm9200dk arm-onearm mips-db1500 powerpc-maple arm-at91rm9200ek arm-picotux200 mips-db1550 powerpc-mpc7448_hpc2 arm-at91sam9260ek arm-pleb mips-ddb5477 powerpc-mpc8272_ads arm-at91sam9261ek arm-pnx4008 mips-decstation powerpc-mpc8313_rdb arm-at91sam9263ek arm-pxa255-idp mips-e55 powerpc-mpc832x_mds arm-at91sam9rlek arm-realview mips-emma2rh powerpc-mpc832x_rdb arm-ateb9200 arm-realview-smp mips-excite powerpc-mpc834x_itx arm-badge4 arm-rpc mips-fulong powerpc-mpc834x_itxgp arm-carmeva arm-s3c2410 mips-ip22 powerpc-mpc834x_mds arm-cerfcube arm-shannon mips-ip27 powerpc-mpc836x_mds arm-clps7500 arm-shark mips-ip32 powerpc-mpc8540_ads arm-collie arm-simpad mips-jazz powerpc-mpc8544_ds arm-corgi arm-spitz mips-jmr3927 powerpc-mpc8560_ads arm-csb337 arm-trizeps4 mips-malta powerpc-mpc8568mds arm-csb637 arm-versatile mips-mipssim powerpc-mpc85xx_cds arm-ebsa110 i386 mips-mpc30x powerpc-mpc8641_hpcn arm-edb7211 i386-allnoconfig mips-msp71xx powerpc-mpc866_ads arm-em_x270 i386-defconfig mips-ocelot powerpc-mpc885_ads arm-ep93xx i386-up mips-pb1100 powerpc-pasemi arm-footbridge ia64 mips-pb1500 powerpc-pmac32 arm-fortunet ia64-allnoconfig mips-pb1550 powerpc-ppc64 arm-h3600 ia64-bigsur mips-pnx8550-jbs powerpc-prpmc2800 arm-h7201 ia64-defconfig mips-pnx8550-stb810 powerpc-ps3 arm-h7202 ia64-gensparse mips-qemu powerpc-pseries arm-hackkit ia64-sim mips-rbhma4200 powerpc-up arm-integrator ia64-sn2 mips-rbhma4500 s390 arm-iop13xx ia64-tiger mips-rm200 s390-allnoconfig arm-iop32x ia64-up mips-sb1250-swarm s390-defconfig arm-iop33x ia64-zx1 mips-sead s390-up arm-ixp2000 m68k mips-tb0219 sparc arm-ixp23xx m68k-amiga mips-tb0226 sparc-allnoconfig arm-ixp4xx m68k-apollo mips-tb0287 sparc-defconfig arm-jornada720 m68k-atari mips-workpad sparc-up arm-kafa m68k-bvme6000 mips-wrppmc sparc64 arm-kb9202 m68k-hp300 mips-yosemite sparc64-allnoconfig arm-ks8695 m68k-mac parisc sparc64-defconfig arm-lart m68k-mvme147 parisc-allnoconfig sparc64-up arm-lpd270 m68k-mvme16x parisc-defconfig um-x86_64 arm-lpd7a400 m68k-q40 parisc-up x86_64 arm-lpd7a404 m68k-sun3 powerpc x86_64-allnoconfig arm-lubbock m68k-sun3x powerpc-cell x86_64-defconfig arm-lusl7200 mips powerpc-celleb x86_64-up arm-mainstone mips-atlas powerpc-chrp32 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-29 17:09:29 -07:00
David Miller	778f3dd5a1	Fix procfs compat_ioctl regression It is important to only provide the compat_ioctl method if the downstream de->proc_fops does too, otherwise this utterly confuses the logic in fs/compat_ioctl.c and we end up doing the wrong thing. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-28 19:42:22 -07:00
Linus Torvalds	8e8ef2971b	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: docbook: add pipes, other fixes blktrace: use cpu_clock() instead of sched_clock() bsg: Fix build for CONFIG_BLOCK=n [patch] QUEUE_FLAG_READFULL QUEUE_FLAG_WRITEFULL comment fix	2007-07-28 19:31:13 -07:00
Tony Luck	7a6c813594	[IA64] Fix build failure in fs/quota.c `b716395e2b` added code to handle a compatability issue with 32bit quota tools, but the new compat routines are only needed when CONFIG_COMPAT=y (and with this set to 'n' there are compilation problems since some new typedefs are not visible). Reported by Doug Chapman. Fix tuned by a cast of thousands (Andi, Andreas, Arthur, HPA, Willy) Signed-off-by: Tony Luck <tony.luck@intel.com>	2007-07-27 15:40:13 -07:00
Randy Dunlap	79685b8dee	docbook: add pipes, other fixes Fix some typos in pipe.c and splice.c. Add pipes API to kernel-api.tmpl. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-27 08:08:51 +02:00
Eric Sandeen	780dcdb211	fix inode_table test in ext234_check_descriptors ext[234]_check_descriptors sanity checks block group descriptor geometry at mount time, testing whether the block bitmap, inode bitmap, and inode table reside wholly within the blockgroup. However, the inode table test is off by one so that if the last block in the inode table resides on the last block of the block group, the test incorrectly fails. This is because it tests the last block as (start + length) rather than (start + length - 1). This can be seen by trying to mount a filesystem made such as: mkfs.ext2 -F -b 1024 -m 0 -g 256 -N 3744 fsfile 1024 which yields: EXT2-fs error (device loop0): ext2_check_descriptors: Inode table for group 0 not in group (block 101)! EXT2-fs: group descriptors corrupted! There is a similar bug in e2fsprogs, patch already sent for that. (I wonder if inside(), outside(), and/or in_range() should someday be used in this and other tests throughout the ext filesystems...) Signed-off-by: Eric Sandeen <sandeen@redhat.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-26 11:35:17 -07:00
Davide Libenzi	098284020c	make timerfd return a u64 and fix the __put_user Davi fixed a missing cast in the __put_user(), that was making timerfd return a single byte instead of the full value. Talking with Michael about the timerfd man page, we think it'd be better to use a u64 for the returned value, to align it with the eventfd implementation. This is an ABI change. The timerfd code is new in 2.6.22 and if we merge this into 2.6.23 then we should also merge it into 2.6.22.x. That will leave a few early 2.6.22 kernels out in the wild which might misbehave when a future timerfd-enabled glibc is run on them. mtk says: The difference would be that read() will only return 4 bytes, while the application will expect 8. If the application is checking the size of returned value, as it should, then it will be able to detect the problem (it could even be sophisticated enough to know that if this is a 4-byte return, then it is running on an old 2.6.22 kernel). If the application is not checking the return from read(), then its 8-byte buffer will not be filled -- the contents of the last 4 bytes will be undefined, so the u64 value as a whole will be junk. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Cc: Davi Arnaut <davi@haxent.com.br> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-26 11:35:17 -07:00
Ulrich Drepper	f50cadaa8f	tiny signalfd cleanup This is probably a leftover from a time when the return wasn't there yet. Now the extra assignment is just irritating. Signed-off-by: Ulrich Drepper <drepper@redhat.com> Cc: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-26 11:33:06 -07:00
Al Viro	87588dd666	more reiserfs endianness annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-26 11:11:58 -07:00
Al Viro	ad690ef9e6	xfs ioctl __user annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-26 11:11:57 -07:00
Al Viro	ca5c8cde93	lockd and nfsd endianness annotation fixes Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-26 11:11:56 -07:00
Steve French	a403a0a370	[CIFS] Fix hang in find_writable_file Caused by unneeded reopen during reconnect while spinlock held. Fixes kernel bugzilla bug #7903 Thanks to Lin Feng Shen for testing this, and Amit Arora for some nice problem determination to narrow this down. Acked-by: Dave Kleikamp <shaggy@us.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-07-26 15:54:16 +00:00
Jens Axboe	3836df6b52	ocfs2: bad kunmap_atomic() kunmap_atomic() takes the virtual address, not the mapped page as argument. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-24 16:02:55 -07:00
Linus Torvalds	b2e961eb2e	Merge branch 'request-queue-t' of git://git.kernel.dk/linux-2.6-block * 'request-queue-t' of git://git.kernel.dk/linux-2.6-block: [BLOCK] Add request_queue_t and mark it deprecated [BLOCK] Get rid of request_queue_t typedef	2007-07-24 12:26:44 -07:00
Ulrich Drepper	0d786d4a27	fallocate syscall interface deficiency The fallocate syscall returns ENOSYS in case the filesystem does not support the operation and expects the userlevel code to fill in. This is good in concept. The problem is that the libc code for old kernels should be able to distinguish the case where the syscall is not at all available vs not functioning for a specific mount point. As is this is not possible and we always have to invoke the syscall even if the kernel doesn't support it. I suggest the following patch. Using EOPNOTSUPP is IMO the right thing to do. Cc: Amit Arora <aarora@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-24 12:24:58 -07:00
Jens Axboe	165125e1e4	[BLOCK] Get rid of request_queue_t typedef Some of the code has been gradually transitioned to using the proper struct request_queue, but there's lots left. So do a full sweet of the kernel and get rid of this typedef and replace its uses with the proper type. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-24 09:28:11 +02:00
David Woodhouse	39fe5434cb	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6	2007-07-23 10:20:10 +01:00
Al Viro	41089644c1	fix broken handling of port=... in NFS option parsing Obviously broken on little-endian; fortunately, the option is not frequently used... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> [ Hey, sparse is wonderful, but even better than sparse is having people like Al that actually _run_ it and fix bugs using it. - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-22 11:15:18 -07:00
Ravikiran G Thirumalai	c3508f8f34	x86_64: Avoid too many remote cpu references due to /proc/stat Too many remote cpu references due to /proc/stat. On x86_64, with newer kernel versions, kstat_irqs is a bit of a problem. On every call to kstat_irqs, the process brings in per-cpu data from all online cpus. Doing this for NR_IRQS, which is now 256 + 32 * NR_CPUS results in (256+3263) 63 remote cpu references on a 64 cpu config. /proc/stat is parsed by common commands like top, who etc, causing lots of cacheline transfers This statistic seems useless. Other 'big iron' arches disable this. AK: changed to remove for all SMP setups AK: add comment Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-21 18:37:09 -07:00
Andrew Morton	d4e3cc387e	revert "PIE randomization" There are reports of this causing userspace failures (http://lkml.org/lkml/2007/7/20/421). Revert. Cc: Jan Kratochvil <honza@jikos.cz> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Ingo Molnar <mingo@elte.hu> Cc: Roland McGrath <roland@redhat.com> Cc: Jakub Jelinek <jakub@redhat.com> Cc: Ulrich Kunitz <kune@deine-taler.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Bret Towe" <magnade@gmail.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-21 17:49:14 -07:00
J. Bruce Fields	3e63516c82	knfsd: fix typo in export display, print uid and gid as unsigned For display purposes, treat uid's and gid's as unsigned ints for now. Also fix a typo. Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-21 17:49:14 -07:00
Jan Harkes	d3fec424b2	coda: remove CODA_STORE/CODA_RELEASE upcalls This is an variation on the patch sent by Christoph Hellwig which kills file_count abuse by the Coda kernel module by moving the coda_flush functionality into coda_release. However part of reason we were using the coda_flush callback was to allow Coda to pass errors that occur during writeback from the userspace cache manager back to close(). As Al Viro explained on linux-fsdevel, it is impossible to guarantee that such errors can in fact be returned back to the caller. There are many cases where the last reference to a file is not released by the close system call and it is also impossible to pick some close as a 'last-close' and delay it until all other references have been destroyed. The CODA_STORE/CODA_RELEASE upcall combination is clearly a broken design, and it is better to remove it completely. Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@ftp.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-21 17:49:14 -07:00
Cyrill Gorcunov	28de7948a8	UDF: coding style conversion - lindent fixups This patch fixes up sources after conversion by Lindent. Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-21 17:49:14 -07:00
Jens Axboe	6a860c979b	splice: fix bad unlock_page() in error case If add_to_page_cache_lru() fails, the page will not be locked. But splice jumps to an error path that does a page release and unlock, causing a BUG() in unlock_page(). Fix this by adding one more label that just releases the page. This bug was actually triggered on EL5 by gurudas pai <gurudas.pai@oracle.com> using fio. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-20 09:07:01 -07:00
David Howells	bd6dc742a4	AFS: Use patched rxrpc_kernel_send_data() correctly Fix afs_send_simple_reply() to accept a greater-than-zero return value from rxrpc_kernel_send_data() as being a successful return rather than thinking it an error and aborting the call. rxrpc_kernel_send_data() previously returned zero incorrectly when it worked successfully, but has been patched to return the number of bytes it transmitted. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-20 08:54:14 -07:00
Nick Piggin	1833633803	fix some conversion overflows Fix page index to offset conversion overflows in buffer layer, ecryptfs, and ocfs2. It would be nice to convert the whole tree to page_offset, but for now just fix the bugs. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-20 08:44:19 -07:00
Paul Mundt	20c2df83d2	mm: Remove slab destructors from kmem_cache_create(). Slab destructors were no longer supported after Christoph's `c59def9f22` change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2007-07-20 10:11:58 +09:00
Al Viro	5f47c7eac6	coda breakage a) switch by loff_t == __cmpdi2 use. Replaced with a couple of obvious ifs; update of ->f_pos in the first one makes sure that we do the right thing in all cases. b) block_signals() and unblock_signals() are globals on UML. Renamed coda ones; in principle UML probably ought to do rename as well, but that's another story. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-19 16:29:55 -07:00
Linus Torvalds	fdb64f93b3	Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6 * 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6: [XFS] Fix inode size update before data write in xfs_setattr [XFS] Allow punching holes to free space when at ENOSPC [XFS] Implement ->page_mkwrite in XFS. [FS] Implement block_page_mkwrite. Manually fix up conflict with Nick's VM fault handling patches in fs/xfs/linux-2.6/xfs_file.c Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-19 14:41:33 -07:00
Linus Torvalds	3e1f900bff	Merge git://git.linux-nfs.org/pub/linux/nfs-2.6 * git://git.linux-nfs.org/pub/linux/nfs-2.6: NFSv4: handle lack of clientaddr in option string NFSv4: debug print ntohl(status) in nfs client callback xdr code SUNRPC: Clean up the sillyrename code NFS: Introduce struct nfs_removeargs+nfs_removeres NFS: Use dentry->d_time to store the parent directory verifier. SUNRPC: move bkl locking and xdr proc invocation into a common helper NFSv4: Fix the nfsv4 readlink reply buffer alignment NFSv4: Fix the readdir reply buffer alignment NFSv4: More NFSv4 xdr cleanups NFSv4: Try to recover from getfh failures in nfs4_xdr_dec_open NFSv4: 'constify' lookup arguments. NFSv4: Don't fail nfs4_xdr_dec_open if decode_restorefh() failed NFSv4: Fix open state recovery NFSD/SUNRPC: Fix the automatic selection of RPCSEC_GSS	2007-07-19 14:33:41 -07:00
Linus Torvalds	f745bb1c73	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: ocfs2: ->fallocate() support	2007-07-19 14:16:44 -07:00
Jeff Layton	0a87cf128f	NFSv4: handle lack of clientaddr in option string If a NFSv4 mount is attempted with string based options, and the option string doesn't contain a clientaddr= option, the kernel will currently oops. Check for this situation and return a proper error. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-19 15:21:40 -04:00

... 3 4 5 6 7 ...

6599 Commits